Recommendations on parsing .eml files in C#

Solution 1:

Added August 2017: Check out MimeKit: https://github.com/jstedfast/MimeKit. It supports .NET Standard, so it will run cross-platform.
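For illustration, here is a minimal sketch of loading an .eml file with MimeKit. It assumes the MimeKit NuGet package is referenced; the class name MimeKitEmlDemo and the default path "message.eml" are placeholders, not part of the original answer.

```csharp
using System;
using MimeKit;

public static class MimeKitEmlDemo
{
    public static void Main(string[] args)
    {
        // Placeholder path; pass the real .eml path as the first argument.
        string path = args.Length > 0 ? args[0] : "message.eml";

        // MimeMessage.Load parses the whole file, including nested MIME parts.
        MimeMessage message = MimeMessage.Load(path);

        Console.WriteLine("From:    " + message.From);
        Console.WriteLine("To:      " + message.To);
        Console.WriteLine("Subject: " + message.Subject);

        // TextBody and HtmlBody are null when the message has no such part.
        Console.WriteLine(message.TextBody ?? message.HtmlBody ?? "(no text body)");

        // Attachments enumerates the parts that are marked as attachments.
        foreach (MimeEntity attachment in message.Attachments)
        {
            string fileName = attachment.ContentDisposition?.FileName ?? "(unnamed)";
            Console.WriteLine("Attachment: " + fileName);
        }
    }
}
```

Unlike the CDO approach, this needs no COM references, so it should also work outside Windows.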

Original answer: I posted a sample project on GitHub to illustrate this answer.

The CDO COM DLL is part of Windows/IIS and can be referenced from .NET. It provides accurate parsing and a nice object model. Use it in conjunction with a reference to ADODB.DLL.

public CDO.Message LoadEmlFromFile(String emlFileName)
{
    CDO.Message msg = new CDO.MessageClass();
    ADODB.Stream stream = new ADODB.StreamClass();

    stream.Open(Type.Missing, ADODB.ConnectModeEnum.adModeUnknown, ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified, String.Empty, String.Empty);
    stream.LoadFromFile(emlFileName);
    stream.Flush();
    msg.DataSource.OpenObject(stream, "_Stream");
    msg.DataSource.Save();

    stream.Close();
    return msg;
}

Solution 2:

Follow this link for a good solution:

The article boils down to four steps (the second step below is missing from the article but is needed):

  1. Add a reference to "Microsoft CDO for Windows 2000 Library", which can be found on the ‘COM’ tab in the Visual Studio ‘Add reference’ dialog. This adds two references, "ADODB" and "CDO", to your project.

  2. Disable embedding of interop types for both references, "ADODB" and "CDO". (References -> ADODB -> Properties -> set 'Embed Interop Types' to False, then repeat the same for CDO.)

  3. Add the following method in your code:

    protected CDO.Message ReadMessage(String emlFileName)
    {
        CDO.Message msg = new CDO.MessageClass();
        ADODB.Stream stream = new ADODB.StreamClass();

        // Open an empty in-memory stream, then fill it from the .eml file.
        stream.Open(Type.Missing,
                    ADODB.ConnectModeEnum.adModeUnknown,
                    ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified,
                    String.Empty,
                    String.Empty);
        stream.LoadFromFile(emlFileName);
        stream.Flush();

        // Bind the message to the stream and let CDO parse it.
        msg.DataSource.OpenObject(stream, "_Stream");
        msg.DataSource.Save();

        // Release the stream once the message has been loaded.
        stream.Close();
        return msg;
    }
    
  4. Call this method with the full path of your .eml file. The CDO.Message object it returns will have all the parsed info you need, including To, From, Subject, and Body.
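As a rough sketch of step 4, the helper below prints a few fields of an already loaded message. It assumes the "CDO" and "ADODB" COM references from steps 1 and 2 are in place (so it is Windows-only); the class name CdoMessageSummary and the method PrintSummary are hypothetical names introduced for illustration.

```csharp
using System;

public static class CdoMessageSummary
{
    // Prints the main fields of a message returned by ReadMessage.
    public static void PrintSummary(CDO.Message msg)
    {
        Console.WriteLine("From:    " + msg.From);
        Console.WriteLine("To:      " + msg.To);
        Console.WriteLine("Subject: " + msg.Subject);

        // TextBody holds the plain-text body; HTMLBody holds the HTML variant.
        Console.WriteLine(msg.TextBody);

        // Each attachment is exposed as a body part with a file name.
        foreach (CDO.IBodyPart attachment in msg.Attachments)
        {
            Console.WriteLine("Attachment: " + attachment.FileName);
        }
    }
}
```

A caller inside the class that defines ReadMessage could then write CdoMessageSummary.PrintSummary(ReadMessage(path)), where path is the full path to the .eml file.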

Solution 3:

LumiSoft includes a Mime parser.

Sasa includes a Mime parser as well.

Solution 4:

Getting a decent MIME parser is probably the way to go. You may try a free MIME parser (such as this one from CodeProject), but comments like this one from the code's author

I worked on this at about the same time that I worked on a wrapper class for MSG files. Big difference in difficulty. Where the EML wrapper class maybe took a day to read the spec and get right, the MSG wrapper class took a week.

made me curious about the code quality. I'm sure that you can hack together a MIME parser that parses 95% of email correctly in a few hours or days. I'm also sure that getting the remaining 5% right will take months. Consider handling S/MIME (encrypted and signed email), Unicode, malformed emails produced by misbehaving mail clients and servers, several encoding schemes, internationalization issues, making sure that intentionally malformed emails will not crash your app, etc.

If the emails you need to parse come from a single source, a quick and dirty parser may be enough. If you need to parse emails from the wild, you will probably need a better solution.

I would recommend our Rebex Secure Mail component, but I'm sure that you can get decent results with components from other vendors as well.

Make sure that the parser of your choice works correctly on the infamous "Mime Torture Sample message" prepared by Mark Crispin (co-author of MIME and IMAP RFCs). The test message is displayed in the MIME Explorer sample and can be downloaded in the installation package.

The following code shows how to read and parse an EML file:

using Rebex.Mail;

MailMessage message = new MailMessage();
message.Load("file.eml");

Solution 5:

What you probably need is an email/MIME parser. Parsing the header fields is not very hard, but separating out the various MIME parts (images, attachments, plain-text and HTML bodies, etc.) can get very complex.
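To illustrate the "not very hard" part, here is a minimal sketch that reads only the header block of an .eml file and unfolds continuation lines. The names EmlHeaderSketch and ReadHeaders are made up for this example. It deliberately does not decode encoded words, handle character sets, or touch the MIME body, which is where the real complexity lies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class EmlHeaderSketch
{
    // Reads the header block (everything up to the first blank line) and
    // returns name/value pairs in order, with folded lines unfolded.
    public static List<KeyValuePair<string, string>> ReadHeaders(TextReader reader)
    {
        var headers = new List<KeyValuePair<string, string>>();
        string name = null;
        string value = null;
        string line;

        while ((line = reader.ReadLine()) != null && line.Length > 0)
        {
            if ((line[0] == ' ' || line[0] == '\t') && name != null)
            {
                // A line starting with whitespace continues the previous header.
                value += " " + line.Trim();
                continue;
            }

            int colon = line.IndexOf(':');
            if (colon <= 0)
            {
                continue; // Not a "Name: value" line; this sketch skips it.
            }

            // A new header starts, so store the one collected so far.
            if (name != null)
            {
                headers.Add(new KeyValuePair<string, string>(name, value));
            }
            name = line.Substring(0, colon).Trim();
            value = line.Substring(colon + 1).Trim();
        }

        // Store the last header of the block.
        if (name != null)
        {
            headers.Add(new KeyValuePair<string, string>(name, value));
        }
        return headers;
    }
}
```

It could be called as EmlHeaderSketch.ReadHeaders(new StreamReader(path)) for some .eml path. Anything beyond this (multipart boundaries, transfer encodings, attachments) is better left to a real MIME library.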

We use a third-party tool, but there are many C# tools and libraries out there. Search Google for "free C# email MIME parser". For example, I found these:

http://www.codeproject.com/Articles/11882/Advanced-MIME-Parser-Creator-Editor
http://www.lumisoft.ee/lswww/download/downloads/Net/info.txt