Read Big TXT File, Out of Memory Exception

I want to read a big TXT file (about 500 MB in size). First I used

var file = new StreamReader(_filePath).ReadToEnd();  
var lines = file.Split(new[] { '\n' });

but it threw an OutOfMemoryException. Then I tried to read the file line by line, but again, after reading around 1.5 million lines, it threw an OutOfMemoryException:

using (StreamReader r = new StreamReader(_filePath))
{
    while ((line = r.ReadLine()) != null)
        _lines.Add(line);
}

or I used

foreach (var l in File.ReadLines(_filePath))
{
    _lines.Add(l);
}

but again I received:

An exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll but was not handled in user code

My machine is a powerful machine with 8 GB of RAM, so it shouldn't be a problem with my machine.

P.S.: I tried to open this file in Notepad++ and got a 'the file is too big to be opened' error.


Solution 1:

Just use File.ReadLines, which returns an IEnumerable<string> and doesn't load all the lines into memory at once.

foreach (var line in File.ReadLines(_filePath))
{
    // Don't put "line" into a list or collection.
    // Just do your processing on it here.
}
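
For example, if all you need is an aggregate result (a count, a total, a maximum), you can compute it while streaming and never hold more than the current line in memory. Here is a minimal sketch, assuming a hypothetical file path:

using System;
using System.IO;

var _filePath = "huge.txt";   // hypothetical path to the big file

long lineCount = 0;
long totalChars = 0;

foreach (var line in File.ReadLines(_filePath))
{
    // Only the current line is in memory; nothing accumulates except the totals.
    lineCount++;
    totalChars += line.Length;
}

Console.WriteLine($"{lineCount} lines, {totalChars} characters");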

Solution 2:

The cause of the exception seems to be the ever-growing _lines collection, not the reading of the big file itself. You are reading each line and adding it to the _lines collection, which keeps consuming memory until it causes the out-of-memory exception. You can apply a filter so that only the required lines are added to the _lines collection.
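
A minimal sketch of that idea, assuming (purely for illustration) that you only need the lines containing "ERROR":

using System.Collections.Generic;
using System.IO;

var _filePath = "huge.txt";        // hypothetical path
var _lines = new List<string>();   // the collection from the question

foreach (var line in File.ReadLines(_filePath))
{
    // Only the filtered lines are kept; every other line is discarded
    // as soon as the enumerator moves on, so memory stays bounded.
    if (line.Contains("ERROR"))
        _lines.Add(line);
}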

Solution 3:

I know this is an old post, but Google sent me here in 2021.

Just to emphasize igrimpe's comments above:

I recently ran into an OutOfMemoryException on StreamReader.ReadLine() while looping through folders of giant text files.

As igrimpe mentioned, you can sometimes encounter this when your input file lacks uniform line breaks: if a huge stretch of the file contains no line terminator, ReadLine has to buffer that entire stretch as a single string. If you are looping through a text file and hit this, double-check the input file for unexpected characters, ASCII-encoded hex or binary strings, etc.

In my case, I split the problematic 60 GB file into 256 MB chunks, had my file iterator stash the problematic text files as part of the exception trap, and later fixed the problem files by removing the offending lines.
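
A rough sketch of that kind of exception trap, with a hypothetical folder path and processing step (catching OutOfMemoryException is normally a last resort; it only makes sense here because the failure is isolated to a single oversized line or file):

using System;
using System.Collections.Generic;
using System.IO;

var _folderPath = @"C:\logs";          // hypothetical folder of large text files
var problemFiles = new List<string>();

foreach (var path in Directory.EnumerateFiles(_folderPath, "*.txt"))
{
    try
    {
        foreach (var line in File.ReadLines(path))
        {
            // process the line here without storing it
        }
    }
    catch (OutOfMemoryException)
    {
        // Stash the offending file so it can be split and cleaned up afterwards.
        problemFiles.Add(path);
    }
}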

Solution 4:

Edit:

Loading the whole file into memory causes objects to grow very large, and .NET will throw an OutOfMemoryException if it cannot allocate enough contiguous memory for an object.

The answer is still the same: you need to stream the file, not read the entire contents. That may require rearchitecting your application, but using IEnumerable<T> methods you can chain business processing across different areas of the application and defer execution.
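
A sketch of that kind of deferred pipeline (the stage names and the filter are invented for illustration):

using System.Collections.Generic;
using System.IO;
using System.Linq;

var _filePath = "huge.txt";   // hypothetical path

// Each stage is lazy: nothing is read until the final foreach enumerates
// the pipeline, and only one line is held in memory at a time.
IEnumerable<string> ReadRecords(string path) =>
    File.ReadLines(path);

IEnumerable<string> FilterRecords(IEnumerable<string> source) =>
    source.Where(l => !string.IsNullOrWhiteSpace(l));

IEnumerable<int> MeasureRecords(IEnumerable<string> source) =>
    source.Select(l => l.Length);

foreach (var length in MeasureRecords(FilterRecords(ReadRecords(_filePath))))
{
    // consume one result at a time
}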


A "powerful" machine with 8GB of RAM isn't going to be able to store a 500GB file in memory, as 500 is bigger than 8. (plus you don't get 8 as the operating system will be holding some, you can't allocate all memory in .Net, 32-bit has a 2GB limit, opening the file and storing the line will hold the data twice, there is an object size overhead....)

You can't load the whole thing into memory to process it; you will have to stream the file through your processing.