Objective-C: Reading a file line by line

What is the appropriate way of dealing with large text files in Objective-C? Let's say I need to read each line separately and want to treat each line as an NSString. What is the most efficient way of doing this?

One solution is using the NSString method:

+ (id)stringWithContentsOfFile:(NSString *)path 
      encoding:(NSStringEncoding)enc 
      error:(NSError **)error 

and then split the lines with a newline separator, and then iterate over the elements in the array. However, this seems fairly inefficient. Is there no easy way to treat the file as a stream, enumerating over each line, instead of just reading it all in at once? Kinda like Java's java.io.BufferedReader.


This will work for general reading a String from Text. If you would like to read longer text (large size of text), then use the method that other people here were mentioned such as buffered (reserve the size of the text in memory space).

Say you read a Text File.

NSString* filePath = @""//file path...
NSString* fileRoot = [[NSBundle mainBundle] 
               pathForResource:filePath ofType:@"txt"];

You want to get rid of new line.

// read everything from text
NSString* fileContents = 
      [NSString stringWithContentsOfFile:fileRoot 
       encoding:NSUTF8StringEncoding error:nil];

// first, separate by new line
NSArray* allLinedStrings = 
      [fileContents componentsSeparatedByCharactersInSet:
      [NSCharacterSet newlineCharacterSet]];

// then break down even further 
NSString* strsInOneLine = 
      [allLinedStrings objectAtIndex:0];

// choose whatever input identity you have decided. in this case ;
NSArray* singleStrs = 
      [currentPointString componentsSeparatedByCharactersInSet:
      [NSCharacterSet characterSetWithCharactersInString:@";"]];

There you have it.


That's a great question. I think @Diederik has a good answer, although it's unfortunate that Cocoa doesn't have a mechanism for exactly what you want to do.

NSInputStream allows you to read chunks of N bytes (very similar to java.io.BufferedReader), but you have to convert it to an NSString on your own, then scan for newlines (or whatever other delimiter) and save any remaining characters for the next read, or read more characters if a newline hasn't been read yet. (NSFileHandle lets you read an NSData which you can then convert to an NSString, but it's essentially the same process.)

Apple has a Stream Programming Guide that can help fill in the details, and this SO question may help as well if you're going to be dealing with uint8_t* buffers.

If you're going to be reading strings like this frequently (especially in different parts of your program) it would be a good idea to encapsulate this behavior in a class that can handle the details for you, or even subclassing NSInputStream (it's designed to be subclassed) and adding methods that allow you to read exactly what you want.

For the record, I think this would be a nice feature to add, and I'll be filing an enhancement request for something that makes this possible. :-)


Edit: Turns out this request already exists. There's a Radar dating from 2006 for this (rdar://4742914 for Apple-internal people).


This should do the trick:

#include <stdio.h>

NSString *readLineAsNSString(FILE *file)
{
    char buffer[4096];

    // tune this capacity to your liking -- larger buffer sizes will be faster, but
    // use more memory
    NSMutableString *result = [NSMutableString stringWithCapacity:256];

    // Read up to 4095 non-newline characters, then read and discard the newline
    int charsRead;
    do
    {
        if(fscanf(file, "%4095[^\n]%n%*c", buffer, &charsRead) == 1)
            [result appendFormat:@"%s", buffer];
        else
            break;
    } while(charsRead == 4095);

    return result;
}

Use as follows:

FILE *file = fopen("myfile", "r");
// check for NULL
while(!feof(file))
{
    NSString *line = readLineAsNSString(file);
    // do stuff with line; line is autoreleased, so you should NOT release it (unless you also retain it beforehand)
}
fclose(file);

This code reads non-newline characters from the file, up to 4095 at a time. If you have a line that is longer than 4095 characters, it keeps reading until it hits a newline or end-of-file.

Note: I have not tested this code. Please test it before using it.