In Python, is read() or readlines() faster?

Solution 1:

For a text file, just iterating over it with a for loop is almost always the way to go. Never mind speed: it is the cleanest.
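
For example, the idiomatic pattern is just this (a minimal sketch; the filename is a placeholder):

    with open("data.txt") as f:        # "data.txt" is a placeholder filename
        for line in f:                 # the file object yields one line per iteration
            print(line.rstrip("\n"))   # drop the trailing newline before using the line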

In some versions of Python, readline() really does read just a single line, while the for loop reads large chunks and splits them into lines, so the loop may be faster. I think more recent versions of Python use buffering for readline() as well, so the performance difference will be minuscule (for is probably still microscopically faster because it avoids a method call per line). However, choosing one over the other for performance reasons is probably premature optimisation.
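
If you really want to measure it yourself, a rough micro-benchmark along these lines will do (a sketch only; the filename is a placeholder, and the numbers depend on your Python version, OS and file contents):

    import timeit

    def with_for_loop(path="big.txt"):        # placeholder filename
        with open(path) as f:
            for line in f:                    # buffered, line-by-line iteration
                pass

    def with_readline(path="big.txt"):
        with open(path) as f:
            while True:
                line = f.readline()           # one explicit method call per line
                if not line:                  # readline() returns "" at end of file
                    break

    print("for loop:", timeit.timeit(with_for_loop, number=10))
    print("readline:", timeit.timeit(with_readline, number=10))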

Edit to add: I just checked back through some Python release notes. Python 2.5 said:

It’s now illegal to mix iterating over a file with for line in file and calling the file object’s read()/readline()/readlines() methods.

Python 2.6 introduced TextIOBase, which supports mixing iteration and readline() on the same file object.

Python 2.7 fixed interleaving read() and readline().
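
For what it's worth, on current (Python 3) file objects, iteration and readline() read from the same buffer, so interleaving them behaves consistently. A quick sketch (placeholder filename):

    with open("data.txt") as f:     # placeholder filename
        first = next(f)             # one line via the iterator protocol
        second = f.readline()       # the next line via readline(), same buffer
        print(first, second, sep="")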

Solution 2:

If the file is huge, read() is definitely a bad idea: called without a size parameter, it loads the whole file into memory.

readline() reads only one line at a time, so it is a better choice for huge files.

And just iterating over the file object should be as efficient as using readline().
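
For example, a memory-friendly way to scan a huge file looks like this (a sketch; the filename and the "ERROR" filter are just placeholders):

    error_count = 0
    with open("huge.log") as f:        # placeholder filename
        for line in f:                 # only a small buffer is held in memory at a time
            if "ERROR" in line:        # placeholder filter
                error_count += 1
    print(error_count)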

See http://docs.python.org/tutorial/inputoutput.html#methods-of-file-objects for more info