How can I read a large text file line by line using Java?
I need to read a large text file of around 5-6 GB line by line using Java.
How can I do this quickly?
A common pattern is to use
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String line;
while ((line = br.readLine()) != null) {
// process the line.
}
}
You can read the data faster if you assume there is no character encoding. e.g. ASCII-7 but it won't make much difference. It is highly likely that what you do with the data will take much longer.
EDIT: A less common pattern to use which avoids the scope of line
leaking.
try(BufferedReader br = new BufferedReader(new FileReader(file))) {
for(String line; (line = br.readLine()) != null; ) {
// process the line.
}
// line is not visible here.
}
UPDATE: In Java 8 you can do
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
stream.forEach(System.out::println);
}
NOTE: You have to place the Stream in a try-with-resource block to ensure the #close method is called on it, otherwise the underlying file handle is never closed until GC does it much later.
Look at this blog:
- Java Read File Line by Line - Java Tutorial
The buffer size may be specified, or the default size may be used. The default is large enough for most purposes.
// Open the file
FileInputStream fstream = new FileInputStream("textfile.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println (strLine);
}
//Close the input stream
fstream.close();
Once Java 8 is out (March 2014) you'll be able to use streams:
try (Stream<String> lines = Files.lines(Paths.get(filename), Charset.defaultCharset())) {
lines.forEachOrdered(line -> process(line));
}
Printing all the lines in the file:
try (Stream<String> lines = Files.lines(file, Charset.defaultCharset())) {
lines.forEachOrdered(System.out::println);
}
Here is a sample with full error handling and supporting charset specification for pre-Java 7. With Java 7 you can use try-with-resources syntax, which makes the code cleaner.
If you just want the default charset you can skip the InputStream and use FileReader.
InputStream ins = null; // raw byte-stream
Reader r = null; // cooked reader
BufferedReader br = null; // buffered for readLine()
try {
String s;
ins = new FileInputStream("textfile.txt");
r = new InputStreamReader(ins, "UTF-8"); // leave charset out for default
br = new BufferedReader(r);
while ((s = br.readLine()) != null) {
System.out.println(s);
}
}
catch (Exception e)
{
System.err.println(e.getMessage()); // handle exception
}
finally {
if (br != null) { try { br.close(); } catch(Throwable t) { /* ensure close happens */ } }
if (r != null) { try { r.close(); } catch(Throwable t) { /* ensure close happens */ } }
if (ins != null) { try { ins.close(); } catch(Throwable t) { /* ensure close happens */ } }
}
Here is the Groovy version, with full error handling:
File f = new File("textfile.txt");
f.withReader("UTF-8") { br ->
br.eachLine { line ->
println line;
}
}