How to remove line breaks from a file in Java?

How can I remove all line breaks from a string in Java in a way that works on both Windows and Linux (i.e., no OS-specific problems with carriage return/line feed/newline, etc.)?

I've tried (note readFileAsString is a function that reads a text file into a String):

String text = readFileAsString("textfile.txt");
text.replace("\n", "");

but this doesn't seem to work.

How can this be done?


Solution 1:

You need to assign the result of text.replace() back to text:

String text = readFileAsString("textfile.txt");
text = text.replace("\n", "").replace("\r", "");

This is necessary because Strings are immutable -- calling replace doesn't change the original String, it returns a new one that's been changed. If you don't assign the result to text, then that new String is lost and garbage collected.
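
To see the difference in isolation, here is a minimal sketch. The class name ImmutableReplaceDemo and the helper stripLineBreaks are made up for illustration and are not part of the question's code:

```java
public class ImmutableReplaceDemo {

    // Hypothetical helper: returns a copy of the input with every CR and LF removed.
    public static String stripLineBreaks(String text) {
        return text.replace("\n", "").replace("\r", "");
    }

    public static void main(String[] args) {
        String text = "one\r\ntwo\nthree";

        // The result is discarded here, so text still contains its line breaks.
        text.replace("\n", "");
        System.out.println(text.contains("\n")); // true

        // Assigning the returned String is what makes the change visible.
        text = stripLineBreaks(text);
        System.out.println(text); // onetwothree
    }
}
```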

As for getting the newline String for the current environment -- that is available by calling System.getProperty("line.separator") or, since Java 7, System.lineSeparator().

Solution 2:

As noted in other answers, your code is not working primarily because String.replace(...) does not change the target String. (It can't - Java strings are immutable!) What replace actually does is to create and return a new String object with the characters changed as required. But your code then throws away that String ...


Here are some possible solutions. Which one is most correct depends on what exactly you are trying to do.

// #1
text = text.replace("\n", "");

Simply removes all the newline characters. This does not cope with Windows or Mac line terminations.

// #2
text = text.replace(System.getProperty("line.separator"), "");

Removes all line terminators for the current platform. This does not cope with the case where you are trying to process (for example) a UNIX file on Windows, or vice versa.
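
A rough sketch of that failure mode follows. To make it reproducible on any machine, the separator is passed in as a parameter instead of being read from the system property; the class and method names are hypothetical:

```java
public class PlatformSeparatorDemo {

    // Hypothetical helper: removes one specific separator, standing in for
    // whatever System.getProperty("line.separator") returns on a given platform.
    public static String removeSeparator(String text, String separator) {
        return text.replace(separator, "");
    }

    public static void main(String[] args) {
        String unixFile = "Goodbye cruel\nworld.";
        String windowsFile = "Goodbye cruel\r\nworld.";

        // UNIX file processed with the Windows separator: nothing is removed.
        System.out.println(removeSeparator(unixFile, "\r\n").equals(unixFile)); // true

        // Windows file processed with the UNIX separator: a stray \r is left behind.
        System.out.println(removeSeparator(windowsFile, "\n").equals("Goodbye cruel\rworld.")); // true
    }
}
```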

// #3
text = text.replaceAll("\\r|\\n", "");

Removes all Windows, UNIX or Mac line terminators. However, if the input file is text, this will concatenate words; e.g.

Goodbye cruel
world.

becomes

Goodbye cruelworld.

So you might actually want to do this:

// #4
text = text.replaceAll("\\r\\n|\\r|\\n", " ");

which replaces each line terminator with a space (see note 1). Since Java 8 you can also do this:

// #5
text = text.replaceAll("\\R", " ");

And if you want to replace a run of consecutive line terminators with a single space:

// #6
text = text.replaceAll("\\R+", " ");
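
A small sketch of how \R and \R+ differ on input containing a blank line, assuming Java 8 or later (the class and method names are invented for this example):

```java
public class LinebreakMatcherDemo {

    // Replaces each individual line terminator with one space.
    public static String eachToSpace(String text) {
        return text.replaceAll("\\R", " ");
    }

    // Replaces each run of consecutive line terminators with one space.
    public static String runsToSpace(String text) {
        return text.replaceAll("\\R+", " ");
    }

    public static void main(String[] args) {
        // Two Windows line terminators in a row, i.e. a blank line in the input.
        String text = "Goodbye cruel\r\n\r\nworld.";

        System.out.println(eachToSpace(text)); // Goodbye cruel  world.  (two spaces)
        System.out.println(runsToSpace(text)); // Goodbye cruel world.   (one space)
    }
}
```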

Note 1: There is a subtle difference between the patterns in #3 and #4. The sequence \r\n represents a single (Windows) line terminator, so #4 matches it as one unit; with the pattern from #3, each \r\n would be replaced with two spaces instead of one.
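
The effect of matching \r\n as one unit can be checked with a quick sketch like this one (class and method names are placeholders, not from the answer):

```java
public class TerminatorToSpaceDemo {

    // Pattern from #3, but with a space as the replacement:
    // \r and \n are matched separately.
    public static String perCharacter(String text) {
        return text.replaceAll("\\r|\\n", " ");
    }

    // Pattern from #4: \r\n is tried first, so it is matched as a single terminator.
    public static String perTerminator(String text) {
        return text.replaceAll("\\r\\n|\\r|\\n", " ");
    }

    public static void main(String[] args) {
        String windowsText = "Goodbye cruel\r\nworld.";

        System.out.println(perCharacter(windowsText));  // Goodbye cruel  world.  (two spaces)
        System.out.println(perTerminator(windowsText)); // Goodbye cruel world.   (one space)
    }
}
```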

Solution 3:

This function normalizes down all whitespace, including line breaks, to single spaces. Not exactly what the original question asked for, but likely to do exactly what is needed in many cases:

import org.apache.commons.lang3.StringUtils;

final String cleansedString = StringUtils.normalizeSpace(rawString);
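
If adding Commons Lang is not an option, something similar can be approximated with the standard library. This is only a sketch under that assumption, not the Commons Lang implementation: the class and method names are made up, and \s and trim() may treat some Unicode whitespace characters differently from normalizeSpace.

```java
public class NormalizeSpaceSketch {

    // Hypothetical helper: trims the ends, then collapses each run of
    // whitespace (spaces, tabs, line breaks) into a single space.
    public static String collapseWhitespace(String raw) {
        return raw.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String rawString = "  Goodbye   cruel\r\n\tworld.\n";
        System.out.println(collapseWhitespace(rawString)); // Goodbye cruel world.
    }
}
```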

Solution 4:

If you want to remove only line terminators that are valid on the current OS, you could do this:

text = text.replaceAll(System.getProperty("line.separator"), "");

If you want to make sure you remove any line separators, you can do it like this:

text = text.replaceAll("\\r|\\n", "");

Or, slightly more verbose, but without regular expressions (String.replace treats its arguments as literal text):

text = text.replace("\r", "").replace("\n", "");