How to escape comma and double quote at same time for CSV file?
Solution 1:
There are several libraries. Here are two examples:
❐ Apache Commons Lang
Apache Commons Lang includes a special class to escape or unescape strings (CSV, EcmaScript, HTML, Java, Json, XML): org.apache.commons.lang3.StringEscapeUtils
.
-
Escape to CSV
String escaped = StringEscapeUtils .escapeCsv("I said \"Hey, I am 5'10\".\""); // I said "Hey, I am 5'10"." System.out.println(escaped); // "I said ""Hey, I am 5'10""."""
-
Unescape from CSV
String unescaped = StringEscapeUtils .unescapeCsv("\"I said \"\"Hey, I am 5'10\"\".\"\"\""); // "I said ""Hey, I am 5'10"".""" System.out.println(unescaped); // I said "Hey, I am 5'10"."
* You can download it from here.
❐ OpenCSV
If you use OpenCSV, you will not need to worry about escape or unescape, only for write or read the content.
-
Writing file:
FileOutputStream fos = new FileOutputStream("awesomefile.csv"); OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8"); CSVWriter writer = new CSVWriter(osw); ... String[] row = { "123", "John", "Smith", "39", "I said \"Hey, I am 5'10\".\"" }; writer.writeNext(row); ... writer.close(); osw.close(); os.close();
-
Reading file:
FileInputStream fis = new FileInputStream("awesomefile.csv"); InputStreamReader isr = new InputStreamReader(fis, "UTF-8"); CSVReader reader = new CSVReader(isr); for (String[] row; (row = reader.readNext()) != null;) { System.out.println(Arrays.toString(row)); } reader.close(); isr.close(); fis.close();
* You can download it from here.
Solution 2:
Excel has to be able to handle the exact same situation.
Put those things into Excel, save them as CSV, and examine the file with a text editor. Then you'll know the rules Excel is applying to these situations.
Make Java produce the same output.
The formats used by Excel are published, by the way...
****Edit 1:**** Here's what Excel does
****Edit 2:**** Note that php's fputcsv
does the same exact thing as excel if you use " as the enclosure.
[email protected]
Richard
"This is what I think"
gets transformed into this:
Email,Fname,Quoted
[email protected],Richard,"""This is what I think"""
Solution 3:
Thanks to both Tony and Paul for the quick feedback, its very helpful. I actually figure out a solution through POJO. Here it is:
if (cell_value.indexOf("\"") != -1 || cell_value.indexOf(",") != -1) {
cell_value = cell_value.replaceAll("\"", "\"\"");
row.append("\"");
row.append(cell_value);
row.append("\"");
} else {
row.append(cell_value);
}
in short if there is special character like comma or double quote within the string in side the cell, then first escape the double quote("\""
) by adding additional double quote (like "\"\""
), then put the whole thing into a double quote (like "\""+theWholeThing+"\""
)