Read CSV file column by column
You should use the excellent OpenCSV for reading and writing CSV files. To adapt your example to use the library it would look like this:
public class ParseCSV {
public static void main(String[] args) {
try {
//csv file containing data
String strFile = "C:/Users/rsaluja/CMS_Evaluation/Drupal_12_08_27.csv";
CSVReader reader = new CSVReader(new FileReader(strFile));
String [] nextLine;
int lineNumber = 0;
while ((nextLine = reader.readNext()) != null) {
lineNumber++;
System.out.println("Line # " + lineNumber);
// nextLine[] is an array of values from the line
System.out.println(nextLine[4] + "etc...");
}
}
}
}
Reading a CSV file in very simple and common in Java. You actually don't require to load any extra third party library to do this for you. CSV (comma separated value) file is just a normal plain-text file, store data in column by column, and split it by a separator (e.g comma ",").
In order to read specific columns from the CSV file, there are several ways. Simplest of all is as below:
Code to read CSV without any 3rd party library
BufferedReader br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
// use comma as separator
String[] cols = line.split(cvsSplitBy);
System.out.println("Coulmn 4= " + cols[4] + " , Column 5=" + cols[5]);
}
If you notice, nothing special is performed here. It is just reading a text file, and spitting it by a separator – ",".
Consider an extract from legacy country CSV data at GeoLite Free Downloadable Databases
"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"
Above code will output as below:
Column 4= "AU" , Column 5="Australia"
Column 4= "CN" , Column 5="China"
Column 4= "AU" , Column 5="Australia"
Column 4= "CN" , Column 5="China"
Column 4= "JP" , Column 5="Japan"
Column 4= "CN" , Column 5="China"
Column 4= "JP" , Column 5="Japan"
Column 4= "TH" , Column 5="Thailand"
You can, in fact, put
the columns in a Map
and then get the values simply by using the key
.
Shishir
I am sorry, but none of these answers provide an optimal solution. If you use a library such as OpenCSV you will have to write a lot of code to handle special cases to extract information from specific columns.
For example, if you have rows with less columns than what you're after, you'll have to write a lot of code to handle it. Using the OpenCSV example:
CSVReader reader = new CSVReader(new FileReader(strFile));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
//let's say you are interested in getting columns 20, 30, and 40
String[] outputRow = new String[3];
if(parsedRow.length < 40){
outputRow[2] = null;
} else {
outputRow[2] = parsedRow[40]
}
if(parsedRow.length < 30){
outputRow[1] = null;
} else {
outputRow[1] = parsedRow[30]
}
if(parsedRow.length < 20){
outputRow[0] = null;
} else {
outputRow[0] = parsedRow[20]
}
}
This is a lot of code for a simple requirement. It gets worse if you are trying to get values of columns by name. You should use a more modern parser such as the one provided by uniVocity-parsers.
To reliably and easily get the columns you want, simply write:
CsvParserSettings settings = new CsvParserSettings();
parserSettings.selectIndexes(20, 30, 40);
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader(yourFile));
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
I sugges to use the Apache Commons CSV https://commons.apache.org/proper/commons-csv/
Here is one example:
Path currentRelativePath = Paths.get("");
String currentPath = currentRelativePath.toAbsolutePath().toString();
String csvFile = currentPath + "/pathInYourProject/test.csv";
Reader in;
Iterable<CSVRecord> records = null;
try
{
in = new FileReader(csvFile);
records = CSVFormat.EXCEL.withHeader().parse(in); // header will be ignored
}
catch (IOException e)
{
e.printStackTrace();
}
for (CSVRecord record : records) {
String line = "";
for ( int i=0; i < record.size(); i++)
{
if ( line == "" )
line = line.concat(record.get(i));
else
line = line.concat("," + record.get(i));
}
System.out.println("read line: " + line);
}
It automaticly recognize , and " but not ; (maybe it can be configured...).
My example file is:
col1,col2,col3
val1,"val2",val3
"val4",val5
val6;val7;"val8"
And output is:
read line: val1,val2,val3
read line: val4,val5
read line: val6;val7;"val8"
Last line is considered like one value.