Bash: Parse CSV with quotes, commas and newlines

As chepner said, you are better off using a programming language that can parse CSV.

Here is an example in Python:

import csv

# In Python 3 the csv module needs a text-mode file opened with newline=''
with open('a.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, quotechar='"')
    for row in reader:
        print(row[-1]) # row[-1] gives the last column

As said in the linked answer, this can also be done with GNU awk:

gawk -v RS='"' 'NR % 2 == 0 { gsub(/\n/, "") } { printf("%s%s", $0, RT) }' file.csv \
 | awk -F, '{print $NF}'

To handle specifically the newlines inside double-quoted strings and leave those outside them alone, use GNU awk (RT is a gawk extension):

gawk -v RS='"' 'NR % 2 == 0 { gsub(/\n/, "") } { printf("%s%s", $0, RT) }' file

This works by splitting the file along " characters and removing newlines in every other block.
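
The same idea can be sketched in Python for readers without gawk. This is only an illustration, not part of the original answer: the function name `strip_quoted_newlines` and the sample text are made up for the example, and it assumes the quotes in the input are balanced.

```python
def strip_quoted_newlines(text):
    """Remove newlines that fall inside double-quoted blocks."""
    # Splitting on '"' yields alternating blocks: even indices are outside
    # quotes, odd indices are inside (assuming the quotes are balanced).
    blocks = text.split('"')
    for i in range(1, len(blocks), 2):
        blocks[i] = blocks[i].replace("\n", "")
    return '"'.join(blocks)


# Hypothetical sample: the first field of the data row spans two lines.
sample = 'msg,time\n"hello\nworld",2016-03-28T20:26:39\n'
print(strip_quoted_newlines(sample), end="")
```

As in the gawk version, the newline inside the quoted field is simply dropped, so every record ends up on a single line.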

Output

time
2016-03-28T20:26:39
2016-03-28T20:26:41

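The second stage of the pipeline, awk -F, '{print $NF}', then splits each line on commas and prints the last column. Note that it splits on every comma, including those inside quoted fields, so $NF is only reliable when the last column itself contains no commas.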

CSV needs a proper parser; it cannot be parsed with regular expressions alone. If you have Python installed, use the csv module instead of plain Bash.

If not, consider csvkit, which has many powerful tools for processing CSV files from the command line.
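
To illustrate why a real parser matters, here is a small sketch that compares naive line-and-comma splitting with Python's csv module. The sample data is hypothetical (invented for this example); its first data field contains a comma, an escaped quote and a newline.

```python
import csv
import io

# Hypothetical sample data: the first field of the second record contains
# a comma, an escaped quote ("") and a newline.
data = (
    'msg,time\n'
    '"hello, ""big""\nworld",2016-03-28T20:26:39\n'
    'bye,2016-03-28T20:26:41\n'
)

# Naive approach: split on newlines, then on commas.
# The multi-line record is cut in two, producing a bogus extra entry.
naive = [line.split(",")[-1] for line in data.splitlines()]

# csv.reader keeps the quoted field together, embedded newline included.
parsed = [row[-1] for row in csv.reader(io.StringIO(data, newline=""))]

print(naive)
print(parsed)
```

The naive version reports four "last columns", one of them a fragment of the quoted message, while csv.reader returns the three expected values.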

See also:

  • https://unix.stackexchange.com/questions/7425/is-there-a-robust-command-line-tool-for-processing-csv-files