How to parse a CSV file in Bash?

I'm working on a long Bash script. I want to read cells from a CSV file into Bash variables. I can parse lines and the first column, but not any other column. Here's my code so far:


  cat myfile.csv|while read line
  do
    read -d, col1 col2 < <(echo $line)
    echo "I got:$col1|$col2"
  done

It's only printing the first column. As an additional test, I tried the following:

read -d, x y < <(echo a,b,)

And $y is empty. So I tried:

read x y < <(echo a b)

And $y is b. Why?

You need to use IFS instead of -d:

while IFS=, read -r col1 col2
do
    echo "I got:$col1|$col2"
done < myfile.csv

Note that for general purpose CSV parsing you should use a specialized tool which can handle quoted fields with internal commas, among other issues that Bash can't handle by itself. Examples of such tools are cvstool and csvkit.

From the man page:

-d delim The first character of delim is used to terminate the input line, rather than newline.

You are using -d, which will terminate the input line on the comma. It will not read the rest of the line. That's why $y is empty.

We can parse csv files with quoted strings and delimited by say | with following code

while read -r line
do
    field1=$(echo "$line" | awk -F'|' '{printf "%s", $1}' | tr -d '"')
    field2=$(echo "$line" | awk -F'|' '{printf "%s", $2}' | tr -d '"')

    echo "$field1 $field2"
done < "$csvFile"

awk parses the string fields to variables and tr removes the quote.

Slightly slower as awk is executed for each field.

How to parse a CSV file in Bash?

Related

Recent Posts