How to parse a CSV in a Bash script?

I am trying to parse a CSV containing potentially 100k+ lines. Here is the criteria I have:

The index of the identifier
The identifier value

I would like to retrieve all lines in the CSV that have the given value in the given index (delimited by commas).

Any ideas, taking in special consideration for performance?

As an alternative to cut- or awk-based one-liners, you could use the specialized csvtool aka ocaml-csv:

$ csvtool -t ',' col "$index" - < csvfile | grep "$value"

According to the docs, it handles escaping, quoting, etc.

See this youtube video: BASH scripting lesson 10 working with CSV files

CSV file:

Bob Brown;Manager;16581;Main
Sally Seaforth;Director;4678;HOME

Bash script:

#!/bin/bash
OLDIFS=$IFS
IFS=";"
while read user job uid location
 do

    echo -e "$user \
    ======================\n\
    Role :\t $job\n\
    ID :\t $uid\n\
    SITE :\t $location\n"
 done < $1
 IFS=$OLDIFS

Output:

Bob Brown     ======================
    Role :   Manager
    ID :     16581
    SITE :   Main

Sally Seaforth     ======================
    Role :   Director
    ID :     4678
    SITE :   HOME

First prototype using plain old grep and cut:

grep "${VALUE}" inputfile.csv | cut -d, -f"${INDEX}"

If that's fast enough and gives the proper output, you're done.

How to parse a CSV in a Bash script?

Related

Recent Posts