How to convert a file to JSON in bash

I have a file like this and want to convert it to JSON format.

2022/01/22 21:27:56 [notice] 40#40: signal process started
2022/01/22 21:27:56 [error] 40#40: open() "/run/nginx.pid" failed (2: No such file or directory)
2022/01/22 21:28:04 [notice] 42#42: signal process started
2022/01/22 21:28:04 [error] 42#42: open() "/run/nginx.pid" failed (2: No such file or directory)

Code

#!/bin/bash
json=''
while read -r line; do
    dt=$(echo ${line} | awk '{print ($1" "$2)}')
    info=$(echo ${line} | awk '{print ($3)}')
    error=$(echo ${line} | awk '{print ($4$5$6$7$8$9$10)}')
    json+="{\"date\":\"$dt\","\"info\":\"$info\",\"error\":\"$error\""}"
done < "$file"
    
res="$json"

How can I make the variable res hold valid JSON, with a comma "," separating each {} object?


I want the output to look like this:

[{"date":"2022/01/22 21:27:56","info":"[notice]","error":"40#40:signalprocessstarted"},
{"date":"2022/01/22 21:27:56","info":"[error]","error":"40#40:open()"/run/nginx.pid"failed(2:Nosuch"},
{"date":"2022/01/22 21:28:04","info":"[notice]","error":"42#42:signalprocessstarted"},
{"date":"2022/01/22 21:28:04","info":"[error]","error":"42#42:open()"/run/nginx.pid"failed(2:Nosuch"}]

but right now my output looks like this:

{"date":"2022/01/22 21:27:56","info":"[notice]","error":"40#40:signalprocessstarted"}{"date":"2022/01/22 21:27:56","info":"[error]","error":"40#40:open()"/run/nginx.pid"failed(2:Nosuch"}{"date":"2022/01/22 21:28:04","info":"[notice]","error":"42#42:signalprocessstarted"}{"date":"2022/01/22 21:28:04","info":"[error]","error":"42#42:open()"/run/nginx.pid"failed(2:Nosuch"}

without any separator between the objects.

Also, how should I handle a string like "/run/nginx.pid" in JSON?


Would you please try the following:

#!/bin/bash

echo -n "["
while read -r day hms info err; do              # split on blank characters up to 4 words
    if (( nr++ )); then                         # count the line number
        echo ","                                # append a comma after the previous line
    fi
    err="$(sed -E 's/"/\\&/g' <<< "$err")"      # escape double quotes in the message
    printf '{"date":"%s %s","info":"%s","error":"%s"}' "$day" "$hms" "$info" "$err"
done < file
echo "]"

Output:

[{"date":"2022/01/22 21:27:56","info":"[notice]","error":"40#40: signal process started"},
{"date":"2022/01/22 21:27:56","info":"[error]","error":"40#40: open() \"/run/nginx.pid\" failed (2: No such file or directory)"},
{"date":"2022/01/22 21:28:04","info":"[notice]","error":"42#42: signal process started"},
{"date":"2022/01/22 21:28:04","info":"[error]","error":"42#42: open() \"/run/nginx.pid\" failed (2: No such file or directory)"}]
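
The loop above relies on how read assigns fields: with four variable names, the first three blank-separated words go to day, hms and info, and the last variable (err) receives the rest of the line with its inner spaces intact. A minimal sketch of that behaviour on one sample line from the question (the variable name line is only for illustration):

```shell
#!/bin/bash
line='2022/01/22 21:27:56 [error] 40#40: open() "/run/nginx.pid" failed (2: No such file or directory)'

# read splits on IFS (blanks by default); the last variable takes the remainder.
read -r day hms info err <<EOF
$line
EOF

printf 'day=%s\nhms=%s\ninfo=%s\nerr=%s\n' "$day" "$hms" "$info" "$err"
```

This is why the message is not truncated or squeezed, unlike the awk '{print ($4$5...)}' approach in the question.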

[Edit]
If you want to assign the generated JSON string to a variable (e.g., var), enclose the entire code in $( and ) as a command substitution and assign the result to the variable, such as:

var=$(
echo -n "["
while read -r day hms info err; do              # split on blank characters up to 4 words
    if (( nr++ )); then                         # count the line number
        echo ","                                # append a comma after the previous line
    fi
    printf '{"date":"%s %s","info":"%s","error":"%s"}' "$day" "$hms" "$info" "${err//\"/\\\"}"
                                                # escape double quotes in $err according to Ed Morton's suggestion
done < file
echo "]"
)
echo "$var"                                     # just to see the result
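
Both variants above escape only double quotes. JSON strings also require backslashes to be escaped, so a message containing a backslash would still produce invalid output. The following is a rough sketch of a helper that covers both cases; json_escape is a hypothetical name, and the function deliberately ignores control characters such as tabs and newlines, which would need further handling:

```shell
#!/bin/bash
# Hypothetical helper: escape backslashes and double quotes for use inside a JSON string.
# Backslashes are handled first so that the ones added for quotes are not doubled.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

json_escape 'open() "/run/nginx.pid" failed'
echo
```

Inside the loop it could be used as err=$(json_escape "$err") before the printf, at the cost of one sed process per line.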

If jq is available, you can also say:

echo -n "["
while read -r day hms info err; do              # split on blank characters up to 4 words
    if (( nr++ )); then                         # count the line number
        echo ","                                # append a comma after the previous line
    fi
    jq -cn --arg date "$day $hms" --arg info "$info" --arg error "$err" '{date:$date, info:$info, error:$error}'
done < file
echo "]"

as Gordon Davisson suggests.


Please read "Why is using a shell loop to process text considered bad practice?". The people who invented shell also invented awk for shell to call to process text.

Assuming you don't REALLY want to truncate the message at the end of each input row and remove all white space from it, and that you want all double quotes in the input escaped in the output, you can do this efficiently and portably using any awk in any shell on every Unix box:

$ cat tst.awk
BEGIN {
    printf "["
    ofmt = "%s{\"date\":\"%s %s\",\"info\":\"%s\",\"error\":\"%s\"}"
}
{
    error = $0
    sub(/([^ ]+ ){3}/,"",error)
    gsub(/"/,"\\\"",error)
    printf ofmt, sep, $1, $2, $3, error
    sep = "," ORS
}
END { print "]" }

$ awk -f tst.awk file
[{"date":"2022/01/22 21:27:56","info":"[notice]","error":"40#40: signal process started"},
{"date":"2022/01/22 21:27:56","info":"[error]","error":"40#40: open() \"/run/nginx.pid\" failed (2: No such file or directory)"},
{"date":"2022/01/22 21:28:04","info":"[notice]","error":"42#42: signal process started"},
{"date":"2022/01/22 21:28:04","info":"[error]","error":"42#42: open() \"/run/nginx.pid\" failed (2: No such file or directory)"}]
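
Since the question asks for the result in a variable named res, the awk output can be captured with a command substitution. The sketch below is an assumption-laden variant rather than the script above verbatim: the input comes from a hypothetical variable named input instead of a file, and the three leading fields are removed with an explicit pattern rather than an interval expression.

```shell
#!/bin/bash
# Two sample lines copied from the question; "input" is a hypothetical variable name.
input='2022/01/22 21:27:56 [notice] 40#40: signal process started
2022/01/22 21:27:56 [error] 40#40: open() "/run/nginx.pid" failed (2: No such file or directory)'

res=$(printf '%s\n' "$input" | awk '
BEGIN { printf "[" }
{
    error = $0
    sub(/^[^ ]+ [^ ]+ [^ ]+ /, "", error)   # drop the date, time and level fields
    gsub(/"/, "\\\"", error)                # escape double quotes in the message
    printf "%s{\"date\":\"%s %s\",\"info\":\"%s\",\"error\":\"%s\"}", sep, $1, $2, $3, error
    sep = ",\n"                             # separator is empty before the first object
}
END { print "]" }
')

printf '%s\n' "$res"
```

Because sep is only set after the first object is printed, the comma appears between objects and never before the first or after the last one.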