I need to use sed/awk to get the desired output from lines like these:
Order:479959,60=20130624-09:45:02.046|35=D|11=884|38=723|21=1|1=30532|10=085|59=0|114=Y|56=MBT|40=1|43=Y|100=MBTX|55=/GCQ3|49=11342|54=1|8=FIX.4.4|34=388|553=2453|9=205|52=20130624-09:45:02.046|
Order:24780,100=MBTX|43=Y|40=1|34=388|553=2453|52=2013062409:45:02.046|9=205|49=11342|54=1|8=FIX.4.4|55=/GCQ3|11=405|35=D|60=20130624-09:45:02.046|56=MBT|59=0|114=Y|10=085|21=1|38=470|1=30532|
Order:799794,55=/GCQ3|49=11342|54=1|8=FIX.4.4|34=388|553=2453|9=205|52=2013062409:45:02.046|40=1|43=Y|100=MBTX|38=350|21=1|1=30532|10=085|59=0|114=Y|56=MBT|60=20130624-09:45:02.046|35=D|11=216|
Order:72896,11=735|35=D|60=2013062409:45:02.046|56=MBT|59=0|114=Y|10=085|1=30532|38=17|21=1|100=MBTX|43=Y|40=1|553=2453|9=205|52=20130624-09:45:02.046|34=388|8=FIX.4.4|54=1|49=11342|55=/GCQ3|
I want to get the number after 38= and the number after 11=, which should be renamed Clientid.
The output should be:
Orderid-479959 38= 723 Clientid=884
Orderid-24780 38= 470 Clientid=405
Orderid-799794 38= 350 Clientid=216
Orderid-72896 38= 17 Clientid=735
Any help will be appreciated.
Solution 1:
You can use
sed -nr 's/Order:([0-9]+),.*[,\|]38=([0-9]+)[,\|].*/Orderid-\1 38= \2/p' file | tee file2
Then
sed -nr 's/.*[,\|]11=([0-9]+)[,\|].*/Clientid=\1/p' file | tee file3
Then
paste -d ' ' file2 file3
You get your output on stdout - redirect as you please.
I can't get it in one line (although someone obviously can): since the 11= and 38= fields could be in either order, I have to read the file twice. You could roll it into a script like this:
#!/bin/bash
sed -nr 's/Order:([0-9]+),.*[,\|]38=([0-9]+)[,\|].*/Orderid-\1 38= \2/p' "$1" > file2
sed -nr 's/.*[,\|]11=([0-9]+)[,\|].*/Clientid=\1/p' "$1" > file3
paste -d ' ' file2 file3 > outfile
rm file2 file3
(This cleans up the files we write in the process and writes the final output to a file named outfile.)
Usage:
- paste the script into an empty file and save it
- give it execute permission:
chmod u+x script
- run it with the name of your input file as argument:
./script file
- change file2 and file3 in the script if you have existing files with those names in the current directory!
Explanation
- s/old/new/ : replace old with new
- -r : use ERE (extended regular expressions)
- -n : don't print anything until we ask with p (here this just drops lines that don't match, such as empty lines)
- [,\|] : match , OR a literal |
- ([0-9]+) : some digits to save for later
- \1 : backreference to the saved pattern
- tee : write to a file and print to stdout too, so you can check it
- > somefile : redirect output to somefile instead of stdout
- paste -d ' ' file2 file3 : paste the columns of file3 after the columns of file2, using a space as delimiter
- rm file2 file3 : delete file2 and file3
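The caveat about existing files named file2 and file3 can be sidestepped with scratch files from mktemp. The following is only a sketch under a few assumptions: the function name extract_orders_tmp and the variable names are made up for illustration, -E is used as the equivalent spelling of -r, and every input line is assumed to contain both a 38= and an 11= field (otherwise paste would pair up the wrong lines).

```shell
# Sketch: Solution 1 with mktemp scratch files instead of file2/file3.
# extract_orders_tmp is a hypothetical name, not part of the original script.
extract_orders_tmp() {
    ids_file=$(mktemp) || return 1
    clients_file=$(mktemp) || { rm -f "$ids_file"; return 1; }

    # First pass: order id and the 38= value.
    sed -nE 's/Order:([0-9]+),.*[,\|]38=([0-9]+)[,\|].*/Orderid-\1 38= \2/p' "$1" > "$ids_file"
    # Second pass: the 11= value, renamed to Clientid.
    sed -nE 's/.*[,\|]11=([0-9]+)[,\|].*/Clientid=\1/p' "$1" > "$clients_file"

    # Join the two passes line by line, then remove the scratch files.
    paste -d ' ' "$ids_file" "$clients_file"
    rm -f "$ids_file" "$clients_file"
}

# usage: extract_orders_tmp file > outfile
```

The output goes to stdout, so it can be redirected as before.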
Solution 2:
Using awk
Assuming your data is in a file called data.txt, create a file called script.awk and give it the following contents:
BEGIN { FS = "[,|]" }              # fields are separated by , or |
NF > 0 {                           # skip empty lines
    for (i = 1; i <= NF; i++) {
        split($i, f, "[:=]")       # f[1] = tag, f[2] = value
        map[f[1]] = f[2]
    }
    printf "Orderid-%s 38= %s Clientid=%s\n", map["Order"], map[38], map[11]
}
Then execute the following command to process the data and get output.
awk -f script.awk < data.txt
See also
- Getting started with awk
- BEGIN pattern
- Associative arrays
- split function
- printf statement
- NF variable
- FS variable
In the above code, the map variable is an associative array. I called it map because it's typically called a map in other languages (HashMap in Java, Hash in Ruby, or Dictionary in Python).
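One thing to be aware of: the associative array keeps its entries from one line to the next, so a line that lacks an 11= or 38= field would silently reuse the value from an earlier line. The sample data always has both fields, so this is only a precaution. A minimal sketch that empties the array for every record (the shell function name extract_orders is made up for illustration):

```shell
# Sketch: same logic as script.awk, but the map is cleared per record.
# extract_orders is a hypothetical wrapper name.
extract_orders() {
    awk '
    BEGIN { FS = "[,|]" }
    NF > 0 {
        split("", map)                 # portable way to empty an array
        for (i = 1; i <= NF; i++) {
            split($i, f, "[:=]")       # f[1] = tag, f[2] = value
            map[f[1]] = f[2]
        }
        printf "Orderid-%s 38= %s Clientid=%s\n", map["Order"], map[38], map[11]
    }' "$@"
}

# The second line has no 11= field, so its Clientid stays empty.
printf '%s\n' 'Order:1,38=7|11=9|' 'Order:2,38=5|' | extract_orders
# prints:
# Orderid-1 38= 7 Clientid=9
# Orderid-2 38= 5 Clientid=
```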
Solution 3:
One-liners aren't always nice:
$ sed 's/[|,]\(11=[^|]*\).*\(|38=[^|]*|\).*/\2\1|/; s/Order:\([0-9]*\).*|38=\([0-9]*\).*|11=\([0-9]*\)|.*/Orderid-\1 38= \2 Clientid=\3/' foo
Orderid-479959 38= 723 Clientid=884
Orderid-24780 38= 470 Clientid=405
Orderid-799794 38= 350 Clientid=216
Orderid-72896 38= 17 Clientid=735
Explanation
- s/old/new/ : replace old with new
- [|,] : match | or ,
- \(11=[^|]*\) : match any number of any characters except | after 11= and save 11=whatever for later use as \1
- .* : any number of any characters
- \(|38=[^|]*|\) : save |38=whatever| for later use as \2
- \2\1| : backreferences in the replacement (this makes the fields consistent so we can deal with them in the next command)
- ; : separates commands, like in the shell
- Order:\([0-9]*\).*|38=\([0-9]*\).*|11=\([0-9]*\)|.* : match this pattern (now we've cleaned it up), saving the parts we want to reuse in \(parentheses\) again
- Orderid-\1 38= \2 Clientid=\3 : replacement with \1, \2 and \3 backreferences to the numbers we saved with \(\)
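To see what the first s command does on its own, it can be run in isolation. This is just an illustration: the input is the first sample line shortened for readability, and the function name reorder is made up. When 11= comes before 38=, the two fields are swapped and everything after them is dropped; a line where 38= already comes first is left untouched.

```shell
# Sketch: only the first substitution from the one-liner.
# reorder is a hypothetical name used for this demonstration.
reorder() {
    sed 's/[|,]\(11=[^|]*\).*\(|38=[^|]*|\).*/\2\1|/'
}

printf '%s\n' 'Order:479959,60=20130624-09:45:02.046|35=D|11=884|38=723|21=1|' | reorder
# prints: Order:479959,60=20130624-09:45:02.046|35=D|38=723|11=884|
```

After this step every line has 38= before 11=, which is what the second s command relies on.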