Convert all number abbreviations to numeric values in a text file
I'd like to convert all number abbreviations such as 1K, 100K, 1M, etc. in a text file into plain numeric values such as 1000, 100000, 1000000, etc.
So for example, if I have the following text file:
1.3K apples
87.9K oranges
156K mangos
541.7K carrots
1.8M potatoes
I would like to convert it to the following in bash:
1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes
The command I have used is to replace matching strings of number abbreviations with their full numeric values like so:
sed -e 's/1K/1000/g' -e 's/1M/1000000/g' text-file.txt
My problem is that I cannot find and replace ALL of the possible number abbreviations when variation occurs. I'd like to do this until at least up to one decimal abbreviations.
Use numfmt
from GNU coreutils, don't reinvent the wheel.
$ numfmt --from=si <file
1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes
If abbreviated numbers may appear as any field, then you can use:
numfmt --from=si --field=- --invalid=ignore <file
Could you please try following, written and tested with shown samples in GNU awk
.
awk '
{
if(sub(/[kK]$/,"",$1)){
$1*=1000
}
if(sub(/[mM]$/,"",$1)){
$1*=1000000
}
}
1
' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
{
if(sub(/[kK]$/,"",$1)){ ##Checking condition if 1st field ends with k/K then do following. Substituting k/K in first field with NULL here.
$1*=1000 ##Multiplying 1000 with current 1st field value here.
}
if(sub(/[mM]$/,"",$1)){ ##Checking condition if 1st field ends with m/M then do following. Substituting m/M in first field with NULL here.
$1*=1000000 ##Multiplying 1000000 with current 1st field value here.
}
}
1 ##1 will print current line here.
' Input_file ##Mentioning Input_file name here.
Output will be as follows.
1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes