I'm playing with awk and trying to skip the first line of the input.
Below is my input file; the result keeps coming back as "Breed has Votes votes".
Breed, Votes
Black Lab, 30
Chihuahua, 2
Pug, 1
Corgi, 45
Shar Pei, 21
Shih Tzu, 5
Maltese, 7
#!/bin/sh/awk
##comment The below awk script runs on the file dog_breed.txt, FS refers to field separator which is a comma in this case. We initialize our max variable to 0 and max_breed to the first breed, then iterate over the rows to find the max voted breed.
BEGIN{
FS=", "
max=0;
max_breed=$1
}
{
if(max<($2)){
max=$2;
max_breed=$1;
}
}
END{
print max_breed " has " max " votes"
}
Solution 1:
You can skip the first record (line) by adding a pattern (condition) in front of the main block:
NR > 1 {
if(max<($2)){
max=$2;
max_breed=$1;
}
}
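As a quick check, the rule above can be dropped into the rest of the script and run against the sample data. This is only a sketch: the data is inlined in a shell variable instead of being read from dog_breed.txt, and the variable names data and result are made up for the illustration.

```shell
# Sample data copied from the question (inlined here instead of reading dog_breed.txt).
data='Breed, Votes
Black Lab, 30
Chihuahua, 2
Pug, 1
Corgi, 45
Shar Pei, 21
Shih Tzu, 5
Maltese, 7'

# Same logic as the original script, but the main block only runs for NR > 1,
# so the header line never takes part in the comparison.
result=$(printf '%s\n' "$data" | awk '
BEGIN { FS = ", "; max = 0 }
NR > 1 {
    if (max < $2) {
        max = $2
        max_breed = $1
    }
}
END { print max_breed " has " max " votes" }
')

echo "$result"   # Corgi has 45 votes
```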
However, it's worth trying to understand why you get the result that you do when you don't exclude the first line:
- when NR==1, the value of max is 0 (numeric, assigned in the BEGIN block) but the value of $2 is Votes (which is a string). So the expression max<($2) converts max to a string and performs a lexicographical comparison. If 0 is less than V in your locale, then the result is TRUE, and max is assigned the string value Votes (and max_breed the string value Breed)
- for subsequent lines, $2 is numeric, but max is now a string so $2 gets converted to a string and again the comparison is lexicographical. Assuming V has a greater lexicographical weight than any of the digits 0 thru 9, V always wins, so max and max_breed never change again and the END block prints "Breed has Votes votes".
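The two comparisons described above can be reproduced in isolation. The sketch below forces the C locale with LC_ALL=C so that the ordering of 0-9 versus V is the plain byte ordering; the variable names first and later are only for illustration.

```shell
# First record: max is the number 0, the other operand is the string "Votes",
# so awk falls back to a string comparison of "0" and "Votes".
first=$(LC_ALL=C awk 'BEGIN { max = 0; v = "Votes"; print (max < v) }')

# Later records: max now holds the string "Votes", so a numeric-looking field
# such as 30 is compared as the string "30", and "Votes" < "30" is false.
later=$(printf 'Black Lab, 30\n' |
    LC_ALL=C awk -F', ' 'BEGIN { max = "Votes" } { print (max < $2) }')

echo "$first $later"   # 1 0
```

The 1 is why the header gets stored as the maximum, and the 0 is why no later row ever replaces it.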
As an aside, your shebang looks invalid - it likely should be
#!/usr/bin/awk -f
or
#!/bin/awk -f
depending on your Ubuntu version. Also, assignments like max_breed=$1 aren't really meaningful in a BEGIN block, since it is executed before any records have been processed.
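To see the last point for yourself, you can print $1 from inside a BEGIN block. This is a small sketch (the variable name begin_field is made up); with the common awk implementations such as gawk and mawk, the field is simply empty at that stage.

```shell
# No record has been read when BEGIN runs, so $1 expands to an empty string.
begin_field=$(awk 'BEGIN { print "[" $1 "]" }')

echo "$begin_field"   # []
```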