What are NR and FNR and what does "NR==FNR" imply?
In awk, FNR refers to the record number (typically the line number) in the current file, and NR refers to the total record number across all input read so far. The operator == is a comparison operator, which returns true when its two operands are equal.
This means that the condition NR==FNR is only true while awk is reading the first file (assuming that file is not empty), because FNR resets to 1 at the first line of each file while NR keeps increasing.
This pattern is typically used to perform actions on only the first file. The next inside the block makes awk skip the remaining rules for the current record and move on to the next one, so those remaining rules only run on files other than the first.
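A minimal sketch of this pattern, assuming two small sample files (first.txt and second.txt are hypothetical names, created here only for illustration). The first rule handles records from the first file and then calls next, so the second rule only ever sees records from the second file:

```shell
# Hypothetical sample files, created in a temporary directory for illustration.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'c\nd\n' > "$tmp/second.txt"

# Rule 1 runs only while NR==FNR (the first file), then skips rule 2 via next.
# Rule 2 therefore runs only for records of the second file.
out_next=$(awk 'NR==FNR { print "first: " $0; next }
                { print "other: " $0 }' "$tmp/first.txt" "$tmp/second.txt")
printf '%s\n' "$out_next"

rm -rf "$tmp"
```

With these inputs, the lines from first.txt come out prefixed with "first: " and the lines from second.txt with "other: ". Without the next, the second rule would also fire for the first file's records.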
The condition FNR==NR compares the same two operands as NR==FNR, so it behaves in the same way.
Look for keys (the first word of each line) in file2 that are also in file1.
Step 1: Fill array a with the first word of each line of file1:
awk '{a[$1];}' file1
Step 2: Fill array a from file1 only and ignore file2 in the same command. To do this, compare the total number of records read so far (NR) with the record number within the current input file (FNR):
awk 'NR==FNR{a[$1]}' file1 file2
Step 3: Skip any actions that come after the } while parsing file1:
awk 'NR==FNR{a[$1];next}' file1 file2
Step 4: Print the key from file2 when it is found in array a:
awk 'NR==FNR{a[$1];next} $1 in a{print $1}' file1 file2
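To see the finished command at work, here is a small run on made-up data (the contents of file1 and file2 below are assumptions for illustration, not taken from the question):

```shell
# Sample inputs in a temporary directory; the contents are illustrative only.
tmp=$(mktemp -d)
printf 'apple 1\nbanana 2\ncherry 3\n' > "$tmp/file1"
printf 'banana x\ndate y\napple z\n'   > "$tmp/file2"

# While reading file1 (NR==FNR): record each first word as an index of array a.
# While reading file2: print the first word of every line whose key is in a.
common=$(awk 'NR==FNR{a[$1];next} $1 in a{print $1}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$common"

rm -rf "$tmp"
```

This prints banana and then apple, in the order in which they appear in file2. The key date is skipped because it never occurs in file1.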
Look up NR and FNR in the awk manual, and then ask yourself under what condition NR==FNR is true in the following example:
$ cat file1
a
b
c
$ cat file2
d
e
$ awk '{print FILENAME, NR, FNR, $0}' file1 file2
file1 1 1 a
file1 2 2 b
file1 3 3 c
file2 4 1 d
file2 5 2 e
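One way to check your answer is to print the result of the comparison as an extra column. The sketch below recreates the same two files in a temporary directory (an assumption made so that the snippet is self-contained) and appends (NR==FNR), which awk evaluates to 1 for true and 0 for false:

```shell
# Recreate the example files from above in a temporary directory.
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"
printf 'd\ne\n'    > "$tmp/file2"

# Run from inside the directory so FILENAME prints as file1/file2.
# The last column is 1 while NR==FNR holds and 0 once it no longer does.
flags=$(cd "$tmp" && awk '{print FILENAME, NR, FNR, (NR==FNR)}' file1 file2)
printf '%s\n' "$flags"

rm -rf "$tmp"
```

The last column is 1 for the three lines of file1 and 0 for the two lines of file2.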
These are awk built-in variables.
NR - It gives the total number of records read so far, across all input files.
FNR - It gives the number of records read so far in the current input file; it restarts at 1 for each new file.
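A short sketch of the difference, using two throwaway files (the names and contents are assumptions for illustration). Because FNR restarts for every file, FNR==1 is true once per non-empty input file, while NR in the END block holds the overall total:

```shell
# Throwaway sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"
printf 'd\ne\n'    > "$tmp/file2"

# FNR==1 fires at the first record of each non-empty file (FNR restarts at 1);
# NR keeps counting across files, so in END it is the total record count.
summary=$(awk 'FNR==1 { files++ }
               END    { print files " files, " NR " records" }' \
              "$tmp/file1" "$tmp/file2")
printf '%s\n' "$summary"

rm -rf "$tmp"
```

For these inputs, the command prints "2 files, 5 records".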