Remove line if field is duplicate
Looking for an awk (or sed) one-liner to remove lines from the output if the first field is a duplicate.
An example for removing duplicate lines I've seen is:
awk 'a !~ $0; {a=$0}'
I tried using it as a basis with no luck (I thought changing the $0's to $1's would do the trick, but it didn't seem to work).
awk '{ if (a[$1]++ == 0) print $0; }' "$@"
This is a standard (very simple) use for associative arrays.
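To make that concrete, here is a small sketch run against made-up sample data (the input lines and the array name `seen` are only for illustration). Unlike the snippet in the question, which only compares each line with the one before it, the array remembers every first field seen so far, so non-adjacent duplicates are dropped too.

```shell
# Hypothetical sample input: the first field repeats on non-adjacent lines.
input='a 1
b 2
a 3
c 4
b 5'

# Print a line only the first time its first field is seen.
# seen[$1]++ evaluates to 0 on first sight, then 1, 2, ... afterwards.
result=$(printf '%s\n' "$input" | awk '{ if (seen[$1]++ == 0) print $0 }')

printf '%s\n' "$result"
```

With this input the lines "a 1", "b 2" and "c 4" survive; the first occurrence wins and the original order is kept.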
This is how to remove lines whose first field is a duplicate:
awk '!_[$1]++' file
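In case the terse form is hard to read: the expression is used as a pattern with no action, so awk prints the line whenever it is true. The post-increment returns the old count (0 the first time a key is seen), and `!` turns that 0 into true. A rough way to watch this happen, using made-up input and the illustrative names `seen` and `n`:

```shell
# Hypothetical three-line input; "a" appears twice in the first field.
# For each line, show the count stored *before* the increment and
# whether the '!count' pattern would have printed the line.
trace=$(printf '%s\n' 'a 1' 'b 2' 'a 3' |
  awk '{ n = seen[$1]++; print $1, "count-before=" n, (n == 0 ? "printed" : "skipped") }')

printf '%s\n' "$trace"
```

The first "a" and the "b" report a count of 0 and are printed; the second "a" reports 1 and is skipped.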
If you're open to using Perl:
perl -ane 'print if ! $a{$F[0]}++' file
-a autosplits the line into the @F array, which is indexed starting at 0
The %a hash remembers if the first field has already been seen
This related solution assumes your field separator is a comma, rather than whitespace:
perl -F, -ane 'print if ! $a{$F[0]}++' file
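If you would rather stay in awk for the comma-separated case, the same idea should work by setting the field separator with -F. A minimal sketch, with invented sample data and `seen` as an arbitrary array name:

```shell
# Hypothetical comma-separated input; "x" repeats in the first field.
input='x,1
y,2
x,3'

# -F, makes awk split fields on commas instead of whitespace.
result=$(printf '%s\n' "$input" | awk -F, '!seen[$1]++')

printf '%s\n' "$result"
```

This keeps "x,1" and "y,2" and drops "x,3".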