Linux command or script counting duplicated lines in a text file?
If I have a text file with the following content:
red apple
green apple
green apple
orange
orange
orange
Is there a Linux command or script that I can use to get the following result?
1 red apple
2 green apple
3 orange
Solution 1:
Pipe the file through sort (so that identical lines end up adjacent), then through uniq -c to get the counts:
sort filename | uniq -c
To get that list sorted by frequency, highest count first, append sort -nr:
sort filename | uniq -c | sort -nr
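As a rough end-to-end check, the pipeline can be run against the sample data from the question. This is only a sketch: the temporary file and the variable names (file, counts) are made up for illustration.

```shell
# Recreate the sample file from the question in a temporary location.
file=$(mktemp)
printf '%s\n' 'red apple' 'green apple' 'green apple' \
    'orange' 'orange' 'orange' > "$file"

# sort groups identical lines, uniq -c prefixes each distinct line
# with its count, and sort -nr orders by that count, highest first.
counts=$(sort "$file" | uniq -c | sort -nr)
printf '%s\n' "$counts"

# The counts come out as 3 orange, 2 green apple, 1 red apple;
# uniq pads each count with leading spaces, and the width may vary.
rm -f "$file"
```

Replacing sort -nr with sort -n lists the smallest counts first, which matches the order shown in the question.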
Solution 2:
Almost the same as borribles' answer, but if you add the -d option to uniq, only lines that occur more than once are shown:
sort filename | uniq -cd | sort -nr
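A small sketch of the effect of -d on the sample data from the question (the temporary file and the variable names file and dups are assumptions for the example):

```shell
# Same sample data as in the question.
file=$(mktemp)
printf '%s\n' 'red apple' 'green apple' 'green apple' \
    'orange' 'orange' 'orange' > "$file"

# -c adds counts, -d keeps only lines that appear more than once,
# so "red apple" (seen once) is left out of the result.
dups=$(sort "$file" | uniq -cd | sort -nr)
printf '%s\n' "$dups"

rm -f "$file"
```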
Solution 3:
uniq -c file
If the file is not already sorted, sort it first, since uniq only counts adjacent repeated lines:
sort file | uniq -c
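To see why the sort matters, here is a minimal sketch using a made-up three-line file in which the duplicates are not adjacent (the file contents and variable names are assumptions for the example):

```shell
# "orange" appears twice, but the two lines are separated by another line.
file=$(mktemp)
printf '%s\n' 'orange' 'red apple' 'orange' > "$file"

# Without sorting, uniq sees three separate runs and reports each as 1.
unsorted=$(uniq -c "$file")
printf 'without sort:\n%s\n' "$unsorted"

# After sorting, the two "orange" lines are adjacent and counted together.
sorted=$(sort "$file" | uniq -c)
printf 'with sort:\n%s\n' "$sorted"

rm -f "$file"
```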