Select unique or distinct values from a list in UNIX shell script
Solution 1:
You might want to look at the uniq and sort applications.
./yourscript.ksh | sort | uniq
(FYI, yes, the sort is necessary in this command line: uniq only removes duplicate lines that are immediately adjacent to each other.)
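To see why the sort matters, here is a minimal sketch using a made-up three-line input (the sample data is an assumption for illustration, not from the question). The two "a" lines are not adjacent, so uniq on its own keeps both of them:

```shell
# Sample input (an assumption for illustration): "a" appears twice,
# but the two occurrences are not on adjacent lines.
input=$(printf '%s\n' a b a)

# uniq alone only collapses neighbouring duplicates, so both "a" lines survive.
without_sort=$(printf '%s\n' "$input" | uniq)

# Sorting first brings the duplicates together so uniq can remove them.
with_sort=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq only:\n%s\n' "$without_sort"
printf 'sort | uniq:\n%s\n' "$with_sort"
```

The first result still contains three lines (a, b, a); the second contains two (a, b).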
EDIT:
Contrary to what has been posted by Aaron Digulla in relation to uniq's command-line options:
Given the following input:
class
jar
jar
jar
bin
bin
java
uniq will output all lines exactly once:
class
jar
bin
java
uniq -d will output all lines that appear more than once, and it will print them once:
jar
bin
uniq -u will output all lines that appear exactly once, and it will print them once:
class
java
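If you want to try the three variants yourself, the following sketch feeds the same sample input to each one, with one word per line. The variable names are mine. Note that no sort is needed here only because the duplicates in this particular input already sit next to each other:

```shell
# The sample input from above, one word per line.
input=$(printf '%s\n' class jar jar jar bin bin java)

# Plain uniq: every run of identical lines is printed once.
all_once=$(printf '%s\n' "$input" | uniq)

# uniq -d: only the lines that were repeated, each printed once.
repeated=$(printf '%s\n' "$input" | uniq -d)

# uniq -u: only the lines that were not repeated.
singles=$(printf '%s\n' "$input" | uniq -u)

printf 'uniq:\n%s\n\n' "$all_once"
printf 'uniq -d:\n%s\n\n' "$repeated"
printf 'uniq -u:\n%s\n' "$singles"
```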
Solution 2:
./script.sh | sort -u
This is the same as monoxide's answer, but a bit more concise.
Solution 3:
With zsh you can do this:
% cat infile
tar
more than one word
gz
java
gz
java
tar
class
class
zsh-5.0.0[t]% print -l "${(fu)$(<infile)}"
tar
more than one word
gz
java
class
Or you can use AWK:
% awk '!_[$0]++' infile
tar
more than one word
gz
java
class
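The awk one-liner is terse, so here is a sketch of the same idea with a more descriptive array name. The names `seen` and `dedupe_keep_order` are my own choices for illustration; the original uses `_` as the array name. The pattern is true only the first time a line is encountered, because the post-increment yields the previous count, which is 0 on first sight:

```shell
# dedupe_keep_order is a hypothetical helper name, not part of any tool.
# It reads lines on stdin and prints each distinct line once,
# in the order of first appearance.
dedupe_keep_order() {
    # seen[$0]++ evaluates to the previous count for this line:
    # 0 (false) the first time, non-zero afterwards. Negating it makes
    # the pattern true exactly once per distinct line, and awk's default
    # action prints the line.
    awk '!seen[$0]++'
}

# Same data as infile above, supplied inline.
printf '%s\n' tar 'more than one word' gz java gz java tar class class | dedupe_keep_order
```

Unlike sort -u, this keeps the input order, at the cost of holding every distinct line in memory.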
Solution 4:
For larger data sets where sorting may not be desirable, you can also use the following Perl one-liner:
./yourscript.ksh | perl -ne 'if (!defined $x{$_}) { print $_; $x{$_} = 1; }'
This basically just remembers every line it has already output so that it doesn't output it again.
It has the advantage over the "sort | uniq" solution in that there's no sorting required up front, so the lines come out in their original order.