Number of non-repeating lines - unique count
Here is my problem: any number of lines of text is given on standard input. The output should be the number of non-repeating lines, i.e. lines that occur exactly once.
INPUT:
She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.
OUTPUT:
2
You could try uniq (see man uniq). It only compares adjacent lines, so sort the input first and then do the following:
sort file | uniq -u | wc -l
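To see why this gives the expected answer, here is a minimal sketch that feeds the sample input from the question through the same pipeline. The function name count_unique_lines is only a hypothetical wrapper for illustration; it reads standard input instead of a file.

```shell
# Hypothetical wrapper around the pipeline above; reads from standard input.
count_unique_lines() {
    # sort puts identical lines next to each other,
    # uniq -u keeps only lines that are not repeated,
    # wc -l counts what is left.
    sort | uniq -u | wc -l
}

# Sample input from the question.
# Prints 2 (some wc implementations pad the number with spaces).
count_unique_lines <<'EOF'
She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.
EOF
```

Only "I hate mondays." and "I don't understand you." survive uniq -u, so the count is 2.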
Here's how I'd solve the problem:
... | awk '{n[$0]++} END {for (line in n) if (n[line]==1) num++; print num+0}'
But that's pretty opaque. Here's a (slightly) more legible way to look at it (requires bash version 4 or later):
... | {
    declare -A count    # count is an associative array

    # iterate over each line of the input and accumulate
    # the number of times we've seen this line
    #
    # the construct "IFS= read -r line" ensures we capture the line exactly
    while IFS= read -r line; do
        # plain assignment instead of (( count["$line"]++ )), so the text
        # of the line is only ever expanded once
        count[$line]=$(( ${count[$line]:-0} + 1 ))
    done

    # now add up the number of lines whose count is only 1
    num=0
    for c in "${count[@]}"; do
        if (( c == 1 )); then
            (( num++ ))
        fi
    done
    echo "$num"
}
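For comparison, the awk one-liner above does the same two passes (count every line, then count the entries seen exactly once). Spread over several lines it might look like the sketch below. The function name count_unique_awk and the array name seen are hypothetical, and the "+ 0" is a small addition so that 0 is printed instead of an empty line when nothing is unique.

```shell
# Hypothetical wrapper; reads lines from standard input.
count_unique_awk() {
    awk '
        # first pass: count how many times each whole line ($0) occurs
        { seen[$0]++ }

        # after the last line: count the entries that occurred exactly once
        END {
            for (line in seen)
                if (seen[line] == 1)
                    num++
            print num + 0   # + 0 forces a number even if num was never set
        }
    '
}

printf '%s\n' 'a' 'b' 'a' | count_unique_awk   # prints 1
```

Unlike the sort | uniq approach, this keeps every distinct line in memory but does not need the input to be sorted.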