【Posted】: 2013-05-01 22:34:37
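【Question】:
Here is my problem: standard input supplies an arbitrary number of lines of text. Output: the number of non-repeated lines, that is, lines that occur exactly once.
Input: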
She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.
Output:
2
【Question discussion】:
You can try uniq (see man uniq) and do the following:
sort file | uniq -u | wc -l
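A quick way to sanity-check this pipeline is to run it on the sample input from the question. The sketch below assumes the six lines are saved in a scratch file (the name `sample.txt` is made up for illustration). It also contrasts `uniq -u` with `sort -u`: the former prints only lines that occur exactly once, while the latter prints one copy of every distinct line.

```shell
# Write the question's sample input to a scratch file (hypothetical name).
cat > sample.txt <<'EOF'
She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.
EOF

# Lines occurring exactly once: sort makes duplicates adjacent,
# then uniq -u drops every line that has a neighbour equal to it.
nonrepeated=$(sort sample.txt | uniq -u | wc -l)

# Distinct lines: sort -u keeps one copy of each line, repeated or not.
distinct=$(sort -u sample.txt | wc -l)

echo "non-repeated lines: $nonrepeated"
echo "distinct lines: $distinct"
```

For this sample the first count is 2, matching the expected output in the question, and the second is 4.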
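【Discussion】:
The sort command. Nice catch... I got that wrong.
sort -u, without uniq. Also, comparisons follow the rules specified by LC_COLLATE. It works too....
The output of sort file | uniq -u differs from that of sort -u file. sort -u file gives the correct output.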
Here is how I would solve the problem:
... | awk '{n[$0]++} END {for (line in n) if (n[line]==1) num++; print num+0}'
But that is rather opaque. Here is a (slightly) clearer way to look at it (requires bash version 4):
... | {
    declare -A count    # count is an associative array

    # iterate over each line of the input and
    # accumulate the number of times we've seen this line
    #
    # the construct "IFS= read -r line" ensures we capture the line exactly
    while IFS= read -r line; do
        # plain assignment instead of (( count["$line"]++ )): arithmetic context
        # expands the subscript a second time, which misbehaves on lines with $ or `
        prev=${count["$line"]:-0}
        count["$line"]=$(( prev + 1 ))
    done

    # now add up the number of lines whose count is only 1
    num=0
    for c in "${count[@]}"; do
        if (( c == 1 )); then
            (( num++ ))
        fi
    done
    echo "$num"
}
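To try the counting logic on the question's sample, one option is to wrap it in a function and pipe the six lines in. The sketch below uses the awk form because it runs in any POSIX shell; the function name `count_nonrepeated` and the array name `seen` are invented here for illustration.

```shell
# Hypothetical helper: prints how many input lines occur exactly once.
count_nonrepeated() {
    awk '
        { seen[$0]++ }                      # tally every line by its full text
        END {
            for (line in seen)
                if (seen[line] == 1) num++  # keep only lines seen exactly once
            print num + 0                   # + 0 prints 0 instead of an empty line
        }
    '
}

# Feed the sample input from the question through the helper.
printf '%s\n' \
    "She is wearing black shoes." \
    "My name is Johny." \
    "I hate mondays." \
    "My name is Johny." \
    "I don't understand you." \
    "She is wearing black shoes." |
    count_nonrepeated
```

With the sample input this prints 2, since only "I hate mondays." and "I don't understand you." occur once.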
【Discussion】: