Using awk to count the number of occurrences of patterns from another file

【Question title】: using awk to count the number of occurrences of pattern from another file
【Posted】: 2021-02-19 16:34:53
【Question description】:

I am trying to take a file containing a list and count how many times each item in that list occurs in a target file. Something like:

list.txt
blonde
red
black

target.txt
bob blonde male
sam blonde female

desired_output.txt
blonde 2
red 0
black 0

I have used the following code to get the values that are present in target.txt:

awk '{count[$2]++} END {for (word in count) print word, count[word]}' target.txt

But the output does not include the desired items that are in list.txt but not in target.txt:

current_output.txt
blonde 2

I have tried a few things to get this working, including:

awk '{word[$1]++;next;count[$2]++} END {for (word in count) print word, count[word]}' list.txt target.txt

However, I have had no success.

Can anyone help me make this awk statement read the list.txt file? Any explanation of the code would also be much appreciated. Thanks!

【Question comments】:

Tags: linux bash awk

【Solution 1】:

awk '
  NR==FNR{a[$0]; next}
  {
    for(i=1; i<=NF; i++){
      if ($i in a){ a[$i]++ }
    }
  }
  END{
    for(key in a){ printf "%s %d\n", key, a[key] }
  }
' list.txt target.txt

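NR==FNR{a[$0]; next} The condition NR==FNR is true only while the first file is being read, so the keys of array a are the lines of list.txt.
for(i=1; i<=NF; i++) For the second file, this loops over every field of each line.
if ($i in a){ a[$i]++ } This checks whether field $i exists as a key in array a. If it does, the value associated with that key (initially unset, which awk treats as zero in a numeric context) is incremented.
In END, we simply print each key, followed by its count a[key] and a newline (\n).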

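Output (the order of the lines may vary, because for(key in a) visits the keys in an unspecified order):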

blonde 2
red 0
black 0
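
If the output has to follow the order of list.txt, one possible variation is to store the words in a second, numerically indexed array and walk that array in END. This is only a sketch and not part of the original answer: the array names count and order, the counter n, and the scratch directory are assumptions made for illustration, and the sample files are recreated from the question.

```shell
# Sample data copied from the question, written to a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' blonde red black > "$tmpdir/list.txt"
printf '%s\n' 'bob blonde male' 'sam blonde female' > "$tmpdir/target.txt"

ordered=$(awk '
  # First file: start each word at 0 and remember the position it appeared at.
  NR==FNR { count[$0] = 0; order[++n] = $0; next }
  # Second file: bump the counter of every field that is a listed word.
  { for (i = 1; i <= NF; i++) if ($i in count) count[$i]++ }
  # Walk the words by index so the output follows the order of list.txt.
  END { for (j = 1; j <= n; j++) print order[j], count[order[j]] }
' "$tmpdir/list.txt" "$tmpdir/target.txt")

printf '%s\n' "$ordered"
rm -rf "$tmpdir"
```

With the sample data this prints blonde 2, red 0 and black 0, in that order. Note that a word listed twice in list.txt would be printed twice by this sketch.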

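Notes:

Because of %d, the printf statement forces a[key] to be converted to an integer in case it is still unset. The whole statement can be replaced with the simpler print key, a[key]+0. I missed that when writing the answer, but now you know two ways of doing the same thing. ;)
In your attempt, for some reason you only addressed field 2 ($2) and ignored the other columns.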
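
Both notes can be illustrated together. The sketch below assumes the words should only be matched in the second column, as in the question's attempt, and uses print key, a[key]+0 instead of printf. In that attempt the unconditional next ran for every line of both files, so count[$2]++ was never reached; here next is guarded by NR==FNR. The third line of target.txt is a hypothetical addition, not from the question, included only to show a listed word (red) in a column that is ignored. The output is piped through sort because the for(key in a) order is unspecified.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' blonde red black > "$tmpdir/list.txt"
# The last line is hypothetical: "red" sits in column 1, "black" in column 2.
printf '%s\n' 'bob blonde male' 'sam blonde female' 'red black male' > "$tmpdir/target.txt"

second_only=$(awk '
  # First file: register each listed word as a key with an unset value.
  NR==FNR { a[$0]; next }
  # Second file: count only when the second field is a listed word.
  $2 in a { a[$2]++ }
  # Adding 0 turns a still-unset value into the number 0 when printed.
  END { for (key in a) print key, a[key]+0 }
' "$tmpdir/list.txt" "$tmpdir/target.txt" | LC_ALL=C sort)

printf '%s\n' "$second_only"
rm -rf "$tmpdir"
```

Here red is reported as 0 even though it occurs in the hypothetical third line, because that occurrence is in the first column; the loop over all fields in the answer would count it.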

【Comments】: