Grep multiple files with multiple unique words: answers

【Question title】: Grep multiple files with multiple unique words
【Posted】: 2019-08-11 08:59:30
【Question description】:

I am trying to extract lines from ~1200 files. What I have right now is a text file in the following format:

"1" "keyword1" "filename1"
"2" "keyword2" "filename2"
"3" "keyword3" "filename3"
"4" "keyword4" "filename4"
and so on.

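What I want to do is check filename "n" for lines containing keyword "n". I guess this can be done in a bash script with some kind of loop, along these lines (R-style pseudocode):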

for (i in 1:n){ 
grep "dataframe[i, 2]" dataframe[i,3]}

But I am really struggling to work out how to actually program this in a bash script, since I am used to R.

【Question comments】:

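Maybe something like this? while read -r id keyword file; do grep $keyword $file; done < inputFile
No. A shell (e.g. bash) is an environment from which to call tools, with a language to sequence those calls. Any time you find yourself writing a shell loop just to manipulate text files, you have the wrong approach. See unix.stackexchange.com/q/169716/133219 for some of the issues.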

Tags: linux bash grep

【Solution 1】:

Try this:

# Iterate over the list file, reading one line at a time
# and splitting each line into 3 columns
while read -r col1 col2 col3; do
  # Remove the leading and trailing quotes (") with sed
  pattern=$(sed -e 's/^"//' -e 's/"$//' <<<"$col2")
  file=$(sed -e 's/^"//' -e 's/"$//' <<<"$col3")
  echo "Matches in $file:"
  # Find matches with grep; -- protects patterns that start with a dash
  grep -- "$pattern" "$file"
  echo ""
done < list.txt

Add whatever grep options you want, e.g. -n for line numbers.
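
The loop above starts sed twice per list line. As a minimal sketch of the same idea, the quotes can be stripped with shell parameter expansion instead, and grep can be given extra options. The function name grep_from_list is hypothetical, and the choice of -H (print the file name), -n (print the line number) and -F (treat the keyword as a literal string rather than a regex) is an assumption about the desired output, not something the question specifies. It also assumes keywords contain no whitespace, as in the sample list.

```shell
# Hypothetical helper: grep each listed file for its own keyword.
# Expects the list format from the question: "id" "keyword" "filename"
grep_from_list() {
  local list="$1" id keyword file
  while read -r id keyword file; do
    # Strip one leading and one trailing double quote from each field
    keyword=${keyword#\"}; keyword=${keyword%\"}
    file=${file#\"}; file=${file%\"}
    # -H: prefix file name, -n: prefix line number, -F: literal match
    grep -HnF -- "$keyword" "$file"
  done < "$list"
  # A final grep without matches exits 1; do not report that as a failure
  return 0
}
```

It would be called as grep_from_list list.txt. Dropping -F restores regex matching, as in the original loop.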

【Comments】:

【Solution 2】:

All you need is:

awk -F'"' 'NR==FNR{ARGV[ARGC++]=$6; word[$6]=$4; next} $0 ~ word[FILENAME]' file

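It could easily be made more robust and/or more efficient, depending on your unstated requirements (e.g. what output you want, whether the same file can appear more than once with different words, whether the comparison should be a regexp or a string match, partial or full, etc.).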
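
As one possible illustration of such a variation, the sketch below assumes a literal substring comparison and grep-style "file:line:text" output. Both are assumptions, since the question does not state them, and the wrapper name grep_from_list_awk is hypothetical. Like the one-liner, it keeps only one word per file name, so a file listed twice with different words would be scanned twice for the last word only.

```shell
# Hypothetical wrapper around a variant of the awk one-liner.
# $1 is the list file in the format: "id" "keyword" "filename"
grep_from_list_awk() {
  awk -F'"' '
    # First input (the list): queue each listed file as a further input
    # and remember which word belongs to it
    NR == FNR { ARGV[ARGC++] = $6; word[$6] = $4; next }
    # Queued files: index() is a literal substring test, unlike ~
    index($0, word[FILENAME]) { print FILENAME ":" FNR ":" $0 }
  ' "$1"
}
```

It would be called as grep_from_list_awk list.txt. Because awk reads the files itself, no shell loop and no per-line process is involved.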

【Comments】: