Search the Nth field in each line for a string, then append a value to the end of each line

【Question title】: Search Nth field in each line for string, then append a value to end of each line
【Posted】: 2019-04-06 04:06:53
【Question description】:

I am creating a shell script that analyzes an input file and sends the results to an output file. Here is a sample from the input file:

01,Rome,30,New York,70,
02,Los Angeles,5,New York,50,
03,New York,40,Tokyo,20,
04,Paris,5,New York,40,
05,New York,20,London,30,
06,Seattle,20,New York,40,
07,Chicago,10,New York,30,
08,New York,20,Miami,40,

This is the result I need in the output file:

01,Rome,30,New York,70,4th,40,
02,Los Angeles,5,New York,50,4th,45,
03,New York,40,Tokyo,20,2nd,20,
04,Paris,5,New York,40,4th,35,
05,New York,20,London,30,2nd,-10,
06,Seattle,20,New York,40,4th,20,
07,Chicago,10,New York,30,4th,20,
08,New York,20,Miami,40,2nd,-20,

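Fields are separated by commas.

I plan to search the second field of each line for the string "New York". If it matches, add a 6th field with the value "2nd"; if it does not, add a 6th field with the value "4th".

I then plan to do a subtraction using the values in the 3rd and 5th fields. If the string in the 6th field is "4th", subtract the 3rd field from the 5th field. If the string in the 6th field is "2nd", subtract the 5th field from the 3rd field. The result of the calculation needs to become the 7th field of each line.

I have tried combinations of awk, sed, grep, echo and bc, but I feel like I am overthinking it. Any suggestions?

Edit: my progress so far. I suspect that evaluating and appending to each line individually is going to be inefficient.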

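    while IFS= read -r line; do
         # Feed the current line to cut; a bare cut would read the rest of input.txt itself.
         printf '%s\n' "$line" | cut -f2 -d ","
    done < input.txt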

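This prints the second field of each line, but I am struggling to evaluate the string and append to the line inside the loop. For the subtraction part, my plan is to use echo and pipe the values to bc, but I am currently stuck on the first step.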

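【Question discussion】:

Welcome to SO. Stack Overflow is a question and answer site for professional and enthusiast programmers. The goal is that you add some code of your own to your question, to show at least the research effort you made to solve this yourself.
This looks easy for awk. Use the comma as the field separator, then write the column comparison: if ($2 == "New York") { $6 = "2nd" } else { $6 = "4th" }. It looks simple enough.
@KamilCuk Indeed, awk seems to be the simplest. awk -F "," '{if ($2 == "New York") { $6 = "2nd" } else { $6 = "4th" } print $1","$2","$3","$4","$5","$6"," }' input.txt worked for me for the first part. As for the subtraction part, is it possible to set multiple variables inside an if statement? I tried awk -F "," '{if ($2 == "New York") { $6 = "2nd" && NYpoints=$3 && OtherPoints=$5 } else { $6 = "4th" && NYpoints=$5 && OtherPoints=$3 } print $1","$2","$3","$4","$5","$6","($NYpoints - $OtherPoints) }' input.txt, but this changes the 6th field to an integer.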
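
For reference, a minimal awk sketch of the whole task, along the lines the comments are heading. Two things differ from the attempt above: statements inside a block are separated with `;` rather than `&&` (a logical operator, which would explain the 6th field turning into a number), and awk variables are read without `$`, since `$NYpoints` means "the field whose number is stored in NYpoints". The function name `append_place_and_diff` and the variable names `place` and `diff` are made up for this illustration.

```shell
# Hypothetical helper: reads comma-separated lines from the given files or stdin.
append_place_and_diff() {
    awk -F, '{
        if ($2 == "New York") { place = "2nd"; diff = $3 - $5 }
        else                  { place = "4th"; diff = $5 - $3 }
        # Every sample line already ends with a comma, so the new fields are simply appended.
        print $0 place "," diff ","
    }' "$@"
}

# Demo on two of the sample lines.
printf '%s\n' '01,Rome,30,New York,70,' '03,New York,40,Tokyo,20,' | append_place_and_diff
```

With the real data this would be called as `append_place_and_diff input.txt > output.txt`. It assumes, as in the sample, that every line ends with a trailing comma.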

Tags: bash shell

【Solution 1】:

I think awk is the simplest tool for this job; here is an alternative using sed:

sed -r 's/.*,New York,([0-9]*),.*,([0-9]*),/echo "&2nd,$((\1 - \2))"/e; 
        s/.*,.*,([0-9]*),New York,([0-9]*),/echo "&4th,$((\2 - \1))"/e' input.txt

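Edit, explanation: when you change /e; to /; and remove the final e, you can see more clearly what is happening.
The part that matches input lines with New York as the second field: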

.*,       # First field. It will not eat the whole line, because
          # the rest of the line must match too. 
New York, # Match on the second field
([0-9]*), # The match on the number in parentheses, so it can be used later.
.*,       # next field
([0-9]*), # Second string to remember. I edited the answer: at first I had `([0-9]*).`,
          # which worked (`.` is a wildcard), but `,` is better.

To do the calculation we need the shell. The shell can calculate without bc, using something like echo "$((8 - 5))". The replacement string will be something that can be executed.

echo "..." # Command to echo the things in quotes
&          # Replace with the complete match, in our case the complete line
2nd,       # Nothing special here.
$((...))   # Perform calculation
\1         # Replace with first remembered match (between parentheses)
\2         # Replace with second remembered match (between parentheses)

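GNU sed supports the /e flag, which executes the result. (Do not try to set variables with /e: the command runs in a subshell, and the variables are lost once it has finished.)
Repeat the construction above for New York as the fourth field.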
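
To make the "drop the e" suggestion concrete, here is a small sketch that runs the first substitution without the e flag, so sed prints the command it would otherwise execute. The function name `show_generated_command` is made up for this illustration, and `-E` is used in place of `-r` (the two are equivalent in GNU sed).

```shell
# Hypothetical helper: same first substitution as in the answer, minus the e flag.
show_generated_command() {
    sed -E 's/.*,New York,([0-9]*),.*,([0-9]*),/echo "&2nd,$((\1 - \2))"/'
}

printf '%s\n' '03,New York,40,Tokyo,20,' | show_generated_command
# prints: echo "03,New York,40,Tokyo,20,2nd,$((40 - 20))"
```

With the e flag, that printed text is handed to the shell, and its output replaces the line. This also means the input lines end up inside a shell command, so the approach is only reasonable for input you trust.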

【Discussion】:

【Solution 2】:

First replace the spaces in the file, because that makes it easier to work with:

sed 's/ /_/g' inputfile > tmp && mv tmp inputfile

Then define a test variable:

test=New_York

Now the main procedure:

for i in $(cat inputfile)
do
  if [[ $(echo "$i" | cut -d',' -f2) == "$test" ]]
  then
    int1=$(echo "$i" | cut -d',' -f5)
    int2=$(echo "$i" | cut -d',' -f3)
    result=$(expr "$int2" - "$int1")
    echo $i | sed "s/$/2nd,$result/g" >> outputfile
  else
    int1=$(echo "$i" | cut -d',' -f3)
    int2=$(echo "$i" | cut -d',' -f5)
    result=$(expr "$int2" - "$int1")
    echo $i | sed "s/$/4th,$result/g" >> outputfile
  fi
done

If you want to put the spaces back into the file:

sed 's/_/ /g' outputfile > tmp && mv tmp outputfile

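【Discussion】:

If you really want to use a loop (you shouldn't), use while IFS=, read -r f1 f2 f3 f4 f5; do .. done < inputfile.
Thanks, I ended up using this approach.
I figured you wanted to practice using conditionals or something like that. c: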
DontReadLinesWithFor
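
The `while IFS=, read -r` loop suggested in the comments could look roughly like the sketch below. Because the line is split on commas only, the spaces in "New York" need no underscore workaround. The function name `annotate_with_loop` and the extra variable `rest` are made up for this illustration. `rest` catches anything after the fifth comma so that `f5` holds just the number.

```shell
# Hypothetical helper: reads comma-separated lines from stdin.
annotate_with_loop() {
    while IFS=, read -r f1 f2 f3 f4 f5 rest; do
        if [ "$f2" = "New York" ]; then
            # New York is the 2nd field: 3rd field minus 5th field.
            printf '%s,%s,%s,%s,%s,2nd,%s,\n' "$f1" "$f2" "$f3" "$f4" "$f5" "$((f3 - f5))"
        else
            # Otherwise New York is taken to be the 4th field: 5th minus 3rd.
            printf '%s,%s,%s,%s,%s,4th,%s,\n' "$f1" "$f2" "$f3" "$f4" "$f5" "$((f5 - f3))"
        fi
    done
}

# Demo on two of the sample lines.
printf '%s\n' '02,Los Angeles,5,New York,50,' '08,New York,20,Miami,40,' | annotate_with_loop
```

For a file this would be `annotate_with_loop < inputfile > outputfile`. The shell arithmetic assumes the 3rd and 5th fields are plain integers without leading zeros, as in the sample data.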