AWK process data until next match: answers

【Question title】: AWK process data until next match
【Posted】: 2018-10-31 21:29:38
【Question description】:

I am trying to process a file with awk. Sample data:

   233;20180514;1;00;456..;m
   233;1111;2;5647;6754;..;n
   233;1111;2;5647;2342;..;n
   233;1111;2;5647;p234;..;n
   233;20180211;1;00;780..;m
   233;1111;2;5647;3434;..;n
   233;1111;2;5647;4545;..;n
   233;1111;2;5647;3453;..;n

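The requirement: for every record that matches "1;00;", I need to copy its second column onto the records that follow it, up to the next "1;00;" match. From that next match on, its second column is carried forward in the same way, until the following "1;00;" match. The matching pattern "1;00;" can also vary; it could be, say, "2;20;". In that case I need to copy the second column until either a "1;00;" or a "2;20;" match appears.

I could do this with a while loop, but I really need to do it with awk or sed, because the file is huge and a while loop may take a very long time.

Expected output: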

   233;20180514;1;00;456..;m
   233;20180514;1111;2;5647;6754;..;n+1
   233;20180514;1111;2;5647;2342;..;n+1
   233;20180514;1111;2;5647;p234;..;n+1
   233;20180211;1;00;780..;m
   233;20180211;1111;2;5647;3434;..;n+1
   233;20180211;1111;2;5647;4545;..;n+1
   233;20180211;1111;2;5647;3453;..;n+1

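Thanks in advance.

【Comments on the question】:

How do we know that 1111;2 is not a valid match, given that you say the pattern could also be 2;20?

Tags: linux awk sed sh

【Solution 1】: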

EDIT: Since the OP has changed the sample Input_file in the question, I am now adding code that follows the new sample.

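awk -F";" '
length($2)==8 && !($3=="1" && $4=="00"){   ##An 8-character 2nd field on a non-marker line ends the current group.
   flag=""}
($3=="1" && $4=="00"){   ##Marker line: 3rd field is 1 and 4th field is 00.
   val=$2;               ##Remember its 2nd field.
   flag=1;
   print;                ##Print the marker line unchanged, as in the expected output.
   next
}
flag{
   $2=val OFS $2;        ##Prefix the remembered value to the 2nd field.
   $NF=$NF"+1"           ##Append +1 to the last field.
}
1
' OFS=";"  Input_file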

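Basically, this compares the fields themselves instead of matching the text ;1;00; anywhere in the line: a line whose 2nd field is 8 characters long but whose 3rd and 4th fields are not 1 and 00 clears the flag, and a line whose 3rd and 4th fields are 1 and 00 starts a new group.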
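The question also mentions that the marker may vary (for example "2;20" as well as "1;00"). A minimal sketch of one way to parameterize that follows. It assumes the marker is always the pair made of the 3rd and 4th fields, and the function name `carry_marker_field` and its comma-separated argument format are hypothetical names chosen for illustration. Unlike the command above, it does not reset on 8-character 2nd fields; it only carries the value until the next marker line.

```shell
# Hypothetical helper (not from the original answer).
# $1: comma-separated list of "field3;field4" marker pairs, default "1;00".
carry_marker_field() {
  awk -F';' -v OFS=';' -v markers="${1:-1;00}" '
    BEGIN {
      # Build a lookup table of marker pairs, e.g. is_marker["1;00"].
      n = split(markers, m, ",")
      for (i = 1; i <= n; i++) is_marker[m[i]] = 1
    }
    # Marker line: remember its 2nd field and print the line unchanged.
    (($3 ";" $4) in is_marker) { val = $2; have = 1; print; next }
    # Lines after a marker: tag the last field, then prefix the saved value.
    have { $NF = $NF "+1"; $2 = val OFS $2 }
    # Lines before the first marker fall through and are printed as they are.
    { print }
  '
}
```

Possible usage: `carry_marker_field '1;00,2;20' < Input_file`. Because the data is read in a single pass and only one value is kept in memory, this should remain practical for large files.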

If your actual Input_file is the same as the sample shown, then the following may help you.

awk -F";" 'NF==5 || !/pay;$/{flag=""} /1;00;$/{val=$2;$2="";sub(/;;/,";");flag=1} flag{$2=val OFS $2} 1' OFS=";"  Input_file

Explanation:

awk -F";" '         ##Set the field separator to a semicolon for all lines.
NF==5 || !/pay;$/{  ##If the line has 5 fields OR does NOT end with pay; then do the following.
  flag=""}          ##Reset the variable flag to an empty value.
/1;00;$/{           ##If the line ends with the string 1;00; then do the following:
  val=$2;           ##Store $2 (the 2nd field of the current line) in the variable val.
  $2="";            ##Empty the 2nd column of the current line.
  sub(/;;/,";");    ##Replace the 2 consecutive semicolons with a single one to remove the empty 2nd column.
  flag=1}           ##Set the variable flag to 1.
flag{               ##If the variable flag is set then do the following.
  $2=val OFS $2}    ##Rebuild $2 as val OFS $2, i.e. prepend the saved value to the 2nd column of the current line.
1                   ##awk works on a condition-then-action model; 1 is an always-true condition with no action, so the default action (printing the line) runs.
' OFS=";" Input_file ##Set OFS to a semicolon and name the Input_file to read.
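The two steps `$2=""` and `sub(/;;/,";")` rely on awk rebuilding the whole line with OFS whenever a field is assigned. A small sketch on a made-up three-field line shows the intermediate results; the function name `demo_drop_field` and the input `a;b;c` are illustrative only.

```shell
# Illustrative only: shows how emptying a field and collapsing ";;" removes a column.
demo_drop_field() {
  printf 'a;b;c\n' | awk -F';' -v OFS=';' '{
    $2 = ""            # assigning a field rebuilds $0 using OFS
    print              # prints a;;c
    sub(/;;/, ";")     # with no target given, sub works on $0
    print              # prints a;c
  }'
}
```

After the `sub`, awk splits the modified line into fields again, so what used to be the 3rd field is now `$2`.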

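【Discussion】:

Thanks a lot, Ravinder. Your answer does solve my example. However, the file may contain more than 5 fields, and pay was just an example; the last column can hold any value. I will update the sample set. Could you help me understand the logic?
@Madie, sure, an explanation has been added now. You could also drop the NF condition and use something like !/pay;$/ && length($2)==8 instead, but that will depend on your Input_file too. Let me know how it goes?
@Madie, ahh, you have changed the sample input a lot; let me fix my code now and update my post.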