Use grep to match and erase a pattern and its previous line in a large chunk of text: answers

【Question title】: Use grep to match and erase a pattern and its previous line in a large chunk of text
【Posted】: 2013-01-25 16:30:17
【Question】:

I have a very large text file containing data similar to the following:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

klutzy/NN little/JJ ##scene/NN## where/WRB 1/Z brave/JJ french/JJ man/NN refuse/VBZ to/TO sit/VB down/RP for/IN fear/NN of/IN be/VBG discover/VBN ./Fp
======>match found: \#\#\swhere\/WRB\s

I want to use grep to match and delete every block made up of a line of "text" that is immediately followed by a line starting with ======>match found:, as in:

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

and ending with a newline.

So, for the example above, I would like to run grep and get the following output:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

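I have tried: grep -E -v '^.+\n======>match found:.+$' file.txt

That is, as suggested here, I prepended the regex .+*\n to the pattern so that it would include the previous line, but it does not work. Any suggestions?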

【Comments】:

Have you tried pressing Enter instead of typing \n?
Possible duplicate of "Remove matching and previous line"

Tags: regex grep

【Answer 1】:

This sed command comes close to what you want:

$ sed -n 'N;/\n======>match found:/d; P;D' textfile 
he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT


succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp
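
For readers less familiar with these sed commands, here is a hedged sketch of a closely related variant; it is not taken from the answer itself. It assumes a POSIX-style sed, and the file name sample.txt and its contents are made up for the demo. It uses $!N instead of N and drops -n, so the final line is still printed when the file does not end with a marker line. With -n and a plain N, that last line can be lost, because sed stops as soon as N finds no further input.

```shell
# Build a small sample; the file name sample.txt is a placeholder.
cat > sample.txt <<'EOF'
first/NN line/NN
======>match found: x

keep/VB this/DT
last/JJ line/NN
EOF

# $!N   : append the next line to the pattern space, except on the last line
# /../d : if the second line of the two-line window is a marker, drop both
# P;D   : otherwise print the first line, delete it, and slide the window
result=$(sed '$!N;/\n======>match found:/d;P;D' sample.txt)
printf '%s\n' "$result"
```

On this sample the marker line and the line above it are removed, while the blank line, keep/VB this/DT and last/JJ line/NN are kept.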

【Comments】:

【Answer 2】:

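Multi-line matching with grep is complicated by the fact that traditional grep implementations consider only one line at a time, so adding \n to your pattern has no useful effect.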
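
A small illustration of that line-at-a-time behaviour, using only the standard grep options -c and -v. The file name two_lines.txt and its contents are made up for the demo. A single-line pattern can find and remove the marker line, but grep has no way to take the preceding line with it.

```shell
# Two-line sample: a text line followed by a marker line.
printf '%s\n' 'craft/NN of/IN' '======>match found: x' > two_lines.txt

# A single-line pattern matches the marker line on its own.
marker_count=$(grep -c '^======>match found:' two_lines.txt)

# -v with that pattern removes only the marker; the line above it stays.
remaining=$(grep -v '^======>match found:' two_lines.txt)

echo "$marker_count"
echo "$remaining"
```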

If you have pcregrep available, you can use its -M flag for multi-line matching:

pcregrep -Mv '^.+\n======>match found:.+$' file.txt

Output:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT


succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp
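
Both outputs above still contain the blank line that followed each removed marker, so two blank lines end up next to each other. As an alternative that is not part of the original answers, the following awk sketch buffers the previous line and also skips one blank line directly after a marker. The file name sample_data.txt and the variable names are assumptions; the sample is the data from the question.

```shell
# Sample data from the question; the file name is a placeholder.
cat > sample_data.txt <<'EOF'
he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

klutzy/NN little/JJ ##scene/NN## where/WRB 1/Z brave/JJ french/JJ man/NN refuse/VBZ to/TO sit/VB down/RP for/IN fear/NN of/IN be/VBG discover/VBN ./Fp
======>match found: \#\#\swhere\/WRB\s
EOF

cleaned=$(awk '
  # Marker line: forget the buffered previous line and skip the marker.
  /^======>match found:/ { have_prev = 0; skip_blank = 1; next }
  # Drop a single blank line that directly follows a marker.
  skip_blank && $0 == "" { skip_blank = 0; next }
  { skip_blank = 0 }
  # Any other line: the buffered line is now safe to print.
  have_prev { print prev }
  { prev = $0; have_prev = 1 }
  # Flush the last buffered line, if one is still pending.
  END { if (have_prev) print prev }
' sample_data.txt)

printf '%s\n' "$cleaned"
```

Each line is held back for one step, so it can still be discarded if the next line turns out to be a marker. On the question's sample this leaves the he/PRP line, one blank line and the succeed/VBN line.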

【Comments】: