sed regex command to output lines that end with html? Answers

【Question title】: sed regex command to output lines that end with html?
【Posted】: 2019-10-18 00:51:05
【Question description】:

I need a sed regex command that outputs every line that ends with "html" after a period (that is, ".html") and does not start with "a".

Would my current code work?

sed 's/\[^a]\*\.\(html)\/p' text.txt

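【Question discussion】:

Welcome to Stack Overflow. I don't think this command works. But here is a good approach: try writing one command that prints every line that does not start with "a", and another that prints every line that ends with "html". Each of these is much easier than your current problem, and once you have them you will find it easy to combine them.
Why is sed a requirement? Is this part of a larger sed program? Are you modifying existing code? More context would let us give better answers.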

Tags: regex linux bash unix sed

【Solution 1】:

The sed command is

sed -n '/^[^a].*html$/p'

But the canonical command for printing matching lines is grep:

grep '^[^a].*html$'
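
A quick way to check that the two commands agree is to run both on the same small input. The sketch below assumes a throwaway sample file whose contents are invented for illustration; it is not taken from the question.

```shell
# Hypothetical sample input, invented for this illustration.
sample=$(mktemp)
printf '%s\n' 'index.html' 'about.html' 'notes.txt' 'page.html ' 'xhtml' > "$sample"

# sed: -n suppresses automatic printing, p prints only the matching lines.
sed_out=$(sed -n '/^[^a].*html$/p' "$sample")

# grep: prints matching lines by default.
grep_out=$(grep '^[^a].*html$' "$sample")

printf '%s\n' "$sed_out"
rm -f "$sample"
```

On this input both commands select index.html and xhtml. The second hit shows that the pattern as written does not require a period before "html", and the line with a trailing space is skipped because `$` anchors at the very end of the line.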

【Discussion】:

What if the line contains only html?
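
To illustrate the concern: `^[^a].*html$` needs at least one character before "html", so a line that is exactly "html" is skipped. One possible way around that, sketched here on made-up input, is to state the two conditions separately inside sed rather than in a single pattern.

```shell
# Made-up input: the first line is exactly "html".
lines=$(printf '%s\n' 'html' 'index.html' 'a.html')

# Single pattern: requires a non-"a" character before "html".
single=$(printf '%s\n' "$lines" | sed -n '/^[^a].*html$/p')

# Two conditions: skip lines starting with "a", then print those ending in "html".
split=$(printf '%s\n' "$lines" | sed -n '/^a/!{/html$/p;}')

printf 'single:\n%s\nsplit:\n%s\n' "$single" "$split"
```

Here the single pattern keeps only index.html, while the two-condition form also keeps the bare html line. Which behaviour is wanted depends on how the requirement is read.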

【Solution 2】:

Sed overcomplicates things... you can handle this easily with grep!

egrep '^[^a].+\.html$' sourcefile > text.txt
# egrep opens sourcefile itself
# (the file is a plain operand; -f would read patterns, not input, from it)

egrep '^[^a].+\.html$' < sourcefile > text.txt
# the shell redirects stdin to sourcefile for this stage of the pipeline

The two are functionally equivalent.
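
A small sketch of that equivalence, assuming an invented input file and using `grep -E` (the current spelling of egrep): the file is passed once as an operand and once through a stdin redirect.

```shell
# Hypothetical input file with invented contents.
src=$(mktemp)
printf '%s\n' 'index.html' 'a.html' 'readme.md' > "$src"

# grep opens the file itself.
by_operand=$(grep -E '^[^a].+\.html$' "$src")

# The shell opens the file and hands it to grep as standard input.
by_stdin=$(grep -E '^[^a].+\.html$' < "$src")

[ "$by_operand" = "$by_stdin" ] && echo "same output"
rm -f "$src"
```

With a single input file the matched lines are identical either way; the forms only start to differ once several files are named, because grep then prefixes each match with its file name.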

Or

some_command | xargs -n1 egrep '^[^a].+\.html$' > text.txt

# some_command stands for whatever writes to the pipe
# xargs -n1 runs the command once per input item, passing that item as an
#   argument, so egrep treats each item as the name of a file to search;
#   to filter the piped lines themselves, pipe straight into egrep instead
# ^ anchors the match at the start of the line
# . matches any one character
# + means the preceding expression (which can be a (group), a [range], etc.)
#   one or more times
# \. escapes the wildcard so that it matches a literal period
# $ anchors the match at the end of the line
egrep is short for extended regular expression matching (it behaves like grep -E), which is really nice (assuming you are not using pipes or cat, etc.).
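
To make the individual regex pieces concrete, here is a rough comparison of a loose pattern and the stricter one from this answer, run over a handful of invented lines with `grep -E`.

```shell
# Invented test lines.
lines=$(printf '%s\n' 'index.html' 'xhtml' '.html' 'b.html' 'about.html')

# No escaped dot: any line ending in "html" whose first character is not "a".
loose=$(printf '%s\n' "$lines" | grep -E '^[^a].*html$')

# Escaped dot and +: a non-"a" first character, then one or more
# characters, then a literal ".html".
strict=$(printf '%s\n' "$lines" | grep -E '^[^a].+\.html$')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The stricter pattern keeps only index.html. It rejects b.html because `[^a]` and `.+` together demand at least two characters before the period; using `.*` in place of `.+` would accept it.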

You can turn a newline-delimited file into a single line of input with the following command:

tr '\n' ' ' < file
# converts every newline into a space
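
As a minimal illustration on invented input: tr translates when given two sets, whereas `tr -d` takes a single set and deletes those characters, so the two-set form is the one used here.

```shell
# Invented three-line input; every newline becomes a space.
joined=$(printf '%s\n' 'one' 'two' 'three' | tr '\n' ' ')

# Brackets make the trailing space visible.
printf '[%s]\n' "$joined"
```

The final newline is translated too, so the result ends with a space.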

Either way, get creative with simple utilities and you can do a lot:

xargs, grep, and tr make a good combination that is easy to learn, with none of the tedium.

【Discussion】:

【Solution 3】:

Don't do this with sed. Do it with two separate grep invocations:

grep -v '^a' file.txt | grep 'html$'

The first grep selects all lines that do not start with "a" and sends its output to the second grep, which extracts all lines that end with "html".
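
The two stages can be inspected one at a time, which is part of the appeal. A small sketch on invented input, capturing the intermediate result in a variable:

```shell
# Invented input lines.
lines=$(printf '%s\n' 'index.html' 'apple.html' 'html' 'notes.txt')

# Stage 1: drop every line that starts with "a".
step1=$(printf '%s\n' "$lines" | grep -v '^a')

# Stage 2: of what is left, keep the lines that end with "html".
step2=$(printf '%s\n' "$step1" | grep 'html$')

printf '%s\n' "$step2"
```

Because the conditions are tested independently, a line that is exactly "html" passes both stages, which a single pattern such as `^[^a].*html$` would not allow.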

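【Discussion】:

What is the benefit of doing this with a pipe when it can be a single expression?
Because breaking things into separate parts makes them easier to understand.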