Extract text block between slashes: answers

【Question title】: Extract text block between slashes
【Posted】: 2018-10-19 03:28:41
【Question】:

I have a file with the following content:

/interface bridge
add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\
    bridge-vlan18 protocol-mode=none
/interface ethernet
set [ find default-name=ether3 ] comment="Ether3" \
    name=cobre
set [ find default-name=ether5 ] comment="Ether5" disabled=\
    yes name=cobre2
/interface bridge port
add bridge=bridge-vlan18 interface=vlan18.bonding-ptp3
add bridge=bridge-vlan18 interface=vlan18.ether13
add bridge=bridge-vlan18 interface=vlan18.eoip-funes-vlan18
add bridge=bridge-vlan18 comment="VLAN18" \
    interface=eoip-orono-vlan18 path-cost=20
add bridge=bridge-vlan18 interface=vlan18.bonding-ptp5
/ip firewall connection tracking
    set tcp-established-timeout=10m

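I need to extract each block of text that runs from one line starting with "/" up to the next one, without including that second slash line. I tried the following (to extract the "/interface bridge" block):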

sed -n "/^\/interface\ bridge\s$/,/^\//p" file.txt

But I get:

/interface bridge
add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\
    bridge-vlan18 protocol-mode=none
/interface ethernet

I need to get:

/interface bridge
add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\
    bridge-vlan18 protocol-mode=none

This has to be done with native Linux tools (grep, sed, awk, etc.). Any suggestions on how to do it?

【Question discussion】:

Tags: linux bash sed

【Solution 1】:

Another awk:

$ awk 'f && /^\//{exit} /^\/interface bridge\r?$/{f=1} f' file

/interface bridge
add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\
    bridge-vlan18 protocol-mode=none

This starts printing at the /interface bridge line and, once it is printing, exits as soon as it sees the next line that starts with /.
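
As a minimal sketch of the same flag technique, the header can be passed in as a variable and compared as an exact string rather than as a regex. The variable name `sec`, the temporary file, and the shortened sample data are assumptions made for illustration; a trailing carriage return is stripped before the comparison so that CRLF files also match.

```shell
# Sample input: a shortened version of the file from the question,
# with "/interface bridge port" placed first on purpose.
tmp=$(mktemp)
printf '%s\n' \
  '/interface bridge port' \
  'add bridge=bridge-vlan18 interface=vlan18.ether13' \
  '/interface bridge' \
  'add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\' \
  '    bridge-vlan18 protocol-mode=none' \
  '/interface ethernet' \
  'set [ find default-name=ether3 ] comment="Ether3" \' \
  '    name=cobre' > "$tmp"

out=$(awk -v sec='/interface bridge' '
  # Copy of the line without a trailing carriage return, used for comparison.
  { line = $0; sub(/\r$/, "", line) }
  # Already printing and a new header appears: stop.
  f && /^\// { exit }
  # Exact header match: switch printing on.
  line == sec { f = 1 }
  # Print the current line while the flag is set.
  f
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Because the header is compared with `==`, the "/interface bridge port" block is not selected even when it appears before "/interface bridge".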

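【Discussion】:

This fails if the "interface bridge port" record comes before "interface bridge".
Adding "+$" does not work; I get no results. Without "+$" it works fine for me.
Is there a special character after the text? Run cat -A file to see the invisible characters. Maybe a tab?
Running "cat -A file" shows: "/interface bridge^M$"
OK, you have carriage returns; change `+$` to \r?$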
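
The carriage-return check discussed above can also be scripted. The following is a small sketch, assuming a throwaway sample file; it counts the lines that end in a carriage return and then writes a copy with Unix line endings. Note that `tr -d '\r'` removes every carriage return in the file, not only those at the end of a line.

```shell
# Two-line sample with Windows (CRLF) line endings.
tmp=$(mktemp)
printf '/interface bridge\r\nadd comment="Bridge VLAN18"\r\n' > "$tmp"

# Count the lines that end in a carriage return (0 means Unix line endings).
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$tmp")
echo "lines ending in CR: $crlf_lines"

# Write a copy without carriage returns and show its first line.
tr -d '\r' < "$tmp" > "$tmp.unix"
first=$(head -n 1 "$tmp.unix")
echo "$first"

rm -f "$tmp" "$tmp.unix"
```

Running the extraction commands on the converted copy avoids having to add \r? to every pattern.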

【Solution 2】:

With cut:

cut -z -d '/' -f -2 file.txt

【Discussion】:

With cut 8.22 I get this error: "cut: invalid option -- 'z'"

【Solution 3】:

Blocks like these are what awk calls records, so in awk this gives:

awk 'BEGIN{RS="/";ORS=""}/^interface bridge/{print RS $0}' file

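Here we set the built-in variable RS to /. RS is the record separator. The output record separator ORS is set to the empty string because each record already ends with a newline character.

The statement above reads: if the record starts with interface bridge, print the record preceded by the record separator RS.

But this matches any record that starts with the string "interface bridge", including "interface bridge port". A slightly cleaner version is: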

awk 'BEGIN{RS="/";ORS=""; FS=" *\n *"}
           ($1=="interface bridge"){print RS $0}' file

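Here we also split the record into fields, which are the lines of the record. The statement above reads: if the first field of the record equals interface bridge, print the record preceded by the record separator RS.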
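
The asker's file has carriage returns (see the cat -A output under Solution 1), and with CRLF line endings the first field becomes "interface bridge" plus a trailing carriage return, so the equality test would probably never succeed. The following is a hedged sketch of one way to tolerate that, by letting the field separator absorb an optional \r. The sample data and the final `tr` step are assumptions for illustration. As with the original, a slash anywhere inside a block would also split the record.

```shell
# Shortened sample with CRLF line endings.
tmp=$(mktemp)
printf '%s\r\n' \
  '/interface bridge' \
  'add comment="Bridge VLAN18" name=bridge-vlan18' \
  '/interface bridge port' \
  'add bridge=bridge-vlan18 interface=vlan18.ether13' > "$tmp"

# Records are separated by "/"; fields are the lines of a record.
# The field separator also swallows an optional carriage return.
out=$(awk 'BEGIN { RS = "/"; ORS = ""; FS = " *\r?\n *" }
           $1 == "interface bridge" { print RS $0 }' "$tmp" | tr -d '\r')

printf '%s\n' "$out"
rm -f "$tmp"
```

Only the exact "interface bridge" record is printed; "interface bridge port" has a different first field and is skipped.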

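【Discussion】:

I tested it and it works well, but it also shows me the "interface bridge port" block. It should be an exact match.
@jeb76 Yes, I see. Note that Karafka's solution also gives the wrong result if the interface bridge port section comes before interface bridge.
The second example does not work for me @kvantour; running it gives no results.

【Solution 4】:

You can use the x command in sed.

x: exchange the contents of the hold space and the pattern space.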

$ sed -n -e '/\/interface bridge$/,/^\//{x;p}' inputFile

/interface bridge
add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\
    bridge-vlan18 protocol-mode=none

Note that it adds an extra blank line at the beginning.

Or this:

$ sed -n -e '/\/interface bridge$/,/^\//{/\/interface ethernet$/d;p}' inputFile
/interface bridge
add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\
    bridge-vlan18 protocol-mode=none
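
The second command has to name the header that follows the wanted block. A possible variant, sketched here under the assumption of Unix line endings, keeps the same address range but prints only the starting header and the lines that do not begin with /, so the closing header does not need to be known in advance. The temporary file and the shortened sample are illustrative.

```shell
tmp=$(mktemp)
printf '%s\n' \
  '/interface bridge' \
  'add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\' \
  '    bridge-vlan18 protocol-mode=none' \
  '/interface ethernet' \
  'set [ find default-name=ether3 ] comment="Ether3" \' \
  '    name=cobre' \
  '/interface bridge port' \
  'add bridge=bridge-vlan18 interface=vlan18.ether13' > "$tmp"

# Inside the range: print the starting header itself, then every line
# that does not begin with "/". The closing header is never printed.
out=$(sed -n '/^\/interface bridge$/,/^\//{/^\/interface bridge$/p;/^\//!p;}' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Anchoring the start address with both ^ and $ keeps "/interface bridge port" from opening a range.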

【Discussion】:

It works fine! Which tool (awk or sed) is fastest?

【Solution 5】:

This might work for you (GNU sed):

sed '/^\/interface bridge\s*$/{:a;n;/^\//!ba};d' file

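If a line matches /interface bridge (optionally followed by whitespace), print it and replace it with the next line. If that line does not start with /, print it and repeat until a line does start with /, which is then deleted. All other lines are deleted as well.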
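
The same script can be laid out one command per line, which makes the loop easier to follow. This is only a sketch: it swaps the GNU-only \s for the POSIX class [[:space:]] (an assumption made for portability, not part of the original answer), and the sample file is illustrative.

```shell
tmp=$(mktemp)
printf '%s\n' \
  '/interface bridge' \
  'add comment="Bridge VLAN18" fast-forward=no mtu=1500 name=\' \
  '    bridge-vlan18 protocol-mode=none' \
  '/interface ethernet' \
  'set [ find default-name=ether3 ] comment="Ether3" \' \
  '    name=cobre' \
  '/interface bridge port' \
  'add bridge=bridge-vlan18 interface=vlan18.ether13' > "$tmp"

out=$(sed '
# Wanted header found: enter the loop.
/^\/interface bridge[[:space:]]*$/{
  :a
  # Print the current line and load the next one.
  n
  # Not a new header yet: loop again.
  /^\//!ba
}
# Everything that reaches this point is dropped, including the next header.
d
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Because the header pattern is anchored at both ends, the "/interface bridge port" line does not start the loop.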

【Discussion】:

It also shows the "interface bridge port" block. It should be an exact match.