How to read a value from a recursive XML attribute in Unix using only sed/awk/grep: answers

【Question title】: How to read a value from recursive xml attribute in Unix using sed/awk/grep only
【Posted】: 2017-07-21 14:10:18
【Question】:

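I have a config.xml. I need to retrieve the value of the attribute at the xpath /domain/server/name

I can only use grep/sed/awk. Any help is appreciated.

The content of the XML is below; I only need to retrieve the server names.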

<domain>
    <server>
        <name>AdminServer</name>
        <port>1234</port>
    </server>
    <server>
        <name>M1Server</name>
        <port>5678</port>
    </server>
    <machine>
        <name>machine01</name>
    </machine>
    <machine>
        <name>machine02</name>
    </machine>
</domain>

The output should be:

AdminServer
M1Server

I tried:

sed -ne '/<\/name>/ { s/<[^>]*>(.*)<\/name>/\1/; p }' config.xml

【Comments】:

sed -ne '/<\/name>/ { s/<[^>]*>(.*)<\/name>/\1/; p }' config.xml This command returns the values from all name attributes: AdminServer M1 machine01 machine02
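We don't have access to those tools and can't install them.
@jkalyanc, do you have python installed?
sed -ne '//,\||{ s/ ]*>(.*)/\1/; p}' config.xml . With this command the result is AdminServer1234M1Server5678. I would at least like to know whether I can take this output and get the server names from it?
@RomanPerekhrest No, we don't have python either. I'm actually in a tricky situation.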

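Tags: unix awk sed grep

【Solution 1】:

sed is for simple substitutions on individual lines; doing anything else with sed is a mental exercise, not real code. That is not what you are trying to do, so you shouldn't even be considering sed. Just use awk: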

$ awk -F'[<>]' 'p=="server" && $2=="name"{print $3} {p=$2}' file
AdminServer
M1Server

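This will work with any awk on any UNIX box. If that isn't all you need, edit your question to provide more truly representative sample input and expected output.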
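To make the field logic in the one-liner above easier to follow, here is a minimal sketch that recreates the question's sample and runs the same awk program with comments. The file name config.xml comes from the question; the shell variable server_names is only an illustrative name.

```shell
# Recreate the sample from the question (file name taken from the question).
cat > config.xml <<'EOF'
<domain>
    <server>
        <name>AdminServer</name>
        <port>1234</port>
    </server>
    <server>
        <name>M1Server</name>
        <port>5678</port>
    </server>
    <machine>
        <name>machine01</name>
    </machine>
    <machine>
        <name>machine02</name>
    </machine>
</domain>
EOF

# -F'[<>]' splits each line on "<" and ">". For the line
#   "        <name>AdminServer</name>"
# $1 is the indentation, $2 is "name", $3 is "AdminServer", $4 is "/name".
server_names=$(awk -F'[<>]' '
  p == "server" && $2 == "name" { print $3 }  # previous line opened <server>
  { p = $2 }                                   # remember the first tag on this line
' config.xml)

printf '%s\n' "$server_names"
```

Because the program only compares against the tag on the previous line, it assumes one tag per line and that <name> is the first child of <server>, as in the sample.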

【Solution 2】:

Try this command. Save your XML under a file name and pass that file as input.

awk '/<server>/,/<\/server>/' < name.xml | grep "name" | cut -d ">" -f2 | cut -d "<" -f1

Output:

AdminServer
M1Server

【Solution 3】:

Based on the sample Input_file you have shown, could you please try the following.

awk -F"[><]" '/<\/server>/{a="";next} /<server>/{a=1;next} a && /<name>/{print $3}'  Input_file
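The command above can be read as a small state machine: a flag is set on <server>, cleared on </server>, and <name> lines are printed only while the flag is set. The sketch below spells that out on the question's sample; the variable name in_server and the file name config.xml are illustrative choices, not part of the answer.

```shell
# Recreate the sample from the question (assumed file name: config.xml).
cat > config.xml <<'EOF'
<domain>
    <server>
        <name>AdminServer</name>
        <port>1234</port>
    </server>
    <server>
        <name>M1Server</name>
        <port>5678</port>
    </server>
    <machine>
        <name>machine01</name>
    </machine>
    <machine>
        <name>machine02</name>
    </machine>
</domain>
EOF

server_names=$(awk -F'[><]' '
  /<\/server>/ { in_server = 0; next }   # leaving a server block
  /<server>/   { in_server = 1; next }   # entering a server block
  in_server && /<name>/ { print $3 }     # name line inside a server block
' config.xml)

printf '%s\n' "$server_names"
```

Unlike a check against the previous line only, this form still finds <name> when other elements come before it inside <server>.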

【Solution 4】:

sed -n '/<server>/{n;s/\s*<[^>]*>//gp}' config.xml

For example, on the first match:

1. /<server>/
Matches the line that contains "<server>", here "    <server>".

2. n
The "n" command moves to the next line, so the pattern space becomes "        <name>AdminServer</name>".

3. s/\s*<[^>]*>//gp
Replaces every match of "\s*<[^>]*>" with the empty string, then prints the pattern space.

Type "info sed" for more sed commands.
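The three steps above can be sketched in a more portable spelling. \s is not part of POSIX regular expressions, so a sed other than GNU sed may not accept it, and some implementations also expect a newline or ";" before the closing brace. The version below uses [[:space:]] and puts each command on its own line; config.xml and the variable server_names are assumed names for illustration.

```shell
# Recreate the sample from the question (assumed file name: config.xml).
cat > config.xml <<'EOF'
<domain>
    <server>
        <name>AdminServer</name>
        <port>1234</port>
    </server>
    <server>
        <name>M1Server</name>
        <port>5678</port>
    </server>
    <machine>
        <name>machine01</name>
    </machine>
    <machine>
        <name>machine02</name>
    </machine>
</domain>
EOF

# On each line containing "<server>": read the next line (n), delete every
# tag together with any whitespace in front of it, and print the remainder.
server_names=$(sed -n '/<server>/{
n
s/[[:space:]]*<[^>]*>//gp
}' config.xml)

printf '%s\n' "$server_names"
```

This approach assumes that <name> is always on the line immediately after <server>; if another element came first, that element's text would be printed instead.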

【Solution 5】:

sed alone is enough to get the desired output:

sed -n 's:.*<name>\(.*\)</name>.*:\1:p' config.xml
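One caveat: the substitution above has no address, so on the question's sample it prints every <name> value, including machine01 and machine02. A possible adjustment, shown here as a sketch rather than as the answer's own method, is to put the range address /<server>/,/<\/server>/ in front of the same substitution. The variable names all_names and server_names are illustrative.

```shell
# Recreate the sample from the question (file name taken from the question).
cat > config.xml <<'EOF'
<domain>
    <server>
        <name>AdminServer</name>
        <port>1234</port>
    </server>
    <server>
        <name>M1Server</name>
        <port>5678</port>
    </server>
    <machine>
        <name>machine01</name>
    </machine>
    <machine>
        <name>machine02</name>
    </machine>
</domain>
EOF

# Without an address the substitution is tried on every line of the file.
all_names=$(sed -n 's:.*<name>\(.*\)</name>.*:\1:p' config.xml)

# With a range address it is tried only between <server> and </server>.
server_names=$(sed -n '/<server>/,/<\/server>/s:.*<name>\(.*\)</name>.*:\1:p' config.xml)

printf 'all names:\n%s\n' "$all_names"
printf 'server names only:\n%s\n' "$server_names"
```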

【Solution 6】:

I feel dirty parsing XML in awk.

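The following finds entries at the correct depth with the correct tag name. It does not validate the path, although it does depend on the elements you specify. While this works with your sample data, it makes certain ugly assumptions and is not guaranteed to work elsewhere: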

awk -F'[<>]' '$2~/^(domain|server|name)$/{n++} n==3&&$2=="name"{print $3} $0~/<\/(domain|server|name)>/{n--}' input.xml

A better solution would be to parse the XML itself.

$ awk -F'[<>]' -v check="domain.server.name" '$2~/^[a-z]/ { path=path "." $2 } substr(path,2)==check { print path " = " $3 } $0~/<\// { sub(/\.[^.]+$/,"",path) }' input.xml
.domain.server.name = AdminServer
.domain.server.name = M1Server

Here it is split out for easier commenting.

$ awk -F'[<>]' -v check="domain.server.name" '
  # Split fields around pointy brackets. Supply a path to check.

  $2~/^[a-z]/ {                 # If we see an open tag,
    path=path "." $2            # append the current tag to our path.
  }

  substr(path,2)==check {       # If we match the given path,
    print path " = " $3         # print the result.
  }

  $0~/<\// {                    # If the line holds a close tag,
    sub(/\.[^.]+$/,"",path)     # drop the last component of the path.
  }

' input.xml

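Note that if you feed this malformed XML, this solution will fail badly. The tag recognition could be improved, but it may be good enough if your XML is consistently formatted. It may also barf horribly for other reasons. Don't do this. Install the right tools to parse XML properly.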
