Parsing log values with a shell script: answers

[Question title]: Parsing log values with shell script
[Posted]: 2014-01-25 01:11:35
[Question description]:

I am trying to write a shell script that parses values out of lines grepped from a log:

 <WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 1111' is driving to: Canada>
 <WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 2222' is driving to: Mexico>
 <WhereIsTheCar - no car could be found with the following ID number: 'Sys Generated. VARIABLESTRING 3333'>

I have already found these lines and stored them in an array. From there, I would like to get output similar to the following:

Canada
    Sys Generated. VARIABLESTRING 1111

Mexico
    Sys Generated. VARIABLESTRING 2222

Not Found
    Sys Generated. VARIABLESTRING 3333

Admittedly, I am not good at writing shell scripts, but I came up with a somewhat "brute force" way to get the values I want:

i=0
for line in "${grep[@]}"
do
    loc[i]=`sed -e "s/.*\:\(.*\)>/\1/" <<< $line | sed -e "s/^[ \t]*//" -e "s/[ \t]*$//" -e "s/^\([\"']\)\(.*\)\1\$/\2/g"`
    echo ${loc[i]};
    id[i]=`sed -e "s/^.*\'\(.*\)\'.*$/\1/" <<< $line | sed -e "s/^[ \t]*//" -e "s/[ \t]*$//" -e "s/^\([\"']\)\(.*\)\1\$/\2/g"`
    echo ${id[i]};
    let i++
done

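Here I build the location and id arrays, then try to trim off the whitespace and the extra quotes. I think I can finish it from here, but I would like to know whether anyone has a more elegant (or more suitable) approach. Any input would be greatly appreciated.

[Question comments]:

Tags: bash shell sed

[Solution 1]:

Another possibility is to use BASH_REMATCH in bash instead of awk or sed: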

   BASH_REMATCH
          An  array  variable  whose members are assigned by the =~ binary
          operator to the [[ conditional command.  The element with  index
          0  is  the  portion  of  the  string matching the entire regular
          expression.  The element with index n  is  the  portion  of  the
          string matching the nth parenthesized subexpression.  This vari‐
          able is read-only.
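
To make the indices in that excerpt concrete, here is a minimal sketch. The sample line is copied from the question; the variable name `re`, the helper name `show_groups`, and the pattern itself are only illustrative assumptions, not part of the original answer.

```shell
#!/bin/bash
# Sample line modeled on the log format shown in the question.
line="<WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 1111' is driving to: Canada>"

# Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
# Group 1 captures the text between the single quotes, group 2 the destination.
re="'([^']*)' is driving to: ([^>]*)>"

# Hypothetical helper: print the whole match and each captured group.
show_groups() {
  if [[ $1 =~ $re ]]; then
    printf 'whole match: %s\n' "${BASH_REMATCH[0]}"
    printf 'group 1:     %s\n' "${BASH_REMATCH[1]}"
    printf 'group 2:     %s\n' "${BASH_REMATCH[2]}"
  else
    echo "no match"
  fi
}

show_groups "$line"
```

Index 0 holds everything the pattern matched, while indices 1 and 2 hold only the parenthesized parts, which is why the script in this answer prints `${BASH_REMATCH[1]}`.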

So this should work for you:

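#!/bin/bash
# Print the destination (or "Not Found"), then the quoted ID indented by a tab.
while IFS= read -r line; do
  if [[ $line =~ "is driving to: "(.*)">" ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    echo "Not Found"
  fi
  [[ $line =~ \'(.*)\' ]] && printf '\t%s\n\n' "${BASH_REMATCH[1]}"
done < "file"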

Sample output:

> ./abovescript
Canada
    Sys Generated. VARIABLESTRING 1111

Mexico
    Sys Generated. VARIABLESTRING 2222

Not Found
    Sys Generated. VARIABLESTRING 3333

[Comments]:

[Solution 2]:

This would be easier with awk:

awk -F"('|driving to: |>)" '{printf "%s\n\t%s\n\n", NF==5?$4:"Not Found",$2;next}' file

Tested with your data:

kent$  cat f
<WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 1111' is driving to: Canada>
<WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 2222' is driving to: Mexico>
<WhereIsTheCar - no car could be found with the following ID number: 'Sys Generated. VARIABLESTRING 3333'>

kent$  awk -F"('|driving to: |>)" '{printf "%s\n\t%s\n\n", NF==5?$4:"Not Found",$2;next}' f
Canada
        Sys Generated. VARIABLESTRING 1111

Mexico
        Sys Generated. VARIABLESTRING 2222

Not Found
        Sys Generated. VARIABLESTRING 3333
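
The `NF==5` test relies on how the separator regex splits each line. A rough sketch that makes the split visible, reusing the same `-F` value; the helper name `show_fields` is hypothetical and only for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the field count and every field produced by the
# same separator regex used in the answer above.
show_fields() {
  awk -F"('|driving to: |>)" '{
    printf "NF=%d\n", NF
    for (i = 1; i <= NF; i++) printf "  $%d=[%s]\n", i, $i
  }'
}

# Two lines taken from the question: one "found" case and one "not found" case.
printf '%s\n' \
  "<WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 1111' is driving to: Canada>" \
  "<WhereIsTheCar - no car could be found with the following ID number: 'Sys Generated. VARIABLESTRING 3333'>" |
  show_fields
```

A "driving to" line splits into five fields, with the ID in `$2` and the destination in `$4`. A "not found" line has no `driving to: ` separator, so it yields only four fields and the one-liner falls back to "Not Found".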

[Comments]:

[Solution 3]:

Using sed:

sed -nr "/driving to/ s/.*'([^']+)'.*:(.*)>/\2\n\t\1/p; /no car could be found/ s/.*'([^']+)'.*/ Not Found\n\t\1/p" file

 Canada
        Sys Generated. VARIABLESTRING 1111
 Mexico
        Sys Generated. VARIABLESTRING 2222
 Not Found
        Sys Generated. VARIABLESTRING 3333

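Explanation:

The command is split into two parts and processes the input file directly, so no loop is needed.

Tip: wrap the sed script in double quotes when it has to handle single quotes.

/driving to/ s/.*'([^']+)'.*:(.*)>/\2\n\t\1/p handles the lines where the car was found, and /no car could be found/ s/.*'([^']+)'.*/ Not Found\n\t\1/p handles the lines where no car was found.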
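
For readability, the two parts can also be passed as separate `-e` expressions. The following is only a sketch: it assumes GNU sed (the `\n` and `\t` escapes in the replacement are GNU extensions), the function name `parse_cars` is hypothetical, and the patterns are slightly changed from the one-liner so that the space before the destination and before "Not Found" is dropped.

```shell
#!/bin/bash
# Hypothetical wrapper around the two substitutions, one per -e expression.
# Assumes GNU sed: \n and \t in the replacement text are GNU extensions.
# Double quotes around each script let the single quotes appear literally.
parse_cars() {
  sed -nE \
    -e "/driving to/ s/.*'([^']+)'.*: *([^>]*)>.*/\2\n\t\1/p" \
    -e "/no car could be found/ s/.*'([^']+)'.*/Not Found\n\t\1/p" \
    "$@"
}

# Feed the sample lines from the question on standard input.
parse_cars <<'EOF'
<WhereIsTheCar - the car with id number 'Sys Generated. VARIABLESTRING 1111' is driving to: Canada>
<WhereIsTheCar - no car could be found with the following ID number: 'Sys Generated. VARIABLESTRING 3333'>
EOF
```

Because the function forwards `"$@"` to sed, it can read either standard input, as here, or a file name given as an argument.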

[Comments]: