【Question title】: Parsing Snort Alert File with Regex
【Posted】: 2016-07-03 22:15:13
【Question】:

I'm trying to use a regular expression in Python to parse the source and destination (IP and port) and the timestamp out of a Snort alert file. A sample line:

03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 172.16.116.194:28692 -> 205.181.112.65:80

I have a regex for the IP, but it doesn't match because of the port attached to the IP. How can I separate the port from the IP?

^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$

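【Question discussion】:

  • Remove the ^ and $ anchors and try again; the pattern will then capture the IP.
  • New scenario: what if there is no port? For example: 03/09-15:32:15.537934 [**] [1:2100366:8] GPL ICMP_INFO PING *NIX [**] [Classification: Misc activity] [Priority: 3] {ICMP} 172.16.114.50 -> 172.16.114.148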
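
For the port-less case raised in the last comment, one possible approach is to wrap the ":port" part in an optional non-capturing group, so the port groups simply come back as None when no port is present. This is a minimal sketch that assumes alert lines look like the two samples in this question; the name ENDPOINTS is made up for illustration.

```python
import re

# Hypothetical pattern name. Each ":port" is optional, so ICMP-style
# lines without ports still match and the port groups are None.
ENDPOINTS = re.compile(
    r'(\d{1,3}(?:\.\d{1,3}){3})(?::(\d{1,5}))?'   # source IP, optional port
    r'\s+->\s+'
    r'(\d{1,3}(?:\.\d{1,3}){3})(?::(\d{1,5}))?'   # destination IP, optional port
)

tcp_line = ('03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] '
            '[Classification: A Network Trojan was detected] [Priority: 1] '
            '{TCP} 172.16.116.194:28692 -> 205.181.112.65:80')
icmp_line = ('03/09-15:32:15.537934 [**] [1:2100366:8] GPL ICMP_INFO PING *NIX [**] '
             '[Classification: Misc activity] [Priority: 3] '
             '{ICMP} 172.16.114.50 -> 172.16.114.148')

tcp_groups = ENDPOINTS.search(tcp_line).groups()
icmp_groups = ENDPOINTS.search(icmp_line).groups()
print(tcp_groups)   # source IP, source port, destination IP, destination port
print(icmp_groups)  # the two port entries are None here
```

Code that consumes the result then has to check the port entries for None before converting them with int().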

Tags: python regex text-processing snort


【Solution 1】:

This should extract the necessary parts from the whole line:

r'([0-9:./-]+)\s+.*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\s+->\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})'

See this example:

In [22]: line = '03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 172.16.116.194:28692 -> 205.181.112.65:80'

In [23]: m = re.match(r'([0-9:./-]+)\s+.*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\s+->\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})', line)

In [24]: m.group(1)
Out[24]: '03/09-14:10:43.323717'

In [25]: m.group(2)
Out[25]: '172.16.116.194'

In [26]: m.group(3)
Out[26]: '28692'

In [27]: m.group(4)
Out[27]: '205.181.112.65'

In [28]: m.group(5)
Out[28]: '80'

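【Discussion】:

  • Great! Splitting the time out into its own entity is just another group, right?
  • Right, just change ([0-9:./-]+) to ([0-9/]+)-([0-9:.]+).
  • The only thing left is to strip the microseconds from the timestamp. I thought I could do that with strftime, but it doesn't work the way I want because the format of the input string doesn't match the output format.
  • It reads from a text file. What if one of the group fields returns nothing? For example, some IPs have no port associated with them. I get a NoneType error whenever I hit one of those.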
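
The last two comments can be sketched together. The NoneType error most likely comes from calling .group() on a failed match, because re.match returns None when a line does not fit the pattern. The microseconds can be dropped by first parsing the time with strptime (using a format that includes %f) and then formatting it with strftime. The sketch below is an illustration under those assumptions, not the answerer's code: parse_alert and ALERT are hypothetical names, and the optional-port groups are an addition to the pattern above.

```python
import re
from datetime import datetime

# Variant of the pattern above: date and time are separate groups,
# and each ":port" is optional (an assumption added for port-less lines).
ALERT = re.compile(
    r'([0-9/]+)-([0-9:.]+)\s+.*?'
    r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::(\d{1,5}))?\s+->\s+'
    r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::(\d{1,5}))?'
)

def parse_alert(line):
    """Return a dict of parsed fields, or None if the line does not match."""
    m = ALERT.match(line)
    if m is None:              # guard against the NoneType error
        return None
    date, time, src_ip, src_port, dst_ip, dst_port = m.groups()
    # Parse with the input format (including microseconds), then
    # re-format without them.
    stamp = datetime.strptime(time, '%H:%M:%S.%f')
    return {
        'date': date,
        'time': stamp.strftime('%H:%M:%S'),
        'src': (src_ip, src_port),   # port is None when absent
        'dst': (dst_ip, dst_port),
    }

tcp = parse_alert('03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] '
                  '[Classification: A Network Trojan was detected] [Priority: 1] '
                  '{TCP} 172.16.116.194:28692 -> 205.181.112.65:80')
icmp = parse_alert('03/09-15:32:15.537934 [**] [1:2100366:8] GPL ICMP_INFO PING *NIX [**] '
                   '[Classification: Misc activity] [Priority: 3] '
                   '{ICMP} 172.16.114.50 -> 172.16.114.148')
bad = parse_alert('not an alert line')
print(tcp, icmp, bad, sep='\n')
```

When reading a file, lines for which parse_alert returns None can simply be skipped or logged.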
【Solution 2】:

You can split them into separate capture groups like this:

(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})

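Dropping the ^ and $ anchors also lets the pattern match in the middle of a line rather than only when the whole line is an IP address.

【Discussion】: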
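
To illustrate the difference, the following short sketch runs both the anchored pattern from the question and the unanchored, grouped pattern against the sample alert line. The variable names are chosen for this example only.

```python
import re

line = ('03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] '
        '[Classification: A Network Trojan was detected] [Priority: 1] '
        '{TCP} 172.16.116.194:28692 -> 205.181.112.65:80')

# Anchored: only matches if the entire string is a bare IP address.
anchored = re.search(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$', line)

# Unanchored with two groups: finds every ip:port pair anywhere in the line.
pairs = re.findall(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})', line)

print(anchored)  # None, because the line contains more than an IP
print(pairs)     # one (ip, port) tuple per endpoint
```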

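    【Solution 3】:

    If I understand correctly, you want to capture the IP and the port separately, right?

    In that case, using groups in the regular expression solves your problem: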

    result = re.search(r'((\d{1,3}\.){3}\d{1,3}):(\d{1,5})', input)
    

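    Now result.group(1) contains the IP address and result.group(3) contains the port. Group 2 is the inner repeated group (\d{1,3}\.), which is why the port ends up in group 3.

    【Discussion】: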
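
The group numbering here is easy to trip over, so a small sketch may help. It shows what each numbered group holds for this pattern and, as an alternative that is not part of the original answer, a variant with a non-capturing inner group and named groups (the names ip and port are chosen for this example).

```python
import re

text = '172.16.116.194:28692'

result = re.search(r'((\d{1,3}\.){3}\d{1,3}):(\d{1,5})', text)
ip = result.group(1)       # the whole IP address
inner = result.group(2)    # last repetition of the inner group, e.g. '116.'
port = result.group(3)     # the port

# Alternative: (?:...) keeps the inner group from being numbered,
# and (?P<name>...) lets the code refer to groups by name.
named = re.search(r'(?P<ip>(?:\d{1,3}\.){3}\d{1,3}):(?P<port>\d{1,5})', text)

print(ip, inner, port)
print(named.group('ip'), named.group('port'))
```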

      【Solution 4】:

      Description

      ^((?:[0-9]{2}[-\/:.]){5}[0-9]{6}).*[{]TCP[}]\s*(((?:[0-9]{1,3}[.]){1,3}[0-9]{1,3}):([0-9]{1,6}))\s*->\s*(((?:[0-9]{1,3}[.]){1,3}[0-9]{1,3}):([0-9]{1,6}))
      


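      This regular expression will do the following:

      • Capture the timestamp into capture group 1
      • Capture the source IP address and port into capture groups 2, 3, and 4
      • Capture the destination IP address and port into capture groups 5, 6, and 7
      • Require the source and destination IPs to be preceded by {TCP}, in case the message text also contains an IP address.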
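
The expression above can be tried directly in Python. The sketch below is only an illustration of how the seven groups come back from re.match; PATTERN is a name chosen for this example. Because the pattern requires the literal {TCP} and a port on both endpoints, lines such as the {ICMP} sample from the question comments do not match and yield None.

```python
import re

# The expression from this answer, split across lines for readability.
PATTERN = re.compile(
    r'^((?:[0-9]{2}[-\/:.]){5}[0-9]{6}).*[{]TCP[}]\s*'
    r'(((?:[0-9]{1,3}[.]){1,3}[0-9]{1,3}):([0-9]{1,6}))\s*->\s*'
    r'(((?:[0-9]{1,3}[.]){1,3}[0-9]{1,3}):([0-9]{1,6}))'
)

line = ('03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] '
        '[Classification: A Network Trojan was detected] [Priority: 1] '
        '{TCP} 172.16.116.194:28692 -> 205.181.112.65:80')
icmp_line = ('03/09-15:32:15.537934 [**] [1:2100366:8] GPL ICMP_INFO PING *NIX [**] '
             '[Classification: Misc activity] [Priority: 3] '
             '{ICMP} 172.16.114.50 -> 172.16.114.148')

m = PATTERN.match(line)
timestamp, src, src_ip, src_port, dst, dst_ip, dst_port = m.groups()
print(timestamp, src_ip, src_port, dst_ip, dst_port)

no_match = PATTERN.match(icmp_line)  # None: no {TCP} and no ports
```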

      Example

      Live Demo

      https://regex101.com/r/hD4fW8/1

      Sample text

      03/09-14:10:43.323717  [**] [1:2008015:9] ET MALWARE User-Agent (Win95) [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 172.16.116.194:28692 -> 205.181.112.65:80
      

      Sample Matches

      MATCH 1
      1.  [0-21]  `03/09-14:10:43.323717`
      2.  [145-165]   `172.16.116.194:28692`
      3.  [145-159]   `172.16.116.194`
      4.  [160-165]   `28692`
      5.  [169-186]   `205.181.112.65:80`
      6.  [169-183]   `205.181.112.65`
      7.  [184-186]   `80`
      

      Explanation

      NODE                     EXPLANATION
      ----------------------------------------------------------------------
        ^                        the beginning of the string
      ----------------------------------------------------------------------
        (                        group and capture to \1:
      ----------------------------------------------------------------------
          (?:                      group, but do not capture (5 times):
      ----------------------------------------------------------------------
            [0-9]{2}                 any character of: '0' to '9' (2 times)
      ----------------------------------------------------------------------
            [-\/:.]                  any character of: '-', '\/', ':', '.'
      ----------------------------------------------------------------------
          ){5}                     end of grouping
      ----------------------------------------------------------------------
          [0-9]{6}                 any character of: '0' to '9' (6 times)
      ----------------------------------------------------------------------
        )                        end of \1
      ----------------------------------------------------------------------
        .*                       any character except \n (0 or more times
                                 (matching the most amount possible))
      ----------------------------------------------------------------------
        [{]                      any character of: '{'
      ----------------------------------------------------------------------
        TCP                      'TCP'
      ----------------------------------------------------------------------
        [}]                      any character of: '}'
      ----------------------------------------------------------------------
        \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                                 more times (matching the most amount
                                 possible))
      ----------------------------------------------------------------------
        (                        group and capture to \2:
      ----------------------------------------------------------------------
          (                        group and capture to \3:
      ----------------------------------------------------------------------
            (?:                      group, but do not capture (between 1
                                     and 3 times (matching the most amount
                                     possible)):
      ----------------------------------------------------------------------
              [0-9]{1,3}               any character of: '0' to '9'
                                       (between 1 and 3 times (matching the
                                       most amount possible))
      ----------------------------------------------------------------------
              [.]                      any character of: '.'
      ----------------------------------------------------------------------
            ){1,3}                   end of grouping
      ----------------------------------------------------------------------
            [0-9]{1,3}               any character of: '0' to '9' (between
                                     1 and 3 times (matching the most
                                     amount possible))
      ----------------------------------------------------------------------
          )                        end of \3
      ----------------------------------------------------------------------
          :                        ':'
      ----------------------------------------------------------------------
          (                        group and capture to \4:
      ----------------------------------------------------------------------
            [0-9]{1,6}               any character of: '0' to '9' (between
                                     1 and 6 times (matching the most
                                     amount possible))
      ----------------------------------------------------------------------
          )                        end of \4
      ----------------------------------------------------------------------
        )                        end of \2
      ----------------------------------------------------------------------
        \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                                 more times (matching the most amount
                                 possible))
      ----------------------------------------------------------------------
        ->                       '->'
      ----------------------------------------------------------------------
        \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                                 more times (matching the most amount
                                 possible))
      ----------------------------------------------------------------------
        (                        group and capture to \5:
      ----------------------------------------------------------------------
          (                        group and capture to \6:
      ----------------------------------------------------------------------
            (?:                      group, but do not capture (between 1
                                     and 3 times (matching the most amount
                                     possible)):
      ----------------------------------------------------------------------
              [0-9]{1,3}               any character of: '0' to '9'
                                       (between 1 and 3 times (matching the
                                       most amount possible))
      ----------------------------------------------------------------------
              [.]                      any character of: '.'
      ----------------------------------------------------------------------
            ){1,3}                   end of grouping
      ----------------------------------------------------------------------
            [0-9]{1,3}               any character of: '0' to '9' (between
                                     1 and 3 times (matching the most
                                     amount possible))
      ----------------------------------------------------------------------
          )                        end of \6
      ----------------------------------------------------------------------
          :                        ':'
      ----------------------------------------------------------------------
          (                        group and capture to \7:
      ----------------------------------------------------------------------
            [0-9]{1,6}               any character of: '0' to '9' (between
                                     1 and 6 times (matching the most
                                     amount possible))
      ----------------------------------------------------------------------
          )                        end of \7
      ----------------------------------------------------------------------
        )                        end of \5
      ----------------------------------------------------------------------
      

      【Discussion】:
