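【Question title】:Linux Filter Big log File for reporting
【Posted】:2015-10-20 04:18:27
【Question】:

I have a big log made up of more than 26,000 files, and each file has content like the sample below. I need to exclude every line that has a 404 status for a JSON file. In the case below I need to get the last line, because that is the one with a 404 that is NOT JSON. Any help writing a filter regular expression? Thanks to the Linux gurus for the help.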

#Version: 1.0

#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type

2015-07-28 11:34:57 MAD50 658 124.13.170.152 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%(00OSD:25202014%25 %252032%2520;SD) - - Error tdlmnsfrOCxOelbe82y3kIp_QfbBF7S3dDCn4rHR65JOMkOtZu4dzA== pdl.astro.com.my http 151 0.004 - - - Error 2015-07-28 11:34:53 SIN3 659 14.192.214.93 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404-NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2520%2520%2520%2014%2520%2 2520;SD) - - Error 5r0xsHnxLY5TePeJ6ZfKvuHrhQnbd2lbWtDQosEXLj4Z7TZ5N68ZhA== pdl.astro.com.my http 151 0.002 - - - Error 2015-07-28 11:34:53 SIN3 659 14.192.213.198 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2320:2014%23%20D 2520;SD) - - Error koGGTK2mc2dDS3XvABS0zAeqheH52toNmJgIqAh5A0TYKIZL6qsgRw== pdl.astro.com.my http 151 0.001 - - - Error 2015-07-28 11:34:54 SIN3 659 14.192.208.27 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%25202014%2520%2520%2520%2014%2520%2014%2520%2 2520;SD) - - Error bvLIe540oNMCeZ0QpOmX1OKoClgNgvSWppGuOmgVS85WnAXKJ1ryDg== pdl.astro.com.my http 151 0.002 - - - Error 2015-07-28 11:34:54 SIN3 659 210.19.26.33 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%252%20(OSD: 2520;SD) - - Error 6Wl5xeCZArNN3WGaIGOA6XjUqZHEiENbWOmChiMZPayefDuLtC8WrA== pdl.astro.com.my http 151 0.001 - - - Error 2015-07-28 11:34:54 SIN3 659 121.121.62.92 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2520%2520%2520%2014%2520%2 2520;SD) - - Error WLn7heBO3PvvVW1vt365EVXqoD440Byy6Qh6RYYazSyPBZUxwsS0Jg== pdl.astro.com.my http 151 0.001 - - - Error 2015-07-28 11:34:54 SIN3 659 14.192.213.9 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%252%20(OSD: 2520;SD) - - Error
hTbk9HE5nyFSla1DmeC1D1jhuMtoUY6E7QQvyf0v1YyJ1GBp-I40bw== pdl.astro.com.my http 151 0.001 - - - Error 2015-07-28 11:34:55 SIN3 659 14.192.213.250 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2520:2014%2320:2014%2520( 2520;HD) - - Error avWgysZyGeGXdVxZHLfP5uLJ4ie5Hx8pa6ZJC5GHXfvOkyEXXp8o0g== pdl.astro.com.my http 151 0.001 - - - Error 2015-07-28 11:34:55 SIN3 659 14.192.211.78 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%25202014%2520%2520%2520%2014%2520%2 2520;SD) - - Error wBepjCn58o9AiTifvtrCprkjdAdg--zsLTsjDpUBkxnEU5tahmJxxQ== pdl.astro.com.my http 151 0.004 - - - Error 2015-07-28 11:34:55 SIN3 659 121.121.101.4 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2520%2014%2520%2OSD:%2520%2 2520;SD) - - Error YZ07B5vu7L4I3aoTcBXF5rcH8Dwrv5a77xRqqelkQqvQhYLDnkrKWg== pdl.astro.com.my http 151 0.001 - - - Error 2015-07-28 11:34:55 SIN3 659 14.192.208.156 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404 - NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2320:2014%2320:2014%2320( 2520;SD) - - Error pbmzjYvLFIlLeth6mN2Yox9DH4vap1hcFHuJgNosd0XHVSxGdRcrWw== pdl.astro.com.my http 151 0.004 - - - Error 2015-07-28 11:34:55 SIN3 659 14.192.213.22 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.json 404

  • Error pbmzjYvLFIlLeth6mN2Yox9DH4vap1hcFHuJgNosd0XHVSxGdRcrWw== pdl.astro.com.my http 151 0.004 - - - Error 2015-07-28 11:34:55 SIN3 659 14.192.213.22 GET d2v2sjgehuhalt.cloudfront.net /thumbnail/mediaInfo_211.jpg 404

【Comments】:

    Tags: regex linux logging analysis


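    【Solution 1】:

    Please read how to ask. Your question is off topic, and you did not provide code; it is not really about coding and might fit better on serverfault.

    If you want to parse big HTTP logs, you should use visitors. If you want JSON output, since this community is about coding, you could extend it to do that.

    Otherwise, for your original question, here is one way to do it with awk: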

    awk '$NF == 404 && $(NF -1) ~ /\.json$/ { next; } {print}' /path/to/yourfile.log
    
    $NF == 404  # the last field is 404
    $(NF -1)    # the field before the last
    ~ /\.json$/ # ends with .json
    { next; }   # skip this line
    { print }   # print anything else
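
The one-liner above keys on the last two fields, which fits the sample as pasted in the question. As a hedged variant, here is a minimal sketch that keys on field positions instead, assuming each record sits on its own line in the order given by the Fields header (date and time count as fields 1 and 2, so cs-uri-stem is $8 and sc-status is $9). The function name `filter_404_json` and the paths in the usage comment are made up for illustration; dropping the `#` header lines is my choice, not part of the original answer.

```shell
#!/bin/sh
# Sketch only: drop records whose status is 404 and whose URI stem ends in .json.
# Assumption: one record per line, fields in "#Fields:" header order, so that
# $8 is cs-uri-stem and $9 is sc-status. Default awk splitting handles both
# tab- and space-separated records, since the first nine fields hold no spaces.
filter_404_json() {
    awk '
        /^#/ { next }                          # skip header lines starting with "#" (assumed unwanted in the report)
        $9 == 404 && $8 ~ /\.json$/ { next }   # skip 404 responses for .json URIs
        { print }                              # print every other record
    ' "$@"
}

# Hypothetical usage over many files (both paths are placeholders):
# find /path/to/logs -type f -exec cat {} + | filter_404_json > /path/to/report.txt
```

With no file arguments the function reads standard input, so it can sit at the end of a pipeline; with file arguments it behaves like the original command.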
    

    【Comments】:
