【Question title】: logstash filter definition for an extended apache log
【Posted】: 2015-05-06 17:08:57
【Question】:

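I am trying to configure a logstash filter for an extended apache log format. It is basically the "combined" LogFormat with a few additional fields. Here is the apache log format definition: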

LogFormat "%h %{X-LB-Client-IP}i %l %u %m %t \"%{Host}i\" \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D" combinextended

Here is a sample line from the log file:

12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729

I configured logstash-forwarder to ship the files:

{
  "paths": [
    "/var/log/mysite/extended.log",
    "/var/log/myothersite/extended.log" 
  ],
  "fields": { "type": "apache-extended" }
}

On the logstash server, I configured a grok pattern in a file under /etc/logstash/conf.d named 13-apache-extended.conf:

filter {
  if [type] == "apache-extended" {
    grok {
      match => { "message" => "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" }
    }
  }
}

I tested it at https://grokdebug.herokuapp.com/ and it looks fine:

Log sample:

12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729

Pattern:

%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}

But when I restart logstash on the main server, I get this error:

{:timestamp=>"2015-05-06T18:36:28.846000+0200", :message=>"Exception in lumberjack input", :exception=>#<LogStash::ShutdownSignal: LogStash::ShutdownSignal>, :level=>:error}
{:timestamp=>"2015-05-06T18:36:44.342000+0200", :message=>"Error: Expected one of #, {, } at line 35, column 142 (byte 969) after filter {\n  if [type] == \"apache-extended\" {\n    grok {\n      match => { \"message\" => \"%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \\[%{HTTPDATE:timestamp}\\] \""}
{:timestamp=>"2015-05-06T18:36:44.349000+0200", :message=>"You may be interested in the '--configtest' flag which you can\nuse to validate logstash's configuration before you choose\nto restart a running system."}

Any ideas would be much appreciated.

Thanks.

【Comments】:

    Tags: apache logstash logstash-grok logstash-forwarder logstash-configuration


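    【Answer 1】:

    I have tested your problem and have 2 possible solutions for you.

    1. Are you sure your lumberjack configuration is correct? Have you checked it with a basic log file?
    2. Are you sure the pattern in your configuration is the same as the one you posted? I noticed that the pattern you posted and the one in the error output are not the same.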

      %{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}

    !=

    %{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"}
    

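    You can't put special characters into your pattern literally; you need to escape each one with a backslash (\). In a double-quoted config string, an unescaped " ends the string early, and the parser then fails on whatever follows.

    This error ("Error: Expected one of #, {, } at line 35, column 142 (byte 969)") usually means that you have a syntax error at that position, for example because you forgot to escape a special character.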
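
    One way to sidestep the quoting issue is to wrap the whole grok pattern in single quotes, so that the double quotes inside it no longer terminate the string. What follows is only a sketch, assuming your Logstash version accepts single-quoted strings in its config files. The pattern and field names are the ones from the question; only the `\"` escapes have been dropped, since they are not needed inside single quotes.

    ```conf
    filter {
      if [type] == "apache-extended" {
        grok {
          # Single-quoted string: the literal double quotes need no backslash.
          # The square brackets are regex metacharacters and stay escaped.
          match => { "message" => '%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] "%{IPORHOST:host}" "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) "%{GREEDYDATA:referer}" "%{GREEDYDATA:agent}" %{NUMBER:responsetime}' }
        }
      }
    }
    ```

    The startup log in the question already points at the `--configtest` flag, which validates the configuration before you restart a running system.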

    I tested your configuration without lumberjack and everything works.

    Log sample: 12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729

    Configuration:

    input {
      file {
        path => "d:/Git/LogstashELKElision/logstash/bin/log/test.log"
        type => "extendedapache"
      }
    }
    filter {
      if [type] == "extendedapache" {
        grok {
          match => [ "message", "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" ]
        }
      }
    }
    output {
      elasticsearch { hosts => ["localhost:9200"] }
      stdout { codec => rubydebug }
    }
    

    【Comments】:

      【Answer 2】:

      You want to read Apache logs. I think you can try the configuration below.

      input {
        file {
          path => "/var/log/apache2/*.log"
          type => "apache"
          start_position => "beginning"
          sincedb_path => "/dev/null"
        }
      }
      
      filter {
        if [path] =~ "access" {
          mutate { replace => { "type" => "apache_access" } }
          grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
          }
        }
        date {
          match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z"]
        }
      }
      
      output {
        elasticsearch { hosts => localhost }
        stdout { codec => rubydebug }
      }
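
      Note that %{COMBINEDAPACHELOG} expects the standard "combined" format, so it will probably not match the extended lines from the question, which carry two leading addresses, a separate method field, the Host header and a trailing %D duration. A rough sketch of the same idea using the question's own pattern is shown here. It assumes the events arrive with type "apache-extended" as set in the forwarder config, and that grok adds its default `_grokparsefailure` tag on a failed match, so the date filter only runs on lines that were actually parsed.

      ```conf
      filter {
        if [type] == "apache-extended" {
          grok {
            # Pattern taken from the question; single quotes avoid escaping the double quotes.
            match => { "message" => '%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] "%{IPORHOST:host}" "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) "%{GREEDYDATA:referer}" "%{GREEDYDATA:agent}" %{NUMBER:responsetime}' }
          }
          # Skip date parsing for events where grok did not match.
          if "_grokparsefailure" not in [tags] {
            date {
              match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
            }
          }
        }
      }
      ```

      Events that still come out tagged with `_grokparsefailure` on the rubydebug output are a hint that the pattern and the log format have drifted apart.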
      

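      【Comments】:

      • What exactly do you want to see??
      • Just a properly indented config file. Also (although this question is more than 6 months old), it specifically states that the COMBINEDAPACHELOG pattern does not capture the input the log is generating.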