How to filter {"foo":"bar", "bar": "foo"} with grok to get only the foo field?

【Question title】: How to filter {"foo":"bar", "bar": "foo"} with grok to get only the foo field?
【Posted】: 2017-04-07 23:06:07
【Question description】:

I copied

{"name":"myapp","hostname":"banana.local","pid":40161,"level":30,"msg":"hi","time":"2013-01-04T18:46:23.851Z","v":0}

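from https://github.com/trentm/node-bunyan and saved it as my logs.json. I am trying to import only two fields (name and msg) into ElasticSearch through LogStash. The problem is that this depends on some kind of filter that I have not managed to write. I have successfully imported such a line as a single message, but that is not useful in my real case.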

That said, how can I import only name and msg into ElasticSearch? I tested several alternatives with http://grokdebug.herokuapp.com/ to arrive at a useful filter, with no success at all.

For example, %{GREEDYDATA:message} captures the whole line as a single message, but how do I split it and ignore every field other than name and msg?

In the end, I intend to use this:

input {
    file {
        type => "my_type"
        path => [ "/home/logs/logs.log" ]
        codec => "json"
    }   
}

filter {     

   grok {
            match => { "message" => "data=%{GREEDYDATA:request}"}        
        }   
#### some extra lines here probably
} 

output
{ 
  elasticsearch {
    codec => json
    hosts => "http://127.0.0.1:9200"
    index => "indextest"
  }

    stdout { codec => rubydebug }
}

【Question discussion】:

Tags: elasticsearch elastic-stack logstash-grok grok

【Solution 1】:

I just went through the list of available Logstash filters. The prune filter should fit your needs.

Assuming you have the prune filter installed, your configuration file should look like this:

input {
  file {
    type => "my_type"
    path => [ "/home/logs/logs.log" ]
    codec => "json"
  }
}

filter {
  prune {
    whitelist_names => [
      "@timestamp",
      "type",
      "name",
      "msg"
    ]
  }
} 

output { 
  elasticsearch {
    codec => json
    hosts => "http://127.0.0.1:9200"
    index => "indextest"
  }

  stdout { codec => rubydebug }
}

Note that you need to keep type so that Elasticsearch indexes the document under the correct type. You also need @timestamp if you want to view the data in Kibana.
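One caveat worth checking: as far as I understand the prune filter documentation, the entries in whitelist_names are treated as regular expressions matched against top-level field names, not as exact names. If that holds, an unanchored "name" would also match hostname from the sample line. Below is a minimal sketch of the same filter block with anchored patterns. It assumes that regex behavior, so verify it against your Logstash version:

```
filter {
  prune {
    # Assumption: each entry is matched as a regular expression against
    # top-level field names, so ^...$ is used to require an exact match.
    whitelist_names => [
      "^@timestamp$",
      "^type$",
      "^name$",
      "^msg$"
    ]
  }
}
```

The input and output sections stay as they are. Since the file input already uses the json codec, the fields should arrive parsed, and the stdout { codec => rubydebug } output is a convenient way to confirm which fields survive.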

【Discussion】: