【Question title】:Importing a JSON file into Elasticsearch
【Posted】:2021-12-27 11:49:33
【Question description】:

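Guys, I am new to Elasticsearch and I want to import a JSON file into Elasticsearch/Kibana. The JSON file is the output of a Wireshark pcap traffic capture. As far as I know, the file needs a mapping before it can be imported into Elasticsearch with _bulk (curl). I have read some examples of how to do the mapping, such as the document below, but it is still not clear to me how to do this with my own file: https://www.elastic.co/blog/analyzing-network-packets-with-wireshark-elasticsearch-and-kibana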

My JSON file looks like this:

[
  {
    "_index": "packets-2021-10-15",
    "_type": "doc",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "frame.interface_id": "0",
          "frame.interface_id_tree": {
            "frame.interface_name": "\\Device\\NPF_{9751C4A6-3584-467A-81DB-8E9E881967C3}",
            "frame.interface_description": "Ethernet"
          },
          "frame.encap_type": "1",
          "frame.time": "Oct 26, 2021 14:08:42.112764000 W. Europe Summer Time",
          "frame.offset_shift": "0.000000000",
          "frame.time_epoch": "1635250122.112764000",
          "frame.time_delta": "0.006664000",
          "frame.time_delta_displayed": "0.000000000",
          "frame.time_relative": "15.640148000",
          "frame.number": "18",
          "frame.len": "241",
          "frame.cap_len": "241",
          "frame.marked": "0",
          "frame.ignored": "0",
          "frame.protocols": "eth:ethertype:ip:tcp:tpkt:cotp:ses:pres:acse:mms",
          "frame.coloring_rule.name": "TCP",
          "frame.coloring_rule.string": "tcp"
        },
        "eth": {
          "eth.dst": "00:00:96:14:19:00",
          "eth.dst_tree": {
            "eth.dst_resolved": "MarconiE_14:19:00",
            "eth.dst.oui": "150",
            "eth.dst.oui_resolved": "Marconi Electronics Ltd.",
            "eth.addr": "00:00:96:14:19:00",
            "eth.addr_resolved": "MarconiE_14:19:00",
            "eth.addr.oui": "150",
            "eth.addr.oui_resolved": "Marconi Electronics Ltd.",
            "eth.dst.lg": "0",
            "eth.lg": "0",
            "eth.dst.ig": "0",
            "eth.ig": "0"
          },
          "eth.src": "20:47:47:b5:f2:62",
          "eth.src_tree": {
            "eth.src_resolved": "Dell_b5:f2:62",
            "eth.src.oui": "2115399",
            "eth.src.oui_resolved": "Dell Inc.",
            "eth.addr": "20:47:47:b5:f2:62",
            "eth.addr_resolved": "Dell_b5:f2:62",
            "eth.addr.oui": "2115399",
            "eth.addr.oui_resolved": "Dell Inc.",
            "eth.src.lg": "0",
            "eth.lg": "0",
            "eth.src.ig": "0",
            "eth.ig": "0"
          },
          "eth.type": "0x00000800"
        },
        "ip": {
          "ip.version": "4",
          "ip.hdr_len": "20",
          "ip.dsfield": "0x00000000",
          "ip.dsfield_tree": {
            "ip.dsfield.dscp": "0",
            "ip.dsfield.ecn": "0"
          },
          "ip.len": "227",
          "ip.id": "0x00001a1d",
          "ip.flags": "0x00000040",
          "ip.flags_tree": {
            "ip.flags.rb": "0",
            "ip.flags.df": "1",
            "ip.flags.mf": "0"
          },
          "ip.frag_offset": "0",
          "ip.ttl": "128",
          "ip.proto": "6",
          "ip.checksum": "0x00000000",
          "ip.checksum.status": "2",
          "ip.src": "192.168.1.92",
          "ip.addr": "192.168.1.92",
          "ip.src_host": "192.168.1.92",
          "ip.host": "192.168.1.92",
          "ip.dst": "192.168.1.93",
          "ip.addr": "192.168.1.93",
          "ip.dst_host": "192.168.1.93",
          "ip.host": "192.168.1.93"
        },
        "tcp": {
         ......

I would appreciate it if someone could tell me the best way to do this.

【Question comments】:

    Tags: json elasticsearch mapping kibana wireshark


    【Solution 1】:

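    One nice thing about ES is that in a large number of cases the mapping is done automatically. So you can simply try the import and check whether the types it assigns to your keys are the ones you expect.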

    Basically, you need to build a request that meets the requirements laid out in the ES documentation.

    TL;DR

    curl -X POST "localhost:9200/_bulk?pretty" -H 'Content-Type: application/json' -d'
    { "index" : { "_index" : "test", "_id" : "1" } }
    { "field1" : "value1" }
    { "delete" : { "_index" : "test", "_id" : "2" } }
    { "create" : { "_index" : "test", "_id" : "3" } }
    { "field1" : "value3" }
    { "update" : {"_id" : "1", "_index" : "test"} }
    { "doc" : {"field2" : "value2"} }
    '
    

    This answer may also help.
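
The file in the question is a single JSON array in which every packet is wrapped in `_index` / `_type` / `_source`, whereas `_bulk` expects newline-delimited JSON with an action line in front of each document. A minimal Python sketch of that conversion follows. The function name `to_bulk_ndjson` and the fallback index name `packets` are made up for illustration. `_type` is left out on the assumption that the target cluster is version 7 or later, where mapping types are deprecated.

```python
import json


def to_bulk_ndjson(packets, default_index="packets"):
    """Turn a list of tshark-style packet dicts into a _bulk request body.

    Each packet becomes two lines: an "index" action line, then the document.
    """
    lines = []
    for packet in packets:
        # Reuse the index name stored in the export; fall back to a default.
        action = {"index": {"_index": packet.get("_index", default_index)}}
        # No "_id" is set, so Elasticsearch generates one for each document.
        lines.append(json.dumps(action))
        lines.append(json.dumps(packet["_source"]))
    # Every line, including the last one, must be terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Tiny inline sample shaped like the export in the question.
sample = [
    {
        "_index": "packets-2021-10-15",
        "_type": "doc",
        "_score": None,
        "_source": {"layers": {"ip": {"ip.src": "192.168.1.92"}}},
    }
]
body = to_bulk_ndjson(sample)
print(body, end="")
```

For the real file, the list would come from `json.load()` on `packets.json` and the result would be written to a new file for curl to upload. Note that Python's `json` module keeps only the last value when a key is repeated, as `ip.addr` is in the export above. Newer Wireshark releases also have a `tshark -T ek` output mode which, according to the blog post linked in the question, writes this bulk format directly, so re-exporting the capture may be simpler than converting it.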

    【Comments】:

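    • Thank you very much for your answer. As you can see, there is no _id in the JSON file I shared here, so I don't understand what the id means in this case. I would also like to know whether I can use "--data-binary @packets.json" instead of "-d ..." to import the whole file into Elasticsearch?
    • _id is the internal ES id assigned to each document. You can give it any value you like. For example, you could build a script in which each bulk request uses a Unix timestamp as the document _id. As for the other question, that is more of a curl usage question, but of course you can use --data-binary instead of -d.
    • Thank you, sir. I tried changing the configuration to the form you suggested, but when I run the curl command I get the error curl: (52) Empty reply from server. Do you know what causes this error?
    • Update your question with what you have tried and I will look into it.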
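
On the `--data-binary` point raised in the comments: when curl reads a file with `-d @file` it strips the newlines, which damages a newline-delimited bulk body, while `--data-binary` sends the bytes unchanged. The same upload can be sketched in Python. The helper names `build_bulk_request` and `batches` are hypothetical, the URL is the `localhost:9200` address from the answer, and the batching assumes every document is an `index` action taking exactly two lines. The thread never establishes what caused the `(52)` error, so this is not presented as a fix for it.

```python
import urllib.request


def build_bulk_request(ndjson_body, url="http://localhost:9200/_bulk"):
    """Build (but do not send) a POST request for the _bulk endpoint."""
    return urllib.request.Request(
        url,
        data=ndjson_body.encode("utf-8"),  # bytes are sent as-is, newlines intact
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


def batches(ndjson_body, docs_per_batch=500):
    """Split a bulk body into smaller bodies, keeping action/document pairs together.

    Assumes every document occupies exactly two lines (action + source).
    """
    lines = ndjson_body.splitlines()
    step = docs_per_batch * 2
    for start in range(0, len(lines), step):
        yield "".join(line + "\n" for line in lines[start:start + step])


demo_body = '{"index": {"_index": "test"}}\n{"field1": "value1"}\n'
for chunk in batches(demo_body, docs_per_batch=1):
    request = build_bulk_request(chunk)
    print(request.get_method(), request.full_url, len(request.data), "bytes")
    # To actually send it: urllib.request.urlopen(request).read()
```

Sending in batches keeps each request small, which can make a failing upload easier to narrow down to a particular part of the file.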