【Question title】: CSV & Logstash & Elasticsearch
【Posted】: 2021-10-07 08:53:53
【Question description】:

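I need help!

I'm new to Elasticsearch. I downloaded a CSV dataset from Kaggle and loaded it into Elasticsearch through Logstash using a config file, but the data is no longer readable once it is in Elasticsearch (it looks like this: \u0000E\u0000l). Do I need to add some kind of conversion to the config file?

Thanks.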

【Comments】:

  • Could you also share your Logstash configuration? That will help others understand your setup.
  • input { file { path => "C:/elastic_stack/data/*.csv" start_position => "beginning" sincedb_path => "NUL" } } filter { csv { separator => ", " columns => ["id","name","location","description","rating","image","link"] } } output { elasticsearch { hosts => "localhost:9200" index => "es_hotels" } }
  • Your file should be encoded as UTF-8; only then will it be indexed correctly. You can also use a codec plugin to convert the data to UTF-8. Reference: elastic.co/guide/en/logstash/current/plugins-codecs-csv.html
  • I tried codec => plain { charset => "UTF-8" } and ruby { code => 'event.set("decoded", Base64.decode64(event.get("message")))' } but it doesn't work.
  • My data is binary.
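The suggestion above to convert the file to UTF-8 can be sketched as a small standalone Ruby script that runs before Logstash picks the file up. It is only a sketch under one assumption: that the CSV is encoded as UTF-16LE, which the thread never confirms. A NUL next to every character is, however, the usual symptom of UTF-16 text being read byte by byte. The helper name `utf16le_to_utf8` and the command-line handling are hypothetical.

```ruby
# Sketch: re-encode a UTF-16LE CSV as UTF-8 so it can be ingested as plain UTF-8.
# Assumption: the source file is UTF-16LE, possibly with a BOM (not confirmed in the thread).

# Hypothetical helper: takes the raw file bytes and returns a UTF-8 string.
def utf16le_to_utf8(raw_bytes)
  # Work on a binary copy, reinterpret it as UTF-16LE, then transcode to UTF-8.
  text = raw_bytes.b.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
  # Drop the byte order mark if the file started with one.
  text.delete_prefix("\uFEFF")
end

# Only run the conversion when called as: ruby <script> <source.csv> <target.csv>
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  src, dst = ARGV
  # Binary read/write avoids newline translation on Windows.
  File.binwrite(dst, utf16le_to_utf8(File.binread(src)))
end
```

Note that the `charset` option of the plain codec names the encoding of the incoming data, so setting it to "UTF-8" tells Logstash that no conversion is needed. Whether setting it to a UTF-16 variant works cleanly with the file input is not something this thread establishes, which is why the sketch converts the file up front instead.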

Tags: csv elasticsearch encoding format logstash


【Answer 1】:

I tried codec => plain { charset => "UTF-8" } and ruby { code => 'event.set("decoded", Base64.decode64(event.get("message")))' } but it didn't work. This is what I get:

    {
        "host" => "HIND",
        "id" => "\u0000g\u0000u\u0000e\u0000s\u0000s\u0000s\u0000s\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000c\u0000a\u0000n\u0000 \u0000r\u0000e\u0000r\u0000a\u0000x\u0000 \u0000i\u0000n\u0000 \u0000t\u0000h\u0000e\u0000 \u0000a\u0000r\u0000d\u000000\u0000n\u0000 \u0000a\u0000t\u0000 \u0000t\u0000h\u0000t\u0000h\u000000\u0000 \u0000p\u000000\u0000o\u0000p\u0000e\u000000p\u0000e\u0000r\u0000t\u0000r\u0000 .\u0000",
        "message" => "\u0000g\u0000u\u0000u\u0000s\u0000t\u0000s\u0000 \u0000c\u0000a\u0000n\u0000a\u0000n\u0000 \u0000r\u0000e\u0000l\u0000a\u0000x\u0000 \u0000i\u0000n\u0000 \u0000t\u0000h\u0000e\u0000 \u0000a\u0000r\u0000d\u000000\u0000n\u0000 \u0000a\u0000t\u0000 \u0000t\u0000h\u0000t\u0000h\u000000\u0000 \u0000p\u000000\u0000o\u0000p\u0000e\u000000p\u0000e\u0000r\u0000t\u0000r\u0000 .\u0000",
        "decoded" => "\x1A\xE7\xAC\xB6\xC7\x1A\x9E\xB7\xA5k\x18\xA7\xB6\x17\xA0j\xB7^\x9D\xABm\x85\xEAk\xA2\x97\xAB\xB7",
        "path" => "C:/elastic_stack/data/hotelsb_df_es.csv",
        "@version" => "1",
        "@timestamp" => 2021-08-04T18:04:46.170Z
    }
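The output above is consistent with a UTF-16LE file being split into lines on the single newline byte: each line then starts with the leftover 0x00 half of the previous newline, and every character appears with a NUL next to it. The Ruby sketch below reproduces that symptom on a sample string and shows one way to undo it. The UTF-16LE assumption, the sample text, and the helper names `simulate_misread` and `realign_utf16le` are all illustrative, not taken from the thread. It also shows why Base64 decoding cannot help: the text is not Base64, and `Base64.decode64` silently skips characters outside the Base64 alphabet, so it decodes the remaining letters into unrelated bytes like those in the "decoded" field.

```ruby
require "base64"

# Hypothetical helper: how a UTF-16LE line looks to a byte-oriented reader that
# split the file on the 0x0A byte and kept the newline's trailing 0x00.
def simulate_misread(text)
  ("\x00".b + text.encode(Encoding::UTF_16LE).b).force_encoding(Encoding::UTF_8)
end

# Hypothetical helper: heuristic repair for such a line. Drops the stray leading
# NUL byte, trims to an even byte count, then decodes the rest as UTF-16LE.
def realign_utf16le(raw)
  bytes = raw.b
  bytes = bytes.byteslice(1..-1) if bytes.start_with?("\x00".b)
  bytes = bytes.byteslice(0, bytes.bytesize - 1) if bytes.bytesize.odd?
  bytes.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
end

seen = simulate_misread("Guests can relax") # sample text, for illustration only
puts seen.inspect                    # a NUL escape appears before every character
puts Base64.decode64(seen).inspect   # unrelated bytes: the input was never Base64
puts realign_utf16le(seen)           # readable text again
```

Repairing each event this way is fragile, so converting the file to UTF-8 before ingestion is likely the simpler route if the UTF-16 assumption holds.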

【Comments】:
