【Question title】: Unable to load csv file through logstash
【Posted】: 2020-04-03 03:32:01
【Question】:

I am new to ELK and I am trying to load a locally stored .csv file through Logstash so that I can use it with Elasticsearch.

The Logstash config file looks like this:

input {
  file {
    path => "C:\ELK-Stack\Cars Data Set\cars.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["maker","model","mileage","manufacture-year","engine_displacement","engine_power","body_type", "color_slug","stk_year","transmission","door_count","seat_count","fuel_type","date_created","date_last_seen", "price_eur"]
  }
  mutate {convert => ["mileage", "integer"]}
  mutate {convert => ["price_eur", "float"]}
  mutate {convert => ["door_count", "integer"]}
  mutate {convert => ["engine_power", "integer"]}
  mutate {convert => ["seat_count", "integer"]}
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]}
    index => "cars"
    document_type => "sold_cars"
  }
  stdout {}
}

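The path of the file is: C:\ELK-Stack\Cars Data Set\cars.csv

The output I get looks like this:

The .csv file has more than a million rows. Any help would be appreciated.

Edit:

I am now working with another dataset and cannot load it through Logstash either.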

input {
  file {
    path => "C:\ELK-Stack\311.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}

filter {
  csv {
    separator => ","
    columns => ["Unique Key","Created Date","Closed Date","Agency","Agency Name","Complaint Type","Descriptor", "Location Type","Incident Zip","Incident Address","Street Name","Cross Street 1","Cross Street 2","Intersection Street 1","Intersection Street 2", "Address Type", "City", "Landmark", "Facility Type", "Status", "Due Date", "Resolution Description", "Resolution Action Updated Date", "Community Board", "BBL", "Borough", "X Coordinate (State Plane)", "Y Coordinate (State Plane)", "Open Data Channel Type", "Park Facility Name", "Park Borough", "Vehicle Type", "Taxi Company Borough", "Taxi Pick Up Location", "Bridge Highway Name", "Bridge Highway Segment", "Latitude", "Longitude", "Location"]
  }
  mutate {convert => ["Unique Key", "integer"]}
  mutate {convert => ["Created Date", "timestamp"]}
  mutate {convert => ["Closed Date", "timestamp"]}
  mutate {convert => ["Due Date", "timestamp"]}
  mutate {convert => ["Resolution Action Updated Date", "timestamp"]}
  mutate {convert => ["X Coordinate (State Plane)", "integer"]}
  mutate {convert => ["X Coordinate (State Plane)", "integer"]}
  mutate {convert => ["Latitude", "integer"]}
  mutate {convert => ["Longitude", "integer"]}
  mutate {convert => ["Location", "integer"]}
}

output {
  elasticsearch {
      hosts => ["localhost:9200"]
      index => "311"
  }
  stdout {}
}

Any ideas what might be wrong?

【Comments】:

  • Done. Thanks for correcting me.

Tags: csv elasticsearch logstash


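【Answer 1】:

Your configuration has two errors. The first is a typo in the output block: there is a stray closing brace at the end of the hosts line, which closes the elasticsearch block too early. This is what the error log is reporting.

exception => "LogStash::ConfigurationError"

The offending line is: hosts => ["localhost:9200"]}

Here is the corrected output block: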

output {
  elasticsearch {
      hosts => ["localhost:9200"]
      index => "cars"
  }
  stdout {}
}

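The document_type option is also left out of the output block above: you are running Logstash 7.5, and mapping types are deprecated from Elasticsearch 7.0 onward, so the option should no longer be set.

The second error is in your input block. Even on Windows, the path should use forward slashes. In addition, pointing sincedb_path at /dev/null is a Linux/macOS setting; on Windows you should use NUL.

Here is the corrected input block: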

input {
  file {
    path => "C:/ELK-Stack/Cars Data Set/cars.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
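
Putting the two fixes together, the whole pipeline for the cars dataset might look roughly like the sketch below. It only combines the corrected input and output blocks with the csv filter from the question. The one change that is not from the thread is folding the five mutate blocks into a single one using the hash form of convert, which is optional and purely a tidiness suggestion.

```logstash
input {
  file {
    # Forward slashes, even on Windows
    path => "C:/ELK-Stack/Cars Data Set/cars.csv"
    start_position => "beginning"
    # NUL is the Windows counterpart of /dev/null, so no read position is kept
    sincedb_path => "NUL"
  }
}

filter {
  csv {
    separator => ","
    columns => ["maker","model","mileage","manufacture-year","engine_displacement","engine_power","body_type","color_slug","stk_year","transmission","door_count","seat_count","fuel_type","date_created","date_last_seen","price_eur"]
  }
  # Optional: one mutate block with a field => type hash
  # instead of five separate mutate blocks
  mutate {
    convert => {
      "mileage"      => "integer"
      "price_eur"    => "float"
      "door_count"   => "integer"
      "engine_power" => "integer"
      "seat_count"   => "integer"
    }
  }
}

output {
  elasticsearch {
    # The brace that was on this line now closes the block after index
    hosts => ["localhost:9200"]
    index => "cars"
  }
  stdout {}
}
```

Because sincedb_path is NUL, Logstash does not remember how far it has read, so the file is read from the beginning again on every restart. That is convenient while testing but will index the million rows again each time.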

【Comments】:

  • Thank you. This worked, and I can load my data now!
  • Hi @leandrojmp. I am working on another file and am facing the same problem.
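
The thread does not include an answer for the second (311) config, so what follows is only a guess at what might be wrong, not a confirmed fix. The path still uses backslashes, so the forward-slash advice above probably applies again ("C:/ELK-Stack/311.csv"). In addition, the convert option of mutate documents types such as integer, float, string and boolean, and "timestamp" is not among them as far as I know. Date strings are normally parsed with the date filter instead. The sketch below shows that idea for two of the four date columns. The pattern "MM/dd/yyyy hh:mm:ss a" is an assumption about how the dates appear in the file, and converting Latitude and Longitude to float rather than integer is likewise an assumption that they hold decimal degrees.

```logstash
filter {
  # The csv { ... } block from the question goes here, unchanged

  # Parse date strings with the date filter rather than mutate/convert.
  # ASSUMPTION: the pattern is a guess; adjust it to match the actual file.
  date {
    match  => ["Created Date", "MM/dd/yyyy hh:mm:ss a"]
    target => "Created Date"
  }
  date {
    match  => ["Closed Date", "MM/dd/yyyy hh:mm:ss a"]
    target => "Closed Date"
  }

  mutate {
    convert => {
      "Unique Key" => "integer"
      # ASSUMPTION: coordinates are decimal degrees, so float instead of integer
      "Latitude"   => "float"
      "Longitude"  => "float"
    }
  }
}
```

The remaining date columns (Due Date, Resolution Action Updated Date) would get the same kind of date block. If a date string does not match the pattern, the date filter tags the event with _dateparsefailure, which makes a wrong pattern easy to spot in the stdout output.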