Copy ElasticSearch-Index with Logstash: answers

[Question title]: Copy ElasticSearch-Index with Logstash
[Posted]: 2014-11-07 18:23:32
[Question]:

I have a ready-made Apache index on one machine that I would like to clone to another machine using logstash. Fairly easy, I thought:

input {
    elasticsearch {
        host => "xxx.xxx.xxx.xxx"
        index => "logs"
    }
}
filter {

}

output {
    elasticsearch {
        cluster => "Loa"
        host => "127.0.0.1"
        protocol => "http"
        index => "logs"
        index_type => "apache_access"
    }
}

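This pulls the documents over, but it never stops, because it uses the default query "*". The original index has about 50,000 documents; I killed the script when the new index had passed 600,000 documents and was still growing.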

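Next I tried to make sure the documents would be updated instead of duplicated, but this commit has not made it in yet, so I don't have a primary key to work with.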

Then I remembered sincedb, but it does not seem possible to use it in the query (or is it?).

Any advice? Maybe a completely different approach? Thanks a lot!

[Comments]:

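Did you solve this? I have a similar problem, except that my input seems to run only once and does not pick up any new entries in the input index.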

Tags: elasticsearch logstash

[Solution 1]:

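Assuming the elasticsearch input creates a logstash event that carries the document ID (I assume it is _id or something similar), try setting up the elasticsearch output like this: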

output {
    elasticsearch {
        cluster => "Loa"
        host => "127.0.0.1"
        protocol => "http"
        index => "logs"
        index_type => "apache_access"
        document_id => "%{_id}"
    }
}

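That way, even if the elasticsearch input keeps pushing the same documents indefinitely for whatever reason, elasticsearch will only update the existing documents instead of creating new documents with new IDs.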

Once the new index reaches 50,000 documents, you can stop the script.
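To illustrate why a fixed document ID stops the index from growing, here is a toy Python sketch. It is only an in-memory model, not the Elasticsearch or Logstash API: `FakeIndex` and `copy_docs` are hypothetical names made up for this example, and the two sample documents are invented. It assumes only what the answer describes, namely that writing a document under an existing ID overwrites it, while writing without an ID creates a new document each time.

```python
import uuid


class FakeIndex:
    """Toy stand-in for an index: maps document ID -> document body."""

    def __init__(self):
        self.docs = {}

    def index(self, body, doc_id=None):
        # No explicit ID: generate a fresh one, so every call adds a document.
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        # Explicit ID: writing again overwrites the existing entry.
        self.docs[doc_id] = body
        return doc_id


def copy_docs(source, target, passes, keep_id):
    """Replay the whole source `passes` times, like an input that never stops."""
    for _ in range(passes):
        for doc_id, body in source.items():
            target.index(body, doc_id if keep_id else None)


# Invented sample data standing in for the original "logs" index.
source = {
    "a1": {"message": "GET /index.html"},
    "a2": {"message": "GET /logo.png"},
}

without_id = FakeIndex()
copy_docs(source, without_id, passes=3, keep_id=False)

with_id = FakeIndex()
copy_docs(source, with_id, passes=3, keep_id=True)

print(len(without_id.docs), len(with_id.docs))
```

In this model the copy without IDs grows on every pass, while the copy that reuses the source IDs stays at the size of the source, which is the behaviour the `document_id` setting is meant to give.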

[Discussion]:

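Thanks for your answer. That was the second thing I tried, but unfortunately the elasticsearch input does not return the _id field. The commit I linked in my question should fix that, but it has not made it in yet.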