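【Posted】: 2016-05-26 13:05:49
【Question】:
I need to transfer data from MongoDB to Elasticsearch, but I can't use the River plugins because they have been deprecated.
What is the best way to accomplish this?
【Comments】:
- You may want to look at this answer: stackoverflow.com/questions/35374216/…
Tags: mongodb elasticsearch etl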
Dump your MongoDB database with the mongodump utility. You should then be able to import the dump files directly into Elasticsearch using its own tooling, as explained here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/_exploring_your_data.html
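To make the import step more concrete, here is a minimal sketch of turning exported documents into the newline-delimited body that the Elasticsearch `_bulk` endpoint accepts. It assumes the data is available as JSON documents (for example from mongoexport, since mongodump itself writes BSON). The function name `to_bulk_ndjson` and the index name `users` are made up for illustration.

```python
import json


def to_bulk_ndjson(docs, index):
    """Render documents as an Elasticsearch _bulk request body (NDJSON)."""
    lines = []
    for doc in docs:
        source = dict(doc)  # shallow copy so the caller's document is not mutated
        action = {"index": {"_index": index}}
        if "_id" in source:
            # _id is a metadata field in Elasticsearch, so it belongs in the
            # action line rather than in the document body.
            doc_id = source.pop("_id")
            # mongoexport writes ObjectIds as {"$oid": "..."}; unwrap them.
            if isinstance(doc_id, dict) and "$oid" in doc_id:
                doc_id = doc_id["$oid"]
            action["index"]["_id"] = str(doc_id)
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical sample document, not taken from the original post.
    sample = [{"_id": {"$oid": "5c41b6f1a1b2c3d4e5f60718"}, "name": "alice"}]
    print(to_bulk_ndjson(sample, "users"), end="")
```

The resulting text can be sent to `/_bulk` with the `Content-Type: application/x-ndjson` header. Large collections should be split into several requests rather than sent as one body.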
【Comments】:
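Here is another good option I found for migrating MongoDB data to Elasticsearch: Monstache, a Go daemon that syncs MongoDB to Elasticsearch in real time. It is available at https://rwynn.github.io/monstache-site/
Below is the initial setup I used to configure and run it.
Step 1: Start mongod as a replica set named test. Monstache follows the oplog, which only exists on replica set members.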
C:\Program Files\MongoDB\Server\4.0\bin>mongod --smallfiles --oplogSize 50 --replSet test
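Step 2: Connect with the mongo shell. On a fresh data directory, run rs.initiate() once so that the node becomes PRIMARY.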
C:\Program Files\MongoDB\Server\4.0\bin>mongo
MongoDB shell version v4.0.2
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.2
Server has startup warnings:
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten]
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten]
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** WARNING: This server is bound to localhost.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** Remote systems will be unable to connect to this server.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** Start the server with --bind_ip <address> to specify which IP
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** addresses it should serve responses from, or with --bind_ip_all to
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** bind to all interfaces. If this behavior is desired, start the
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** server with --bind_ip 127.0.0.1 to disable this warning.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten]
MongoDB Enterprise test:PRIMARY>
Step 3: Verify replication.
MongoDB Enterprise test:PRIMARY> rs.status();
{
"set" : "test",
"date" : ISODate("2019-01-18T11:39:00.380Z"),
"myState" : 1,
"term" : NumberLong(2),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"appliedOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"durableOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
}
},
"lastStableCheckpointTimestamp" : Timestamp(1547811517, 1),
"members" : [
{
"_id" : 0,
"name" : "localhost:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 736,
"optime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2019-01-18T11:38:57Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1547810805, 1),
"electionDate" : ISODate("2019-01-18T11:26:45Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
}
],
"ok" : 1,
"operationTime" : Timestamp(1547811537, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1547811537, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
MongoDB Enterprise test:PRIMARY>
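The thing to look for in that output is a healthy member whose stateStr is PRIMARY. As a rough sketch, the same check can be written against the status document once it has been loaded as a Python dict. The helper name `find_primary` is hypothetical, and the sample dict only copies the relevant fields from the output above.

```python
def find_primary(status):
    """Return the name of the healthy PRIMARY member in an rs.status()-style dict, or None."""
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY" and member.get("health") == 1:
            return member.get("name")
    return None


# Trimmed copy of the rs.status() output shown above.
status = {
    "set": "test",
    "myState": 1,
    "members": [
        {"_id": 0, "name": "localhost:27017", "health": 1, "stateStr": "PRIMARY"},
    ],
}

print(find_primary(status))  # prints localhost:27017
```

If this returns None, the replica set has not elected a primary yet, and Monstache would have nothing to follow.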
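Step 4:
Download the latest release from https://github.com/rwynn/monstache/releases.
Unzip the download and add the folder for your platform to your PATH variable.
Open cmd and run "monstache -v"
# 4.13.1
Monstache is configured in TOML format. Create a configuration file for the migration named config.toml.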
Step 5:
My config.toml:
mongo-url = "mongodb://127.0.0.1:27017/?replicaSet=test"
elasticsearch-urls = ["http://localhost:9200"]
direct-read-namespaces = [ "admin.users" ]
gzip = true
stats = true
index-stats = true
elasticsearch-max-conns = 4
elasticsearch-max-seconds = 5
elasticsearch-max-bytes = 8000000
dropped-collections = false
dropped-databases = false
resume = true
resume-write-unsafe = true
resume-name = "default"
index-files = false
file-highlighting = false
verbose = true
exit-after-direct-reads = false
index-as-update = true
index-oplog-time = true
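As I read the Monstache docs, elasticsearch-max-bytes and elasticsearch-max-seconds decide when buffered bulk requests get flushed to Elasticsearch: whichever limit is reached first triggers the flush. The sketch below only illustrates that size-or-age idea and is not Monstache's actual implementation. The class name `BulkBuffer` and its parameters are invented for this example.

```python
import time


class BulkBuffer:
    """Collects payloads and flushes on a byte or age threshold (illustration only)."""

    def __init__(self, max_bytes, max_seconds, flush, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.flush = flush      # callback that receives the list of buffered payloads
        self.clock = clock      # injectable so the example runs without sleeping
        self.items = []
        self.size = 0
        self.started = None     # arrival time of the oldest buffered payload

    def add(self, payload):
        if not self.items:
            self.started = self.clock()
        self.items.append(payload)
        self.size += len(payload)
        self._maybe_flush()

    def tick(self):
        """Call periodically so that an idle buffer still flushes by age."""
        self._maybe_flush()

    def _maybe_flush(self):
        if not self.items:
            return
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.started >= self.max_seconds
        if too_big or too_old:
            batch, self.items, self.size = self.items, [], 0
            self.flush(batch)


# Small thresholds and a fake clock keep the demonstration deterministic.
batches = []
now = [0.0]
buf = BulkBuffer(max_bytes=10, max_seconds=5, flush=batches.append, clock=lambda: now[0])
buf.add(b"12345")    # 5 bytes, stays buffered
buf.add(b"678901")   # 11 bytes in total, flushed by size
now[0] = 1.0
buf.add(b"ab")       # buffered again
now[0] = 6.0
buf.tick()           # oldest payload is 5 seconds old, flushed by age
print(len(batches))  # prints 2
```

With the values in my config.toml the limits would be 8000000 bytes and 5 seconds, so small trickles of changes go out every few seconds while heavy direct reads go out in large batches.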
Step 6: Run Monstache with the configuration file.
D:\15-1-19>monstache -f config.toml
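Once Monstache is running, one quick sanity check is to ask Elasticsearch how many documents have arrived. This is a rough sketch using the `_count` endpoint. It assumes the index is named after the MongoDB namespace (here `admin.users`), which I believe is the Monstache default but is worth confirming for your setup. The helper names are made up for this example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def count_url(base_url, index):
    """Build the _count URL for an index."""
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_count"


def parse_count(body):
    """Extract the document count from a _count response body."""
    return int(json.loads(body)["count"])


def fetch_count(base_url, index, timeout=5):
    """Query a running Elasticsearch node; needs network access to base_url."""
    with urlopen(count_url(base_url, index), timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))


print(count_url("http://localhost:9200", "admin.users"))
```

Calling `fetch_count("http://localhost:9200", "admin.users")` against the node from config.toml should return a number close to the size of the source collection once the direct read has finished.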
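【Comments】:
After many failed attempts with tools such as Monstache, Apache NiFi, and Logstash, I found a great tool: Metabase. It is very user-friendly and performs well at syncing a MongoDB database into a web-based application that lets you visualize and analyze your data.
【Comments】: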