【问题标题】:Elasticsearch Master not discovered exceptionElasticsearch Master 未发现异常
【发布时间】:2018-05-29 01:18:36
【问题描述】:

我正在运行一个 5 节点的 elasticsearch 集群(2 个数据节点、2 个主节点、1 个 kibana)。

使用命令时出现以下错误

curl -X GET "192.168.107.75:9200/_cat/master?v"

{"error":{"root_cause":[{"type":"master_not_discovered_exception","re​​ason":null} ],"type":"master_not_discovered_exception","re​​ason":null},"status":503}

我正在使用下面的命令来运行 elastic

sudo systemctl start elasticsearch.service

这是我在日志中看到的消息:

[2018-05-28T21:02:22,074][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:25,076][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:28,077][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:31,079][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:34,081][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:37,084][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:40,090][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] failed to connect to master [{node-master-2}{_M4BTrFbQguT3PbY5d2_JA}{1rzJcDPSQ5OH2OZ_CnhR-g}{192.168.107.76}{192.168.107.76:9300}], retrying...
org.elasticsearch.transport.ConnectTransportException: [node-master-2][192.168.107.76:9300] connect_exception
    at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:165) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:513) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:331) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:318) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:515) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:483) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:90) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1253) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.4.jar:6.2.4]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_172]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_172]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: 192.168.107.76/192.168.107.76:9300
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
    ... 1 more
Caused by: java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
    ... 1 more

在 ealsticsearch.yml 文件中,除了为节点分配不同角色的配置之外,我使用以下配置:

cluster.name: test_cluster
network.host: 192.168.107.71
discovery.zen.ping.unicast.hosts: ["192.168.107.73", "192.168.107.74", "192.168.107.75", "192.168.107.76"]
#the above two configuration IPs change as per the node
discovery.zen.minimum_master_nodes: 2

主机可以ping通并且可以相互访问。

任何帮助将不胜感激。

【问题讨论】:

    标签: elasticsearch


    【解决方案1】:

    我认为问题很明显,[node-master-2][192.168.107.76] 要么无法从此主机访问,要么 [node-master-2] 上的弹性进程已关闭。

    您可以检查来自该主机的curl -XGET "192.168.107.76:9200" 是否有有效的答案。

    弹性文件也明确表示:

    建议避免只有两个符合条件的主节点, 因为两个的法定人数是两个。因此,失去任何一个主人 符合条件的节点将导致集群无法运行。

    【讨论】:

      【解决方案2】:

      ElasticSearch install guide 提供了如何修复 master_not_discovered_exception 异常的指导。基本上你可以得到这个错误有几个原因:

      • 防火墙规则阻止通信
      • 无法解析主/数据主机名(您使用 IP 地址时不会这样)
      • elasticsearch.yml 配置不正确(例如,主节点未配置为主节点,或运行在不同的端口/IP 地址上)。

      可以通过 telnet 轻松检查第一项和第二项(从主 telnet 到数据节点,反之亦然)。

      【讨论】:

        猜你喜欢
        • 1970-01-01
        • 2016-06-01
        • 1970-01-01
        • 2010-12-11
        • 1970-01-01
        • 2016-03-06
        • 1970-01-01
        • 2014-04-22
        • 2013-07-28
        相关资源
        最近更新 更多