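【Posted】: 2018-02-16 10:48:04
【Question】:
I have a 15-node Cassandra 3.9 cluster. I recently ran into a problem where one of my nodes piled up GossipStage messages. Following guidance I found in similar reports, I ran "nodetool resetlocalschema" on that node. Gossip warnings like this one still keep showing up in the logs, though: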
WARN [GossipTasks:1] 2018-02-11 23:55:34,197 Gossiper.java:771 - Gossip stage has 180317 pending tasks; skipping status check (no nodes will be marked down)
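I also see the following exception. Any guidance on how to get past this and bring the node back to normal? I should also mention that I have PasswordAuthenticator enabled in my cassandra.yaml file.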
ERROR [Native-Transport-Requests-1] 2018-02-11 23:55:33,581 Message.java:617 - Unexpected exception during request; channel = [id: 0xbaa65545, L:/10.1.21.51:9042 - R:/10.1.86.40:35082]
java.lang.RuntimeException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Unknown keyspace/cf pair (system_auth.roles)
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:107) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.access$300(PasswordAuthenticator.java:59) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:220) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.9.jar:3.9]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Unknown keyspace/cf pair (system_auth.roles)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3937) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) ~[guava-18.0.jar:na]
at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:88) ~[apache-cassandra-3.9.jar:3.9]
... 13 common frames omitted
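One way to tell whether the GossipStage backlog is actually draining (for example after a restart) is to pull the pending-task count out of warnings like the one quoted above and watch how it changes over time. The following is only a minimal sketch: the regular expression is written against the single warning line shown in this question, the function name `gossip_backlog` is made up for illustration, and other Cassandra versions or log layouts may need a different pattern.

```python
import re

# Pattern modelled on the one WARN line quoted in the question; an assumption,
# not an official log format specification.
_GOSSIP_WARN = re.compile(
    r"WARN\s+\[GossipTasks:\d+\]\s+"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+.*?"
    r"Gossip stage has (\d+) pending tasks"
)


def gossip_backlog(log_lines):
    """Return (timestamp, pending_tasks) pairs for every matching warning."""
    samples = []
    for line in log_lines:
        match = _GOSSIP_WARN.search(line)
        if match:
            samples.append((match.group(1), int(match.group(2))))
    return samples
```

Feeding it the lines of the node's system log (for example `gossip_backlog(open(path))`, where `path` is wherever your installation writes the log) gives a series of counts; if the numbers keep growing, the node is not catching up. `nodetool tpstats` on the affected node also reports pending GossipStage tasks directly.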
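【Discussion】:
-
Not sure why anyone would recommend resetlocalschema. In any case, restart the problematic node; it will catch up and refresh its schema as it comes back up.
-
Thanks for the reply. I have already tried that and it does not seem to help. I keep seeing the errors above flooding the logs, and they do not subside. nodetool status has also marked the node as DN. Is there anything else I can try?
-
Can you paste the output of "nodetool describecluster" here? Get it from every node in the cluster.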
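Since the request is to compare "nodetool describecluster" across all nodes, a small helper can summarize which hosts report which schema version. This is a rough sketch under several assumptions: nodetool is on the PATH, JMX on each node is reachable with `nodetool -h <host>`, and the "Schema versions" section lists entries in the form `<uuid>: [ip, ip]` plus an optional `UNREACHABLE: [...]` entry. The function names are hypothetical.

```python
import re
import subprocess

# Matches "<schema-uuid>: [ip, ip]" and "UNREACHABLE: [ip]" lines; assumed layout.
_VERSION_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")


def parse_schema_versions(describecluster_output):
    """Map each schema version (or 'UNREACHABLE') to the hosts listed for it."""
    versions = {}
    for line in describecluster_output.splitlines():
        match = _VERSION_LINE.match(line)
        if match:
            hosts = [h.strip() for h in match.group(2).split(",") if h.strip()]
            versions.setdefault(match.group(1), []).extend(hosts)
    return versions


def has_schema_disagreement(versions):
    """True if reachable nodes report more than one schema version."""
    return len([v for v in versions if v != "UNREACHABLE"]) > 1


def describecluster(host):
    """Run nodetool against one host (assumes remote JMX access is allowed)."""
    result = subprocess.run(
        ["nodetool", "-h", host, "describecluster"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `parse_schema_versions(describecluster(host))` for each node and comparing the results shows whether the nodes agree on one schema version and which hosts each node considers unreachable.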
-
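The output is long, so I will split it across multiple comments.
-
The nodes in my cluster have IPs ending in 41-55. Node 42 was recently removed from the cluster, and 51 is the rogue node I am trying to wrestle with.
Tags: cassandra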