【Question title】: Cassandra Hector - UnavailableException
【Posted】: 2013-09-30 14:34:04
【Question】:

I am trying to insert records using Hector, but from time to time I get this error:

me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:59)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:264)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
    at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
    at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:163)
    at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
    at ustocassandra.USToCassandraHector.consumer(USToCassandraHector.java:271)
    at ustocassandra.USToCassandraHector.access$100(USToCassandraHector.java:41)
    at ustocassandra.USToCassandraHector$2.run(USToCassandraHector.java:71)
    at java.lang.Thread.run(Thread.java:724)
Caused by: UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
    at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
    at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
    ... 9 more

I know the usual explanation is that not enough nodes are up, but that is not the case here. All of my nodes are up:

./nodetool ring
Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: DC1
==========
Address         Rack        Status State   Load            Owns                Token
                                                                               4611686018427388000
172.16.217.222  RAC1        Up     Normal  353.36 MB       25.00%              -9223372036854775808
172.16.217.223  RAC2        Up     Normal  180.84 MB       25.00%              -4611686018427388000
172.16.217.224  RAC3        Up     Normal  260.34 MB       25.00%              -2
172.16.217.225  RAC4        Up     Normal  222.71 MB       25.00%              4611686018427388000

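I am inserting records with 20 threads (maybe I should use fewer? As far as I know, the error in that case would be Overloaded, not Unavailable). I am using a write consistency level of ONE, with AutoDiscoveryAtStartup and LeastActiveBalancingPolicy. The replication factor is 2.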
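
For context, the number of replicas each consistency level requires is simple arithmetic. The sketch below is a hypothetical helper (the class `ReplicaMath` and its methods are illustrative names, not part of Hector or Cassandra) and assumes a single-datacenter keyspace. It shows why, with a replication factor of 2 and consistency ONE, only one of the two replicas for a key has to be considered alive by the coordinator.

```java
public class ReplicaMath {
    /** Replicas needed at QUORUM: floor(rf / 2) + 1. */
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * Replicas the coordinator must see as alive before it accepts a request.
     * Hypothetical helper: level names are passed as plain strings here.
     */
    public static int requiredReplicas(String consistencyLevel, int replicationFactor) {
        switch (consistencyLevel) {
            case "ONE":    return 1;
            case "TWO":    return 2;
            case "QUORUM": return quorum(replicationFactor);
            case "ALL":    return replicationFactor;
            default:
                throw new IllegalArgumentException("unsupported level: " + consistencyLevel);
        }
    }

    public static void main(String[] args) {
        int rf = 2; // replication factor from the question
        for (String level : new String[] {"ONE", "QUORUM", "ALL"}) {
            System.out.println(level + " with RF " + rf + " needs "
                    + requiredReplicas(level, rf) + " live replica(s)");
        }
    }
}
```

Under these assumptions, a write at ONE with RF 2 should be rejected as unavailable only when the coordinator regards both replicas of the key as down, which can happen even while `nodetool ring` reports every node as Up a moment later.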

I am using Cassandra 1.2.8 (I also tried 2.0 and the behavior is the same).

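The error does not occur from the start. I usually manage to insert about 2 million records before I get it. My code is set up to retry when the error occurs. After a few dozen retries, the insert usually succeeds. After that, it works fine again for millions of inserts, then I get the error again, and the cycle repeats.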
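
The retry loop described here is not shown in the question. As a rough sketch of one way such a loop could be written, the hypothetical `BackoffRetry` helper below retries an operation with exponential backoff, so that repeated retries do not add more pressure to a struggling cluster. The class name, method name, and parameters are assumptions for illustration. The sketch retries on any `Exception`, whereas real code would more likely catch only Hector's `HUnavailableException`.

```java
import java.util.concurrent.Callable;

public class BackoffRetry {
    /**
     * Runs op, retrying on failure up to maxAttempts times in total.
     * The pause between attempts doubles each time, capped at maxDelayMs.
     */
    public static <T> T callWithBackoff(Callable<T> op, int maxAttempts,
                                        long initialDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Assumption: every exception is treated as retryable in this sketch.
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts, surface the last failure
                }
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
    }
}
```

A call might look like `BackoffRetry.callWithBackoff(() -> { template.update(updater); return null; }, 10, 50, 5000)`, where `template` and `updater` stand in for the objects used in the original code.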

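Could it be because I set gc_grace = 60? In any case, I do not get the error every 60 seconds, so I do not think that is the cause.

Can you give me some advice about the cause of this error and what I should do about it?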

Edit:

'nodetool tpstats' says that some messages have been dropped:

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                    11
_TRACE                       0

I see the following warnings in the log file:

 WARN [ScheduledTasks:1] 2013-09-30 09:20:16,633 GCInspector.java (line 136) Heap is 0.853986836999536 full.  You may need to reduce memtable and/or cache sizes.  Cassandra is now reducing cache sizes to free up memory.  Adjust reduce_cache_sizes_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 WARN [ScheduledTasks:1] 2013-09-30 09:20:16,634 AutoSavingCache.java (line 185) Reducing KeyCache capacity from 1073741824 to 724 to reduce memory pressure
 WARN [ScheduledTasks:1] 2013-09-30 09:20:16,634 GCInspector.java (line 142) Heap is 0.853986836999536 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 WARN [ScheduledTasks:1] 2013-09-30 09:20:16,634 StorageService.java (line 3618) Flushing CFS(Keyspace='us', ColumnFamily='my_cf') to relieve memory pressure

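This is exactly when Hector throws the Unavailable exception, so it is probably a memory-related problem. I think I will try what the warnings suggest: reduce the memtable size.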

【Question discussion】:

    Tags: java cassandra hector


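    【Answer 1】:

    This is probably because your servers are overloaded, so some nodes are not responding. There is no OverloadedException (an overloaded node looks just like an unavailable one).

    You should check your Cassandra logs: are there warnings about the heap being full? Are any dropped messages listed in nodetool tpstats? What is the CPU load on your servers?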
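
If the nodes do turn out to be overloaded, one client-side option is to cap the number of writes in flight instead of letting all 20 threads hit the cluster at once. The following is a minimal sketch using a `java.util.concurrent.Semaphore`. The `WriteThrottle` class is a hypothetical name and is not part of Hector, and a suitable limit would have to be found by experiment.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

public class WriteThrottle {
    private final Semaphore permits;

    /** maxInFlight is the largest number of writes allowed to run concurrently. */
    public WriteThrottle(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks until a permit is free, runs the write, then always returns the permit. */
    public <T> T submit(Callable<T> write) throws Exception {
        permits.acquire();
        try {
            return write.call();
        } finally {
            permits.release();
        }
    }

    /** Permits currently unused; handy for monitoring. */
    public int availablePermits() {
        return permits.availablePermits();
    }
}
```

All writer threads would share one `WriteThrottle` instance and wrap each insert in `submit(...)`, so lowering the limit reduces load without changing the thread count.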

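    【Discussion】:

    • Thanks, it looks like (from the log files and 'tpstats') my nodes are indeed overloaded. I have edited my question. Any other suggestions would be welcome :)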