【问题标题】:Read Failure in Cassandra阅读 Cassandra 中的失败
【发布时间】:2018-09-19 16:26:22
【问题描述】:

我有一个单节点 cassandra 设置。我在 cassandra cqlsh 在表上运行 select count(*) where 查询时收到以下错误:

完整查询:

SELECT count(*) FROM casb.o365_activity_log_by_date WHERE
creation_time > '2018-09-16 00:00:00' and creation_time < '2018-09-16 23:59:59' 
ALLOW FILTERING;

回复信息:

ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] 
message="Operation failed - received 0 responses and 1 failures" 
info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

表架构:

CREATE TABLE IF NOT EXISTS casb.o365_activity_log_by_date (
    current_date date,
    creation_time timestamp,
    insertion_time timestamp,
    id text,
    client_ip text,
    workload text,
    operation text,
    user_id text,
    object_id text,
    activity_detail text,
    PRIMARY KEY ((current_date), insertion_time, id)
    )
) WITH CLUSTERING ORDER BY (insertion_time DESC, id DESC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

我有另一个基于 python 的应用程序正在从该表中读取数据,但工作似乎被卡住了。

日志:

/var/log/cassandra/system.log

WARN  [ReadStage-2] 2018-09-16 22:06:48,803 ReadCommand.java:533 - Read 58545 live rows and 100001 tombstone cells for query SELECT * FROM casb.o365_activity_log_by_date WHERE creation_time > 2018-09-16 00:00Z AND creation_time < 2018-09-16 23:59Z LIMIT 100 (see tombstone_warn_threshold)
ERROR [ReadStage-2] 2018-09-16 22:06:48,804 StorageProxy.java:1906 - Scanned over 100001 tombstones during query 'SELECT * FROM casb.o365_activity_log_by_date WHERE creation_time > 2018-09-16 00:00Z AND creation_time < 2018-09-16 23:59Z LIMIT 100' (last scanned row partion key was ((2018-09-15), 2018-09-15 08:09Z, 72160ee4-5310-4941-af92-d27ced9c9ca8)); query aborted
WARN  [Native-Transport-Requests-1] 2018-09-16 22:07:02,937 SelectStatement.java:430 - Aggregation query used without partition key
WARN  [Native-Transport-Requests-1] 2018-09-16 22:07:45,946 SelectStatement.java:430 - Aggregation query used without partition key
WARN  [ReadStage-2] 2018-09-16 22:07:47,200 ReadCommand.java:533 - Read 58545 live rows and 100001 tombstone cells for query SELECT * FROM casb.o365_activity_log_by_date WHERE creation_time > 2018-09-16 00:00Z AND creation_time < 2018-09-16 23:59Z LIMIT 100 (see tombstone_warn_threshold)
ERROR [ReadStage-2] 2018-09-16 22:07:47,200 StorageProxy.java:1906 - Scanned over 100001 tombstones during query 'SELECT * FROM casb.o365_activity_log_by_date WHERE creation_time > 2018-09-16 00:00Z AND creation_time < 2018-09-16 23:59Z LIMIT 100' (last scanned row partion key was ((2018-09-15), 2018-09-15 08:09Z, 72160ee4-5310-4941-af92-d27ced9c9ca8)); query aborted
WARN  [Native-Transport-Requests-1] 2018-09-16 22:17:52,810 SelectStatement.java:430 - Aggregation query used without partition key
WARN  [ReadStage-2] 2018-09-16 22:17:54,513 ReadCommand.java:533 - Read 58545 live rows and 100001 tombstone cells for query SELECT * FROM casb.o365_activity_log_by_date WHERE creation_time > 2018-09-17 00:00Z AND creation_time < 2018-09-17 23:59Z LIMIT 100 (see tombstone_warn_threshold)
ERROR [ReadStage-2] 2018-09-16 22:17:54,513 StorageProxy.java:1906 - Scanned over 100001 tombstones during query 'SELECT * FROM casb.o365_activity_log_by_date WHERE creation_time > 2018-09-17 00:00Z AND creation_time < 2018-09-17 23:59Z LIMIT 100' (last scanned row partion key was ((2018-09-15), 2018-09-15 08:09Z, 72160ee4-5310-4941-af92-d27ced9c9ca8)); query aborted
WARN  [Native-Transport-Requests-3] 2018-09-16 22:18:09,541 SelectStatement.java:430 - Aggregation query used without partition key
INFO  [ScheduledTasks:1] 2018-09-16 22:18:17,143 NoSpamLogger.java:91 - Some operations were slow, details available at debug level (debug.log)
WARN  [Native-Transport-Requests-1] 2018-09-16 22:18:28,160 SelectStatement.java:430 - Aggregation query used without partition key
WARN  [Native-Transport-Requests-1] 2018-09-16 22:18:47,943 SelectStatement.java:430 - Aggregation query used without partition key
INFO  [CompactionExecutor:75] 2018-09-16 22:28:26,738 AutoSavingCache.java:394 - Saved KeyCache (48 items) in 250 ms
INFO  [IndexSummaryManager:1] 2018-09-16 22:29:27,992 IndexSummaryRedistribution.java:75 - Redistributing index summaries

更多详情:

我可以在同一张表上运行以下查询而不会出现任何错误:

SELECT * FROM casb.o365_activity_log_by_date;

在上面的查询中运行,我可以看到有一些列中大部分为空值。看到这一点,从日志中我猜这与 cassandra 中的墓碑有关。

我应该在这里做什么?我查看了this answer,所以我应该清理墓碑吗?我不知道。

【问题讨论】:

  • 需要注意的是,查询必须从整个集群中读取整个数据集。如果这不会因为太多的墓碑或其他原因而失败,那么它将超时

标签: cassandra cassandra-3.0


【解决方案1】:

您的查询对 Cassandra 使用了两种可能的反模式。

首先,您尝试计算整个数据库中的所有键。这将导致在 Cassandra 中读取磁盘上的所有全部数据,因为没有像 RDBMS 中那样可以快速回答您的问题的索引。请注意 SELECT * FROM foo;SELECT count(*) FROM bar; 在 Cassandra 中总是很慢。

其次,您忽略了 Cassandra 发出的警告以执行这些操作 - ALLOW FILTERING 必须明确写入。请记住,这是为了保护您进行整个集群读取。您的 select 语句正在过滤不属于您的主键的 creation_time

所以我敢打赌,您在查询运行时会遇到超时。查看来自 Cassandra 的 system.log,当从软件包安装时,通常可以在 /var/log/cassandra/system.log 下找到。

一般来说 - 如果您想使用 Cassandra,我建议您学习一些数据建模课程,例如 DataStax 提供的课程。它通常归结为:围绕您将使用的查询构建数据模型 - 如有必要,进行非规范化,以便您的查询理想地只命中一个分区键。

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2014-01-20
    • 2019-07-25
    • 1970-01-01
    • 1970-01-01
    • 2018-09-19
    • 2016-01-12
    相关资源
    最近更新 更多