【Question title】: Slow reads in Cassandra cluster
【Posted】: 2015-06-23 21:08:10
【Question】:

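I recently started a Cassandra cluster of 3 machines. I had it all working fine, but after I had to reset one of the nodes (explained at the bottom of this post), I am having problems with reads on one of my largest tables (see the traces below).

I think I have a pretty straightforward partition and clustering key setup, and I did not have this problem before the crash, so I don't think that is the issue.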

CREATE TABLE datachannel_6min (
  channel_id int,
  time_start timestamp,
  power_avg float,
  power_min float,
  power_max float,
  energy float,
  temperature_in float,
  PRIMARY KEY (channel_id, time_start)
);

The query is a single-row select that uses the composite key.

select * from datachannel_6min where channel_id = 1028 order by time_start desc limit 1;

Here are 4 example traces. As you can see, they are not always quite the same.

                                                                        activity                  | timestamp    | source   | source_elapsed
--------------------------------------------------------------------------------------------------+--------------+----------+----------------
                                                                               execute_cql3_query | 09:00:11,930 | 10.1.1.5 |              0
 Parsing select * from datachannel_6min where channel_id = 1042 order by time_start desc limit 1; | 09:00:11,930 | 10.1.1.5 |            102
                                                                              Preparing statement | 09:00:11,930 | 10.1.1.5 |            233
                                             Executing single-partition query on datachannel_6min | 09:00:11,931 | 10.1.1.5 |           1135
                                                                     Acquiring sstable references | 09:00:11,931 | 10.1.1.5 |           1163
                                                                      Merging memtable tombstones | 09:00:11,931 | 10.1.1.5 |           1185
                                                                  Key cache hit for sstable 14912 | 09:00:11,931 | 10.1.1.5 |           1223
                                                Seeking to partition indexed section in data file | 09:00:11,931 | 10.1.1.5 |           1230
                                                                  Key cache hit for sstable 14823 | 09:00:11,984 | 10.1.1.5 |          53805
                                                Seeking to partition indexed section in data file | 09:00:11,984 | 10.1.1.5 |          53851
                                                                  Key cache hit for sstable 14786 | 09:00:12,059 | 10.1.1.5 |         129027
                                                Seeking to partition indexed section in data file | 09:00:12,059 | 10.1.1.5 |         129060
                                                                  Key cache hit for sstable 14749 | 09:00:12,241 | 10.1.1.5 |         311521
                                                Seeking to partition indexed section in data file | 09:00:12,241 | 10.1.1.5 |         311558
                                                                  Key cache hit for sstable 14714 | 09:00:12,242 | 10.1.1.5 |         311843
                                                Seeking to partition indexed section in data file | 09:00:12,242 | 10.1.1.5 |         311849
                                           Partition index with 0 entries found for sstable 14913 | 09:00:12,242 | 10.1.1.5 |         312153
                                                Seeking to partition indexed section in data file | 09:00:12,242 | 10.1.1.5 |         312159
                                           Partition index with 0 entries found for sstable 14914 | 09:00:12,354 | 10.1.1.5 |         423820
                                                Seeking to partition indexed section in data file | 09:00:12,354 | 10.1.1.5 |         423849
                                           Partition index with 0 entries found for sstable 14916 | 09:00:12,354 | 10.1.1.5 |         424455
                                                Seeking to partition indexed section in data file | 09:00:12,354 | 10.1.1.5 |         424463
                                           Partition index with 0 entries found for sstable 14915 | 09:00:12,420 | 10.1.1.5 |         490468
                                                Seeking to partition indexed section in data file | 09:00:12,420 | 10.1.1.5 |         490501
                                           Partition index with 0 entries found for sstable 14917 | 09:00:12,492 | 10.1.1.5 |         561711
                                                Seeking to partition indexed section in data file | 09:00:12,492 | 10.1.1.5 |         561748
                                         Partition index with 146 entries found for sstable 14918 | 09:00:12,696 | 10.1.1.5 |         766248
                                                Seeking to partition indexed section in data file | 09:00:12,696 | 10.1.1.5 |         766306
                       Skipped 0/11 non-slice-intersecting sstables, included 0 due to tombstones | 09:00:12,696 | 10.1.1.5 |         766323
                                                      Merging data from memtables and 11 sstables | 09:00:12,696 | 10.1.1.5 |         766329
                                                               Read 2 live and 0 tombstoned cells | 09:00:12,773 | 10.1.1.5 |         842632
                                                                                 Request complete | 09:00:12,773 | 10.1.1.5 |         843350



                                                                        activity                  | timestamp    | source   | source_elapsed
--------------------------------------------------------------------------------------------------+--------------+----------+----------------
                                                                               execute_cql3_query | 09:05:46,255 | 10.1.1.4 |              0
                                                                  Message received from /10.1.1.4 | 09:05:46,250 | 10.1.1.5 |             21
                                             Executing single-partition query on datachannel_6min | 09:05:46,250 | 10.1.1.5 |            520
                                                                     Acquiring sstable references | 09:05:46,250 | 10.1.1.5 |            593
                                                                      Merging memtable tombstones | 09:05:46,250 | 10.1.1.5 |            609
                                                       Bloom filter allows skipping sstable 14912 | 09:05:46,250 | 10.1.1.5 |            630
                                                       Bloom filter allows skipping sstable 14823 | 09:05:46,250 | 10.1.1.5 |            641
                                                       Bloom filter allows skipping sstable 14786 | 09:05:46,250 | 10.1.1.5 |            647
                                                       Bloom filter allows skipping sstable 14749 | 09:05:46,250 | 10.1.1.5 |            654
                                                       Bloom filter allows skipping sstable 14714 | 09:05:46,251 | 10.1.1.5 |            757
                                                       Bloom filter allows skipping sstable 14913 | 09:05:46,251 | 10.1.1.5 |            763
                                                       Bloom filter allows skipping sstable 14914 | 09:05:46,251 | 10.1.1.5 |            770
                                                       Bloom filter allows skipping sstable 14916 | 09:05:46,251 | 10.1.1.5 |            776
                                                       Bloom filter allows skipping sstable 14915 | 09:05:46,251 | 10.1.1.5 |            783
                                                       Bloom filter allows skipping sstable 14917 | 09:05:46,251 | 10.1.1.5 |            789
 Parsing select * from datachannel_6min where channel_id = 1036 order by time_start desc limit 1; | 09:05:46,255 | 10.1.1.4 |            103
                                                                              Preparing statement | 09:05:46,255 | 10.1.1.4 |            223
                                                                     Sending message to /10.1.1.5 | 09:05:46,256 | 10.1.1.4 |            673
                                          Partition index with 17 entries found for sstable 14918 | 09:05:46,534 | 10.1.1.5 |         283815
                                                Seeking to partition indexed section in data file | 09:05:46,534 | 10.1.1.5 |         283851
                       Skipped 0/11 non-slice-intersecting sstables, included 0 due to tombstones | 09:05:46,534 | 10.1.1.5 |         283867
                                                       Merging data from memtables and 1 sstables | 09:05:46,534 | 10.1.1.5 |         283873
                                                               Read 2 live and 0 tombstoned cells | 09:05:46,571 | 10.1.1.5 |         321319
                                                                  Enqueuing response to /10.1.1.4 | 09:05:46,571 | 10.1.1.5 |         321439
                                                                     Sending message to /10.1.1.4 | 09:05:46,571 | 10.1.1.5 |         321613
                                                                  Message received from /10.1.1.5 | 09:05:46,579 | 10.1.1.4 |         323621
                                                               Processing response from /10.1.1.5 | 09:05:46,579 | 10.1.1.4 |         323730                                                               
                                                                                 Request complete | 09:05:46,579 | 10.1.1.4 |         324458



                                                                        activity                  | timestamp    | source   | source_elapsed
--------------------------------------------------------------------------------------------------+--------------+----------+----------------
                                                                               execute_cql3_query | 05:39:12,430 | 10.1.1.4 |              0
 Parsing select * from datachannel_6min where channel_id = 1030 order by time_start desc limit 1; | 05:39:12,430 | 10.1.1.4 |            164
                                                                              Preparing statement | 05:39:12,430 | 10.1.1.4 |            310
                                                                     Sending message to /10.1.1.6 | 05:39:12,431 | 10.1.1.4 |            829
                                                                  Message received from /10.1.1.4 | 05:39:12,432 | 10.1.1.6 |             19
                                             Executing single-partition query on datachannel_6min | 05:39:12,433 | 10.1.1.6 |            719
                                                                     Acquiring sstable references | 05:39:12,433 | 10.1.1.6 |            742
                                                                      Merging memtable tombstones | 05:39:12,433 | 10.1.1.6 |            769
                                                        Bloom filter allows skipping sstable 1476 | 05:39:12,433 | 10.1.1.6 |            830
                                            Partition index with 0 entries found for sstable 1475 | 05:39:12,433 | 10.1.1.6 |            904
                                                Seeking to partition indexed section in data file | 05:39:12,433 | 10.1.1.6 |            919
                                            Partition index with 2 entries found for sstable 1346 | 05:39:12,434 | 10.1.1.6 |           1403
                                                Seeking to partition indexed section in data file | 05:39:12,434 | 10.1.1.6 |           1425
                                            Partition index with 2 entries found for sstable 1472 | 05:39:12,434 | 10.1.1.6 |           1511
                                                Seeking to partition indexed section in data file | 05:39:12,434 | 10.1.1.6 |           1522
                                             Partition index with 0 entries found for sstable 586 | 05:39:12,434 | 10.1.1.6 |           1567
                                                Seeking to partition indexed section in data file | 05:39:12,434 | 10.1.1.6 |           1578
                                             Partition index with 146 entries found for sstable 5 | 05:39:12,434 | 10.1.1.6 |           2132
                                                Seeking to partition indexed section in data file | 05:39:12,434 | 10.1.1.6 |           2152
                        Skipped 0/6 non-slice-intersecting sstables, included 0 due to tombstones | 05:39:12,434 | 10.1.1.6 |           2177
                                                       Merging data from memtables and 5 sstables | 05:39:12,434 | 10.1.1.6 |           2192
                                                               Read 2 live and 0 tombstoned cells | 05:39:13,106 | 10.1.1.6 |         673858
                                                                  Enqueuing response to /10.1.1.4 | 05:39:13,106 | 10.1.1.6 |         674163
                                                                     Sending message to /10.1.1.4 | 05:39:13,107 | 10.1.1.6 |         674329
                                                                  Message received from /10.1.1.6 | 05:39:13,107 | 10.1.1.4 |         676882
                                                               Processing response from /10.1.1.6 | 05:39:13,107 | 10.1.1.4 |         677118
                                                                                 Request complete | 05:39:13,107 | 10.1.1.4 |         677344



                                                                        activity                  | timestamp    | source   | source_elapsed
--------------------------------------------------------------------------------------------------+--------------+----------+----------------
                                                                               execute_cql3_query | 05:40:41,322 | 10.1.1.4 |              0
 Parsing select * from datachannel_6min where channel_id = 1028 order by time_start desc limit 1; | 05:40:41,322 | 10.1.1.4 |            104
                                                                              Preparing statement | 05:40:41,322 | 10.1.1.4 |            257
                                                                     Sending message to /10.1.1.5 | 05:40:41,322 | 10.1.1.4 |            569
                                                                  Message received from /10.1.1.4 | 05:40:41,324 | 10.1.1.5 |              9
                                             Executing single-partition query on datachannel_6min | 05:40:41,324 | 10.1.1.5 |            401
                                                                     Acquiring sstable references | 05:40:41,324 | 10.1.1.5 |            410
                                                                      Merging memtable tombstones | 05:40:41,324 | 10.1.1.5 |            427
                                                       Bloom filter allows skipping sstable 15658 | 05:40:41,324 | 10.1.1.5 |            451
                                                       Bloom filter allows skipping sstable 15666 | 05:40:41,324 | 10.1.1.5 |            476
                                                       Bloom filter allows skipping sstable 15892 | 05:40:41,324 | 10.1.1.5 |            489
                                                       Bloom filter allows skipping sstable 15749 | 05:40:41,324 | 10.1.1.5 |            503
                                                       Bloom filter allows skipping sstable 15874 | 05:40:41,324 | 10.1.1.5 |            514
                                                       Bloom filter allows skipping sstable 15682 | 05:40:41,324 | 10.1.1.5 |            523
                                          Partition index with 14 entries found for sstable 14918 | 05:40:42,152 | 10.1.1.5 |         828365
                                                Seeking to partition indexed section in data file | 05:40:42,152 | 10.1.1.5 |         828406
                        Skipped 0/7 non-slice-intersecting sstables, included 0 due to tombstones | 05:40:42,152 | 10.1.1.5 |         828422
                                                       Merging data from memtables and 1 sstables | 05:40:42,152 | 10.1.1.5 |         828427
                                                               Read 2 live and 0 tombstoned cells | 05:40:42,300 | 10.1.1.5 |         976825
                                                                  Enqueuing response to /10.1.1.4 | 05:40:42,301 | 10.1.1.5 |         976984
                                                                  Message received from /10.1.1.5 | 05:40:42,301 | 10.1.1.4 |         978829
                                                                     Sending message to /10.1.1.4 | 05:40:42,301 | 10.1.1.5 |         977105
                                                               Processing response from /10.1.1.5 | 05:40:42,301 | 10.1.1.4 |         979018
                                                                                 Request complete | 05:40:42,301 | 10.1.1.4 |         979239

Here is the history of my cluster and the mistakes I ran into.

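  • Installed 3 nodes in a virtual network in the Azure West Europe data center. I started the service that logs from an API into Cassandra (about 10 writes/s). I started a second service that uses the added data to calculate new data (this is where the select above is used).
  • Moved old data (500 million rows in MSSQL) to Cassandra. This ran in parallel with my services for about 3 days.
  • [Mistake] The disks filled up. I made the stupid mistake of forgetting to add a separate disk for the data. I attached 4 disks to each machine and "merged" them into one (http://blog.metricshub.com/2012/12/27/running-cassandra-on-azure-step-by-step-gotcha-by-gotcha/). I moved the log and data directories to the new disk on all three nodes. Two of the nodes came back fine, but the third I had to wipe completely (deleting data and logs). My replication factor is 2, so no data was lost. I ran nodetool repair on the "new" node.
  • When I started querying the cluster again, I noticed that my selects were inconsistent. If I ran the query in DataStax DevCenter I got no results, but after 3-5 tries I got a full response. I changed my queries to use QUORUM instead of ONE, which seems to have fixed that problem.
  • I also ran nodetool cleanup on the two good nodes.
  • Finally, I ran nodetool repair on one of the good nodes and am now running it on the last node as well (a run takes about 1 day).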
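
For reference, the switch from ONE to QUORUM described above can be tried interactively. This is a minimal sketch that assumes cqlsh is used instead of DevCenter; `CONSISTENCY` is a cqlsh shell command rather than part of CQL itself.

```cql
-- cqlsh only: set the consistency level for the statements that follow
CONSISTENCY QUORUM;

SELECT * FROM datachannel_6min
WHERE channel_id = 1028
ORDER BY time_start DESC
LIMIT 1;
```

QUORUM requires floor(RF / 2) + 1 replicas to answer. With a replication factor of 2 that is 2, so every read waits for both replicas. This likely explains why the empty results disappeared: at ONE, a read can be answered by the wiped replica alone until repair has restored its data.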

【Comments】:

    Tags: cassandra


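    【Answer 1】:

    I have two suggestions:

    • If you are always going to query for the most recent row (order by time_start desc limit 1;), then you should consider specifying a DESCending sort direction in your CLUSTERING ORDER. This will be faster than clustering in ASCending order but querying in DESCending order.

    • Probably the biggest problem I see here is that your partition (channel_id = 1028) is spread across multiple SSTable files. Since your data appears to be a time series, you could try DateTieredCompactionStrategy. DateTieredCompactionStrategy groups data on disk by timestamp. In theory, this should confine your queries to a small number of SSTable files (possibly even a single one), especially if you only need the most recent row.

    I would DROP the table (you cannot ALTER the CLUSTERING ORDER), re-create it like this, and reload the data: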

    CREATE TABLE datachannel_6min (
      channel_id int,
      time_start timestamp,
      power_avg float,
      power_min float,
      power_max float,
      energy float,
      temperature_in float,
      PRIMARY KEY (channel_id, time_start)
    ) WITH CLUSTERING ORDER BY (time_start DESC)
    AND COMPACTION = {'class': 'DateTieredCompactionStrategy'};
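
    As a rough sketch of how queries would look against the table definition above: the date literals below are made-up illustration values, not taken from the question. With a descending clustering order, the latest-row query no longer needs an ORDER BY, and a time-span read in ascending order is still possible by asking for the reverse of the clustering order explicitly.

    ```cql
    -- Latest row: rows are already stored newest-first within the partition
    SELECT * FROM datachannel_6min
    WHERE channel_id = 1028
    LIMIT 1;

    -- A time span, returned oldest-first (reverse of the clustering order)
    SELECT * FROM datachannel_6min
    WHERE channel_id = 1028
      AND time_start >= '2015-06-01 00:00:00+0000'
      AND time_start <  '2015-06-02 00:00:00+0000'
    ORDER BY time_start ASC;
    ```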
    

    DateTieredCompactionStrategy has some options you can set, outlined in the article I linked above. Read through it to make sure the defaults suit you, or adjust them as needed.
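
    Unlike the clustering order, the compaction settings can be changed on an existing table. The following is a sketch only: `base_time_seconds` and `max_sstable_age_days` are DateTieredCompactionStrategy options, but the values shown are placeholders for illustration, not recommendations for this workload.

    ```cql
    -- Change the compaction strategy in place, with explicit (illustrative) options
    ALTER TABLE datachannel_6min
    WITH compaction = {
      'class': 'DateTieredCompactionStrategy',
      'base_time_seconds': '3600',
      'max_sstable_age_days': '365'
    };
    ```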

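    【Comments】:

    • Actually, I will be ordering in the other direction later too. I will be fetching data from time spans, so I don't think I would gain anything from changing the clustering order. But your article was really interesting. My script added the data going "backwards" in time, which may have left the SSTables disorganized. But since the SSTables carry min/max metadata, I still don't see how that explains my 400-1200 ms read times :S. I will read the article over and over. Thanks!
    • I guess changing the table's compaction class won't reorganize the old data, so I am thinking about redoing everything now (clearing all data), this time adding it in the correct order.
    • @Fischer Actually, the next time compaction is triggered, it should use the new strategy for the entire keyspace, including existing data.
    • I'm not sure. I read up on DateTieredCompactionStrategy, and it looks like it has problems with data added out of order or with adding a backlog, because the time of the data point does not match the time the record was added to Cassandra. It sounds really great for new data, but since I am importing data from 3 years ago...
    • @Fischer Actually, you are right. The data you import will be timestamped with the import date, not with time_start. But going forward, your new data will be written and compacted together.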
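
    One possible way around the import-date problem raised in the last two comments is to set the write timestamp explicitly during the backfill. This is a sketch and not something proposed in the thread: the row values are made up, and the timestamp is simply time_start expressed in microseconds since the epoch.

    ```cql
    -- Backfill one historical row, using time_start as the write timestamp
    -- 2012-06-23 21:06:00 UTC = 1340485560 s since epoch = 1340485560000000 microseconds
    INSERT INTO datachannel_6min
      (channel_id, time_start, power_avg, power_min, power_max, energy, temperature_in)
    VALUES
      (1028, '2012-06-23 21:06:00+0000', 1.5, 1.0, 2.0, 0.15, 21.5)
    USING TIMESTAMP 1340485560000000;
    ```

    Keep in mind that Cassandra resolves conflicting writes by this timestamp, so a row written this way will not overwrite an existing cell that carries a newer timestamp.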