【Question title】: Cassandra time series data, limit the number of disk seeks
【Posted】: 2016-07-09 03:57:53
【Question】:

I am running into a problem with too many disk seeks when querying.

For a query like this:

SELECT sessionid
FROM sessionlink
WHERE linktype = 'host'
  AND link = 'webserver1'
  AND timestamp > minTimeuuid('2016-07-02 09:00:00')
  AND timestamp < maxTimeuuid('2016-07-02 10:00:00');

on a table like this:

CREATE TABLE logs.sessionlink (
    linktype text,
    link text,
    "timestamp" timeuuid,
    sessionid text,
    PRIMARY KEY ((linktype, link), "timestamp", sessionid)
);

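I am hitting 120 SSTables. Why does the query seek into so many SSTables on disk, rather than only the one that contains data for the given date? I checked the SSTables manually with sstablemetadata and found that only a single SSTable contains the date being queried (looking at the min and max timestamps).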

The data is inserted with:

INSERT INTO sessionlink(sessionid, linktype, link, timestamp)
VALUES (?, ?, ?, now()) USING TTL ?;

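The table is compacted using Date Tiered Compaction.

After reading the Spotify blog post on Date Tiered Compaction (https://labs.spotify.com/2014/12/18/date-tiered-compaction/) and other resources, I was under the impression that it gives fast reads on time series data, because the SSTables are ordered in time, which minimizes the number of disk reads. However, that is not what I am seeing on my cluster.

Here is the trace of the query (with some of the key cache hit and seek rows removed):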

  activity                                                                                        | timestamp                  | source        | source_elapsed
 -------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
                                                                               Execute CQL3 query | 2016-07-08 12:06:17.333000 | 192.168.4.184 |              0
                                             Parsing "The query from above" [SharedPool-Worker-6] | 2016-07-08 12:06:17.335000 | 192.168.4.184 |             72
             READ message received from /192.168.4.184 [MessagingService-Incoming-/192.168.4.184] | 2016-07-08 12:06:17.335000 | 192.168.4.186 |             23
                                                        Preparing statement [SharedPool-Worker-6] | 2016-07-08 12:06:17.336000 | 192.168.4.184 |            236
                            Executing single-partition query on sessionlink [SharedPool-Worker-1] | 2016-07-08 12:06:17.336000 | 192.168.4.186 |             80
                                           reading data from /192.168.4.186 [SharedPool-Worker-6] | 2016-07-08 12:06:17.337000 | 192.168.4.184 |           1224
                                               Acquiring sstable references [SharedPool-Worker-1] | 2016-07-08 12:06:17.337000 | 192.168.4.186 |             85
                Sending READ message to /192.168.4.186 [MessagingService-Outgoing-/192.168.4.186] | 2016-07-08 12:06:17.338000 | 192.168.4.184 |           1304
                                                Merging memtable tombstones [SharedPool-Worker-1] | 2016-07-08 12:06:17.338000 | 192.168.4.186 |            371
                                             Key cache hit for sstable 1168 [SharedPool-Worker-1] | 2016-07-08 12:06:17.339000 | 192.168.4.186 |           1003
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.339000 | 192.168.4.186 |           1007
                                             Key cache hit for sstable 1167 [SharedPool-Worker-1] | 2016-07-08 12:06:17.339000 | 192.168.4.186 |           1029
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.343000 | 192.168.4.186 |           1031
                                             Key cache hit for sstable 1278 [SharedPool-Worker-1] | 2016-07-08 12:06:17.343000 | 192.168.4.186 |           1965
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.343000 | 192.168.4.186 |           1968
                                             Key cache hit for sstable 1277 [SharedPool-Worker-1] | 2016-07-08 12:06:17.344000 | 192.168.4.186 |           1982
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.344000 | 192.168.4.186 |           1984
                                             Key cache hit for sstable 1276 [SharedPool-Worker-1] | 2016-07-08 12:06:17.344000 | 192.168.4.186 |           1996
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.345000 | 192.168.4.186 |           1999
                                             Key cache hit for sstable 1275 [SharedPool-Worker-1] | 2016-07-08 12:06:17.345000 | 192.168.4.186 |           2011
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.345000 | 192.168.4.186 |           2013
                                             Key cache hit for sstable 1274 [SharedPool-Worker-1] | 2016-07-08 12:06:17.346000 | 192.168.4.186 |           2026
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.346000 | 192.168.4.186 |           2027
                                             Key cache hit for sstable 1273 [SharedPool-Worker-1] | 2016-07-08 12:06:17.346000 | 192.168.4.186 |           2040
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.346000 | 192.168.4.186 |           2041
                                             Key cache hit for sstable 1272 [SharedPool-Worker-1] | 2016-07-08 12:06:17.347000 | 192.168.4.186 |           2053
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.347000 | 192.168.4.186 |           2078
                                             Key cache hit for sstable 1271 [SharedPool-Worker-1] | 2016-07-08 12:06:17.347000 | 192.168.4.186 |           2090
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.348000 | 192.168.4.186 |           2092
                                                                                    ... removed some rows to make it fit ...
 REQUEST_RESPONSE message received from /192.168.4.186 [MessagingService-Incoming-/192.168.4.186] | 2016-07-08 12:06:17.402000 | 192.168.4.184 |          68884
                                    Processing response from /192.168.4.186 [SharedPool-Worker-2] | 2016-07-08 12:06:17.403000 | 192.168.4.184 |          68953
                                             Key cache hit for sstable 1188 [SharedPool-Worker-1] | 2016-07-08 12:06:17.403000 | 192.168.4.186 |          17630
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.403000 | 192.168.4.186 |          17632
                                             Key cache hit for sstable 1187 [SharedPool-Worker-1] | 2016-07-08 12:06:17.403000 | 192.168.4.186 |          17647
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.404000 | 192.168.4.186 |          17649
                                             Key cache hit for sstable 1186 [SharedPool-Worker-1] | 2016-07-08 12:06:17.404000 | 192.168.4.186 |          17665
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.404000 | 192.168.4.186 |          17666
                                             Key cache hit for sstable 1185 [SharedPool-Worker-1] | 2016-07-08 12:06:17.404000 | 192.168.4.186 |          17884
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.405000 | 192.168.4.186 |          17886
                                             Key cache hit for sstable 1184 [SharedPool-Worker-1] | 2016-07-08 12:06:17.405000 | 192.168.4.186 |          18098
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.405000 | 192.168.4.186 |          18100
                                             Key cache hit for sstable 1183 [SharedPool-Worker-1] | 2016-07-08 12:06:17.405000 | 192.168.4.186 |          18314
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.406000 | 192.168.4.186 |          18317
                                             Key cache hit for sstable 1182 [SharedPool-Worker-1] | 2016-07-08 12:06:17.406000 | 192.168.4.186 |          18537
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.406000 | 192.168.4.186 |          18540
                                             Key cache hit for sstable 1181 [SharedPool-Worker-1] | 2016-07-08 12:06:17.406000 | 192.168.4.186 |          18754
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.407000 | 192.168.4.186 |          18756
                                             Key cache hit for sstable 1180 [SharedPool-Worker-1] | 2016-07-08 12:06:17.407000 | 192.168.4.186 |          18971
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.407000 | 192.168.4.186 |          18974
                                             Key cache hit for sstable 1179 [SharedPool-Worker-1] | 2016-07-08 12:06:17.408000 | 192.168.4.186 |          19212
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.408000 | 192.168.4.186 |          19215
                                             Key cache hit for sstable 1178 [SharedPool-Worker-1] | 2016-07-08 12:06:17.408000 | 192.168.4.186 |          19434
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.408000 | 192.168.4.186 |          19437
                                             Key cache hit for sstable 1177 [SharedPool-Worker-1] | 2016-07-08 12:06:17.409000 | 192.168.4.186 |          19656
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.409000 | 192.168.4.186 |          19658
                                             Key cache hit for sstable 1176 [SharedPool-Worker-1] | 2016-07-08 12:06:17.409000 | 192.168.4.186 |          19873
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.409001 | 192.168.4.186 |          19875
                                             Key cache hit for sstable 1175 [SharedPool-Worker-1] | 2016-07-08 12:06:17.410000 | 192.168.4.186 |          20093
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.410000 | 192.168.4.186 |          20096
                                             Key cache hit for sstable 1174 [SharedPool-Worker-1] | 2016-07-08 12:06:17.410000 | 192.168.4.186 |          20309
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.410000 | 192.168.4.186 |          20311
                                             Key cache hit for sstable 1173 [SharedPool-Worker-1] | 2016-07-08 12:06:17.410000 | 192.168.4.186 |          20537
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.411000 | 192.168.4.186 |          20539
                                             Key cache hit for sstable 1172 [SharedPool-Worker-1] | 2016-07-08 12:06:17.411000 | 192.168.4.186 |          20762
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.411000 | 192.168.4.186 |          20765
                                             Key cache hit for sstable 1171 [SharedPool-Worker-1] | 2016-07-08 12:06:17.412000 | 192.168.4.186 |          20985
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.412000 | 192.168.4.186 |          20987
                                             Key cache hit for sstable 1170 [SharedPool-Worker-1] | 2016-07-08 12:06:17.412000 | 192.168.4.186 |          21212
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.412000 | 192.168.4.186 |          21214
                                             Key cache hit for sstable 1169 [SharedPool-Worker-1] | 2016-07-08 12:06:17.413000 | 192.168.4.186 |          21230
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.413000 | 192.168.4.186 |          21232
                                             Key cache hit for sstable 1166 [SharedPool-Worker-1] | 2016-07-08 12:06:17.413000 | 192.168.4.186 |          21497
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.414000 | 192.168.4.186 |          21501
                                             Key cache hit for sstable 1165 [SharedPool-Worker-1] | 2016-07-08 12:06:17.414000 | 192.168.4.186 |          21928
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.414000 | 192.168.4.186 |          21930
                                             Key cache hit for sstable 1164 [SharedPool-Worker-1] | 2016-07-08 12:06:17.414001 | 192.168.4.186 |          22476
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.415000 | 192.168.4.186 |          22479
                                             Key cache hit for sstable 1163 [SharedPool-Worker-1] | 2016-07-08 12:06:17.415000 | 192.168.4.186 |          22494
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.415000 | 192.168.4.186 |          22496
                                             Key cache hit for sstable 1162 [SharedPool-Worker-1] | 2016-07-08 12:06:17.415000 | 192.168.4.186 |          23103
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.415000 | 192.168.4.186 |          23107
                                             Key cache hit for sstable 1161 [SharedPool-Worker-1] | 2016-07-08 12:06:17.416000 | 192.168.4.186 |          23124
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.416000 | 192.168.4.186 |          23126
                                             Key cache hit for sstable 1160 [SharedPool-Worker-1] | 2016-07-08 12:06:17.416000 | 192.168.4.186 |          23753
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.417000 | 192.168.4.186 |          23757
                                             Key cache hit for sstable 1159 [SharedPool-Worker-1] | 2016-07-08 12:06:17.417000 | 192.168.4.186 |          23771
                          Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-07-08 12:06:17.417000 | 192.168.4.186 |          23773
                                                                                 Request complete | 2016-07-08 12:06:17.402317 | 192.168.4.184 |          69317

【Comments on the question】:

Tags: cassandra


【Answer 1】:

https://academy.datastax.com/resources/getting-started-time-series-data-modeling

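This is a good example of working with dates in Cassandra. I believe the problem lies in how you modeled your table. If you move the timestamp into the partition key, Cassandra will have a better idea of which SSTables to search. If the data is only partitioned by (linktype, link), it has to search every place where data for that partition is stored.

In short, I suggest recreating your table with a primary key of ((linktype, link, "timestamp"), sessionid).

【Comments】: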

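  • Because "Only EQ and IN relation are supported on the partition key", I cannot run any slice queries on the timestamp with the schema you suggest. However, I have read the article, and I am pretty much following sample pattern 1, except that my partition key is composite. Does that have anything to do with my problem? Should I expect better performance if I had only a single-value id as the partition key? Like this: (linkid, "timestamp", sessionid)
  • Quoting the article: a range query looks for data between two dates. This is also known as a slice, since it reads a range of data from disk: SELECT temperature FROM temperature WHERE weatherstation_id='1234ABCD' AND event_time > '2013-04-03 07:01:00' AND event_time […] The way I read this, it looks like you can slice on a date range.
  • In reply to your second question: I don't think a single-value partition key would perform any better; you would just be storing less data.
  • Yes, you can run slice queries on a date. But not if that date is part of the partition key.
  • @SimonLindberg, that is what I am suggesting. Rewrite the partition key to include your date.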
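
The two positions in the comments above (put the date in the partition key, but keep range queries on the timestamp) are commonly reconciled by putting a coarse time bucket in the partition key instead of the raw timeuuid. The sketch below is only an illustration under stated assumptions: the `hour_bucket` column, the hourly granularity, and the `hour_buckets` helper are hypothetical and appear neither in the original schema nor in either poster's suggestion. It computes which buckets a time range touches, so that each bucket can be queried with an equality restriction while the timeuuid range stays on the clustering column.

```python
from datetime import datetime, timedelta

# Hypothetical schema this helper is meant for (not from the original post):
#   PRIMARY KEY ((linktype, link, hour_bucket), "timestamp", sessionid)
# where hour_bucket is a text column such as '2016-07-02-09'.


def hour_buckets(start, end):
    """Return the hourly bucket labels that overlap the range [start, end).

    Both arguments are naive datetimes assumed to be in the same time zone
    the writer used when it filled in hour_bucket.
    """
    if end <= start:
        return []
    # Round the start down to the beginning of its hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    labels = []
    while current < end:
        labels.append(current.strftime("%Y-%m-%d-%H"))
        current += timedelta(hours=1)
    return labels


if __name__ == "__main__":
    # The one-hour window from the question maps to a single bucket.
    print(hour_buckets(datetime(2016, 7, 2, 9, 0), datetime(2016, 7, 2, 10, 0)))
```

With a layout like this, a read for one hour addresses one partition, so SSTables that hold no data for that partition can usually be skipped without a seek. The trade-off is that a range spanning several hours needs one query per bucket, and the writer has to fill in the bucket column on every insert.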