Answers to: Optimize query with COUNT(*)

【Question title】: Optimize query with COUNT(*)
【Posted】: 2016-08-17 18:57:57
【Question description】:

My table has 145821312 rows :/

CREATE TABLE core.conversations
(
  user_id integer NOT NULL,
  user_to_id integer NOT NULL,
  last_message_timestamp timestamp with time zone NOT NULL,
  body text NOT NULL,
  status core.message_status DEFAULT 'unread'::core.message_status,
  my_message boolean NOT NULL,
  delete_message_timestamp timestamp with time zone,
  deleted boolean NOT NULL DEFAULT false,
  CONSTRAINT userid_usertiid UNIQUE (, )
)
WITH (
  OIDS=FALSE
);

I have this query:

SELECT COUNT(*) FROM core.conversations WHERE user_id=xxxx AND status='unread' AND deleted='false';

EXPLAIN ANALYZE shows:

"Aggregate  (cost=930.17..930.18 rows=1 width=0) (actual time=0.027..0.027 rows=1 loops=1)"
"  ->  Index Only Scan using useridunreaddeleted_idx on conversations  (cost=0.57..929.59 rows=229 width=0) (actual time=0.019..0.019 rows=0 loops=1)"
"        Index Cond: ((user_id = 123123) AND (status = 'unread'::core.message_status) AND (deleted = false))"
"        Filter: (NOT deleted)"
"        Heap Fetches: 0"
"Planning time: 0.239 ms"
"Execution time: 0.130 ms"

The index:

CREATE INDEX useridunreaddeleted_idx
  ON core.conversations
  USING btree
  (user_id, status, deleted);

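Is there any way to optimize this query? Some other index type? It is a very simple query, but I know there is a lot of data in the table ;/ Or should I maintain some kind of aggregate to get this counter...

EDIT: I changed the query; the earlier version was wrong because it had no COUNT(*), sorry.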

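【Comments】:

"Execution time: 0.066 ms" Your query takes far less than a second, and you want to speed it up? o.0
It is cached now... In production this query takes about 5-15 seconds. And there are "heavier" users, with many more unread messages...
5-15 seconds may be the best you can get. That said, you could try a partial index, e.g. CREATE INDEX ON core.conversations (user_id) WHERE status = 'unread'::core.message_status AND NOT deleted. If nothing else, this should reduce your index size.
I would swap the status and deleted columns in the index (or even put status first, since it has non-null values and higher selectivity). Of course, that only helps if the statuses are actually well distributed.
If it is the performance of the query in production that you care about, it would make sense to provide that execution plan.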
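
The partial index suggested in the comments could be tried roughly as follows. This is only a sketch: the index name conversations_unread_user_id_idx is made up for illustration, and whether it beats the existing three-column index on the production data would have to be measured there.

```sql
-- Hypothetical index name; the predicate is the one suggested in the comments.
-- CONCURRENTLY avoids blocking writes while the index is built,
-- but it cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY conversations_unread_user_id_idx
  ON core.conversations
  USING btree
  (user_id)
  WHERE status = 'unread'::core.message_status AND NOT deleted;

-- Re-check the plan on the production system; BUFFERS shows how many
-- pages came from shared buffers versus being read in.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM core.conversations
WHERE user_id = 123123
  AND status = 'unread'
  AND NOT deleted;
```

Because the index only contains rows that are unread and not deleted, it should be considerably smaller than an index over all rows, which makes it easier to keep in memory.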

Tags: sql performance postgresql database-performance

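【Solution 1】:

It would be interesting to see the EXPLAIN (ANALYZE) output for the query on the production system.

From the database side you probably cannot do much better than that index. I would say that to improve performance you have to improve the hardware:

Lots of RAM - enough to cache at least the index, and ideally the whole table.
A fast I/O subsystem.

【Comments】: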
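
To judge whether the index, or even the whole table, could fit in memory, the on-disk sizes can be compared with the available RAM. A minimal sketch, assuming the index lives in the core schema alongside its table:

```sql
-- Size of the index used by the query, of the table itself,
-- and of the table together with all of its indexes.
SELECT pg_size_pretty(pg_relation_size('core.useridunreaddeleted_idx')) AS index_size,
       pg_size_pretty(pg_table_size('core.conversations'))              AS table_size,
       pg_size_pretty(pg_total_relation_size('core.conversations'))     AS table_plus_indexes;

-- How much memory PostgreSQL itself reserves for caching pages.
SHOW shared_buffers;
```

If the index alone is larger than the memory available for caching, uncached lookups have to go to disk, which would be consistent with the large gap between the cached and production timings reported in the question.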

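We have 3 instances - something like xlarge Amazon EC2. One master, two slaves. And HDD, NOT SSD ;/
Well, maybe that is your problem.
I checked: each one has 8 CPUs and 16 GB of RAM.
Without parallel query, 8 CPUs will not help you.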

【Solution 2】:

What about partitioning? Would it help?

CREATE TABLE core.conversations_0
(
  CONSTRAINT conversations_0_user_id_check CHECK ((user_id % 1000) = 0)
) INHERITS (core.conversations);
...
CREATE TABLE core.conversations_1000
(    
  CONSTRAINT conversations_1000_user_id_check CHECK ((user_id % 1000) = 999)
) INHERITS (core.conversations);

And the indexes:

CREATE INDEX conversations_0_user_id_idx
  ON core.conversations_0
  USING btree
  (user_id)
  WHERE status = 'unread'::core.message_status AND NOT deleted;
...
CREATE INDEX conversations_1000_user_id_idx
  ON core.conversations_1000
  USING btree
  (user_id)
  WHERE status = 'unread'::core.message_status AND NOT deleted;

However, I still do not know why this query scans all of the indexes:

SET constraint_exclusion = on;
EXPLAIN  ANALYZE SELECT COUNT(*) as cnt FROM core.conversations WHERE user_id=12728967 AND status='unread' AND deleted='false';

"Aggregate  (cost=89.77..89.78 rows=1 width=0) (actual time=0.035..0.035 rows=1 loops=1)"
"  ->  Append  (cost=0.28..89.75 rows=11 width=0) (actual time=0.032..0.032 rows=0 loops=1)"
"        ->  Index Scan using conversations_user_id_idx on conversations  (cost=0.28..8.30 rows=1 width=0) (actual time=0.013..0.013 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_0_user_id_idx on conversations_0  (cost=0.12..8.14 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_1_user_id_idx on conversations_1  (cost=0.12..8.14 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_2_user_id_idx on conversations_2  (cost=0.12..8.14 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_3_user_id_idx on conversations_3  (cost=0.12..8.14 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_4_user_id_idx on conversations_4  (cost=0.12..8.14 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_5_user_id_idx on conversations_5  (cost=0.12..8.14 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_6_user_id_idx on conversations_6  (cost=0.12..8.14 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_7_user_id_idx on conversations_7  (cost=0.12..8.14 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_8_user_id_idx on conversations_8  (cost=0.12..8.14 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"        ->  Index Scan using conversations_9_user_id_idx on conversations_9  (cost=0.12..8.14 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"Planning time: 1.229 ms"
"Execution time: 0.133 ms"

EDIT:

I added 'user_id%10 = 7' to the query:

SET constraint_exclusion = on;
EXPLAIN  ANALYZE SELECT COUNT(*) as cnt FROM core.conversations WHERE user_id=12728967 AND user_id%10 = 7 AND status='unread' AND deleted='false';

"Aggregate  (cost=16.46..16.47 rows=1 width=0) (actual time=0.019..0.019 rows=1 loops=1)"
"  ->  Append  (cost=0.28..16.45 rows=2 width=0) (actual time=0.015..0.015 rows=0 loops=1)"
"        ->  Index Scan using conversations_user_id_idx on conversations  (cost=0.28..8.30 rows=1 width=0) (actual time=0.013..0.013 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"              Filter: ((user_id % 10) = 7)"
"        ->  Index Scan using conversations_7_user_id_idx on conversations_7  (cost=0.12..8.15 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)"
"              Index Cond: (user_id = 12728967)"
"              Filter: ((user_id % 10) = 7)"
"Planning time: 0.950 ms"
"Execution time: 0.070 ms"

But it also scans the big index on the "mother" (parent) table ;/

【Comments】:
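
A likely explanation for the two plans above: constraint exclusion only compares the WHERE clause with each child's CHECK constraint, and the planner cannot deduce that user_id % 10 = 7 follows from user_id = 12728967. That is why every child is scanned until the modulo condition is spelled out in the query. The parent table has no CHECK constraint, so it can never be excluded; keeping it empty at least makes that scan cheap. CHECK constraints written as plain ranges on user_id can be matched directly against user_id = constant. The sketch below illustrates this; the table names and range boundaries are hypothetical and are not part of the original setup.

```sql
-- Hypothetical range-based children; names and boundaries are illustrative only.
CREATE TABLE core.conversations_r0
(
  CONSTRAINT conversations_r0_user_id_check
    CHECK (user_id >= 0 AND user_id < 10000000)
) INHERITS (core.conversations);

CREATE TABLE core.conversations_r1
(
  CONSTRAINT conversations_r1_user_id_check
    CHECK (user_id >= 10000000 AND user_id < 20000000)
) INHERITS (core.conversations);

-- 'partition' is the default setting and covers inheritance children.
SET constraint_exclusion = partition;

-- No extra modulo predicate is needed here: the range constraint of
-- conversations_r0 contradicts user_id = 12728967, so that child can be skipped.
EXPLAIN
SELECT COUNT(*) AS cnt
FROM core.conversations
WHERE user_id = 12728967
  AND status = 'unread'
  AND NOT deleted;
```

With this layout the plan would be expected to contain only the parent and conversations_r1, but that should be verified with EXPLAIN on the actual PostgreSQL version in use.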