【Question title】: The cost decreased but the time was practically the same, why?
【Posted】: 2019-05-02 16:28:33
【Question】:

Table definition:

CREATE TABLE schema.mylogoperation (
  id_mylogoperation serial,
  data DATE,
  myschema VARCHAR(255),
  column_var_2 VARCHAR(255),
  user VARCHAR(255),
  action TEXT,
  column_var_1 TEXT,
  log_old VARCHAR,
  log_new VARCHAR,
  constraint pk_mylogoperation primary key (id_mylogoperation)
) 
WITH (oids = false);
  • 12 million rows

I ran EXPLAIN ANALYZE on this query:

explain analyze
SELECT
    column_var_1,
    column_var_2
    column_var_3,
    user,
    action,
    data,
    log_old,
    log_new
FROM schema.mylogoperation
WHERE
    myschema = 'schema'
AND column_var_2 IN ('mydata1', 'mydata2', 'mydata3')
AND log_old <> log_new
AND column_var_1 LIKE 'mydata%';

Indexes (pk_mylogoperation only)
QUERY PLAN
Seq Scan on myschema  (cost=0.00..713948.14 rows=660 width=222) (actual time=380.308..4467.364 rows=48 loops=1)
  Filter: (((log_old)::text <> (log_new)::text) AND (column_var_1 ~~ 'mydata%'::text) AND ((schema)::text = 'schema'::text) AND ((column_var_2)::text = ANY ('{mydata1,mydata2,mydata3}'::text[])))
  Rows Removed by Filter: 12525296
Total runtime: 4467.425 ms

Then I tried creating an index to get better performance:

CREATE INDEX idx_mylogoperation_1    ON schema.mylogoperation (myschema, column_var_2);

reindex table schema.mylogoperation;
analyze schema.mylogoperation;

pk_mylogoperation + idx_mylogoperation_1
QUERY PLAN
Index Scan using idx_mylogoperation_qry1 on mylogoperation  (cost=0.56..589836.84 rows=658 width=223) (actual time=331.679..4997.507 rows=48 loops=1)
  Index Cond: (((myschema)::text = 'schema'::text) AND ((column_var_2)::text = ANY ('{mydata1,mydata2,mydata3}'::text[])))
  Filter: (((log_old)::text <> (log_new)::text) AND (column_var_1 ~~ 'mydata%'::text))
  Rows Removed by Filter: 7441986
Total runtime: 4997.580 ms

Then I tried creating another index to get better performance:

CREATE INDEX idx_mylogoperation_2 ON schema.mylogoperation USING gin (column_var_1 gin_trgm_ops);

reindex table schema.mylogoperation;
analyze schema.mylogoperation;

pk_mylogoperation + idx_mylogoperation_1 + idx_mylogoperation_2
QUERY PLAN
Bitmap Heap Scan on idx_mylogoperation_var_1  (cost=1398.58..2765.08 rows=663 width=222) (actual time=5303.481..5303.906 rows=48 loops=1)
  Recheck Cond: (column_var_1 ~~ 'mydata%'::text)
  Filter: (((log_old)::text <> (log_new)::text) AND ((myschema)::text = 'schema'::text) AND ((column_var_2)::text = ANY ('{mydata1,mydata2,mydata3}'::text[])))
  Rows Removed by Filter: 248
  ->  Bitmap Index Scan on idx_mylogoperation_var_1  (cost=0.00..1398.41 rows=1215 width=0) (actual time=5303.203..5303.203 rows=296 loops=1)
        Index Cond: (column_var_1 ~~ 'mydata%'::text)
Total runtime: 5303.950 ms

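Question

  • The cost decreased, but the time is practically the same. Why?

Notes

  • I don't want to change the SELECT statement, only the database structure.

  • This test was run on a server that is in use. Is it worthwhile to create these indexes, or is it better not to use them?

  • I'm using Postgres 9.3.22 on 64-bit Red Hat Linux.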

【Comments】:

    Tags: postgresql performance indexing


    【Answer 1】:

    This index:

    CREATE INDEX idx_mylogoperation_1    ON schema.mylogoperation (myschema, column_var_2);
    

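    doesn't help, because the part of your WHERE clause that it covers matches ~2/3 of the table. The index doesn't narrow the result down much, but the filter does: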

    Filter: (((log_old)::text <> (log_new)::text) AND (column_var_1 ~~ 'mydata%'::text))
      Rows Removed by Filter: 7441986
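
    To see how much each of the two filter conditions narrows the result, one could count them separately over the rows that the index condition matches. This is only a sketch that reuses the table and column names from the question; the output column names are made up for illustration. It uses SUM(CASE ...) rather than the FILTER clause, since FILTER is not available on 9.3. It still reads a large share of the table, so it may be worth running off-peak on a server that is in use.

    ```sql
    -- Count the rows matched by the index condition, and how many of them
    -- survive each filter condition on its own.
    -- The aliases (rows_matching_index_cond, ...) are illustrative only.
    SELECT
        count(*) AS rows_matching_index_cond,
        sum(CASE WHEN log_old <> log_new THEN 1 ELSE 0 END) AS rows_log_changed,
        sum(CASE WHEN column_var_1 LIKE 'mydata%' THEN 1 ELSE 0 END) AS rows_prefix_match
    FROM schema.mylogoperation
    WHERE myschema = 'schema'
      AND column_var_2 IN ('mydata1', 'mydata2', 'mydata3');
    ```

    The smaller of the last two counts points to the condition that is the better candidate for an index or an index predicate.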
    

    I'm not sure which of the two conditions in the filter removes more rows, but you could try a partial index, for example:

    CREATE INDEX idx_mylogoperation_1    ON schema.mylogoperation (myschema, column_var_2) WHERE log_old <> log_new;
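
    A sketch of how such a partial index could be tried out and checked, under a few assumptions: the name idx_mylogoperation_partial is hypothetical (picked so it does not clash with the existing idx_mylogoperation_1), and the SELECT is a cut-down version of the query from the question. The planner only considers a partial index when the query's WHERE clause implies the index predicate, so log_old <> log_new has to stay in the query as written. BUFFERS is an optional EXPLAIN setting that reports how many pages were read, which can help explain cases where the estimated cost drops but the run time does not.

    ```sql
    -- Hypothetical index name; CONCURRENTLY avoids blocking writes on a busy
    -- server, but it cannot be run inside a transaction block.
    CREATE INDEX CONCURRENTLY idx_mylogoperation_partial
        ON schema.mylogoperation (myschema, column_var_2)
        WHERE log_old <> log_new;

    -- Refresh planner statistics after creating the index.
    ANALYZE schema.mylogoperation;

    -- Check whether the plan now uses the partial index and how many
    -- pages it touches.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT id_mylogoperation, column_var_1, column_var_2, log_old, log_new
    FROM schema.mylogoperation
    WHERE myschema = 'schema'
      AND column_var_2 IN ('mydata1', 'mydata2', 'mydata3')
      AND log_old <> log_new
      AND column_var_1 LIKE 'mydata%';
    ```

    If the plan still shows a large "Rows Removed by Filter", the remaining LIKE condition is doing most of the narrowing, and this index alone is unlikely to help much.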
    

    【Comments】:
