【Question title】: Index exists but sorting still takes time
【Posted】: 2019-11-11 21:55:57
【Question】:

I am working on a query that is already used in one of our services. The query is:

select  rg.reservationGuestId, rg.reservationId, rg.checkindate,
        rg.checkoutdate, rg.assignedroom , tr.orderDate, tr.assignedDate,
        cast(tr.orderId as char(36)) , g.titleCode, g.firstName,
        g.middleName, g.lastName, g.birthDate, g.genderCode, g.email,
        g.preferredLanguage , cast(a.addressId as char(36)), a.addressTypeCode,
        a.line1, a.line2, a.city, a.state, a.countryCode as aCountryCode,
        a.zip , cast(p.phoneId as char(36)), p.phoneTypeCode,
        p.countryCode as pCountryCode, p.areaCode, p.number
    from  reservationguest rg
    inner join  guest g  ON rg.guestid = g.guestid
    inner join  address a  ON g.guestid = a.guestid
    inner join  phone p  ON g.guestid = p.guestid
    left join  orderdetail tr  ON rg.reservationguestid = tr.reservationguestid
    where  ((0 = 0)
              or  (rg.reservationGuestId in (null))
           )
      and  (('2019-05-01' = 'null')
              or  (rg.checkindate >= '2019-05-01')
           )
      and  (('2019-09-08' = 'null')
              or  (rg.checkindate <= '2019-09-08')
           )
      and  ((1 = 0)
              or  (a.addressTypeCode in ('SHIPPING'))
           )
      and  ((1 = 0)
              or  (p.phoneTypeCode in ('HOME'))
           )
      and  (('' = 'null')
              or  (('' = '')
                      and  (tr.orderDate is null)
                  )
              or  (tr.orderDate = '2019-06-02 00:00:00')
           )
    order by  rg.checkindate, rg.lastmodifieddate

The query above takes nearly 1150 ms to fetch 161500 records.

Here is the execution plan for this query:

Sort  (cost=21727.93..21732.95 rows=2009 width=683) (actual time=928.206..1117.145 rows=161500 loops=1)
  Sort Key: rg.checkindate, rg.lastmodifieddate
  Sort Method: external merge  Disk: 55936kB
  ->  Hash Right Join  (cost=15262.53..21617.71 rows=2009 width=683) (actual time=267.553..576.902 rows=161500 loops=1)
        Hash Cond: ((tr.reservationguestid)::text = (rg.reservationguestid)::text)
        Filter: ((tr.orderdate IS NULL) OR (tr.orderdate = '2019-06-02 00:00:00'::timestamp without time zone))
        Rows Removed by Filter: 252112
        ->  Seq Scan on orderdetail tr  (cost=0.00..6047.00 rows=66800 width=69) (actual time=0.018..36.367 rows=66887 loops=1)
        ->  Hash  (cost=15210.52..15210.52 rows=4161 width=255) (actual time=266.789..266.789 rows=18521 loops=1)
              Buckets: 16384 (originally 8192)  Batches: 2 (originally 1)  Memory Usage: 3969kB
              ->  Nested Loop  (cost=5302.72..15210.52 rows=4161 width=255) (actual time=62.445..248.868 rows=18521 loops=1)
                    ->  Hash Join  (cost=5302.30..6748.12 rows=3322 width=258) (actual time=62.378..83.816 rows=6762 loops=1)
                          Hash Cond: ((p.guestid)::text = (g.guestid)::text)
                          ->  Bitmap Heap Scan on phone p  (cost=263.88..1624.42 rows=13883 width=70) (actual time=1.482..13.057 rows=13909 loops=1)
                                Recheck Cond: ((phonetypecode)::text = 'HOME'::text)
                                Heap Blocks: exact=1186
                                ->  Bitmap Index Scan on ix_phone_phonetypecode  (cost=0.00..260.41 rows=13883 width=0) (actual time=1.315..1.315 rows=13909 loops=1)
                                      Index Cond: ((phonetypecode)::text = 'HOME'::text)
                          ->  Hash  (cost=4952.89..4952.89 rows=6842 width=188) (actual time=60.860..60.860 rows=6811 loops=1)
                                Buckets: 8192  Batches: 1  Memory Usage: 1664kB
                                ->  Hash Join  (cost=1774.72..4952.89 rows=6842 width=188) (actual time=28.954..56.879 rows=6811 loops=1)
                                      Hash Cond: ((a.guestid)::text = (g.guestid)::text)
                                      ->  Bitmap Heap Scan on address a  (cost=137.45..3221.97 rows=6842 width=100) (actual time=1.174..22.254 rows=6811 loops=1)
                                            Recheck Cond: ((addresstypecode)::text = 'SHIPPING'::text)
                                            Heap Blocks: exact=2290
                                            ->  Bitmap Index Scan on ix_address_addresstypecode  (cost=0.00..135.73 rows=6842 width=0) (actual time=0.877..0.877 rows=6811 loops=1)
                                                  Index Cond: ((addresstypecode)::text = 'SHIPPING'::text)
                                      ->  Hash  (cost=1279.90..1279.90 rows=28590 width=88) (actual time=27.704..27.704 rows=28590 loops=1)
                                            Buckets: 32768  Batches: 1  Memory Usage: 3708kB
                                            ->  Seq Scan on guest g  (cost=0.00..1279.90 rows=28590 width=88) (actual time=0.015..17.576 rows=28590 loops=1)
                    ->  Index Scan using ix_reservationguest_guestid_checkindate_lastmodifieddate on reservationguest rg  (cost=0.42..2.53 rows=2 width=129) (actual time=0.015..0.023 rows=3 loops=6762)
                          Index Cond: (((guestid)::text = (g.guestid)::text) AND (checkindate >= '2019-05-01'::date) AND (checkindate <= '2019-09-08'::date))
Planning time: 3.343 ms
Execution time: 1173.074 ms 

I think the query is already fairly well optimized. However, once I remove the order by clause it takes only about 550 ms, roughly half of the total time.

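Since I know very little about PostgreSQL internals, I am not sure whether the index ix_reservationguest_guestid_checkindate_lastmodifieddate is being used for sorting, because the Sort node in the execution plan does not mention the index name.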
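A note on reading the plan above: PostgreSQL emits a separate Sort node only when it sorts rows explicitly, so the presence of that node suggests the index is being used for the per-guest lookup (the Index Scan under the Nested Loop) and not for the final ordering. The line `Sort Method: external merge  Disk: 55936kB` additionally shows that the sort spilled to disk. A minimal sketch for checking whether more sort memory helps, assuming a session-level value of 256MB is acceptable on this server (the figure is illustrative, not a recommendation):

```sql
SHOW work_mem;            -- current per-operation memory limit for sorts and hashes

-- Assumption: 256MB is an illustrative value; choose one that fits the
-- server's RAM and the number of concurrent sessions.
SET work_mem = '256MB';   -- affects only the current session

-- Re-run EXPLAIN (ANALYZE, BUFFERS) on the query at this point and
-- compare the Sort node with the plan shown above.

RESET work_mem;           -- return to the configured default
```

If the Sort node then reports `quicksort` with a `Memory:` figure instead of `external merge` with a `Disk:` figure, the sort ran in memory.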

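Questions

  • If the index is indeed used for sorting, is this the shortest execution time I can get from Postgres?

  • If the index is not used, is there any way to improve the performance of the sort?

  • Also, what index should I create to avoid the Seq Scan on the orderdetail table?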
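For the last question, a hedged sketch of the kind of index that would match the join condition `rg.reservationguestid = tr.reservationguestid`. The index name is hypothetical, and nothing in the plan shows whether such an index already exists. The planner may still prefer the Seq Scan, since the plan reads all 66887 rows of orderdetail in about 36 ms and a hash join over most of a table is often cheaper than many index probes.

```sql
-- Hypothetical index name; the column is taken from the join condition
-- in the query above.
CREATE INDEX ix_orderdetail_reservationguestid
    ON orderdetail (reservationguestid);

-- Refresh planner statistics so the new index is costed with current data.
ANALYZE orderdetail;
```

Re-running EXPLAIN ANALYZE afterwards shows whether the planner actually picks the index.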

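【Comments】:

  • What is the purpose of 0 = 0 and the similar checks?
  • @404 They handle optional parameters. The actual condition is: v_reservationGuestId is null or reservationGuestId in (v_reservationGuestId). If no value is passed for the parameter, all records are fetched; otherwise the filter is applied.
  • rg.reservationGuestId in (null) will never be true; you need rg.reservationGuestId is null. And get rid of the useless or conditions. They tend to confuse the optimizer and make things slow. '2019-09-08' = 'null' (or '' = 'null') can be replaced with false. Why do you have all these useless extra conditions?
  • Casting to char() is a code smell. Why use a fixed-length data type there? What are the data types of those columns? If you want to convert a number to a string value, casting to text is usually better. See "Don't use the char type".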
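The point about `in (null)` follows from SQL's three-valued logic: comparing anything to NULL yields NULL (unknown), and a WHERE clause keeps only rows whose condition is true. A small self-contained illustration, using a literal in place of the column:

```sql
SELECT 1 IN (NULL)           AS in_null,       -- NULL (unknown), never true
       (1 IN (NULL)) IS TRUE AS passes_where,  -- false: WHERE would discard the row
       NULL::int IS NULL     AS is_null;       -- true
```

In the question's query this branch is harmless only because it is OR-ed with 0 = 0, which is always true.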

Tags: postgresql performance indexing postgresql-9.6


【Answer 1】:

I see both 'null' and null. Are you using them correctly?

Does in (null) work as expected?

Consider cleaning up the query in cases like this:

and   (('2019-05-01' = 'null')
   or  (rg.checkindate >= '2019-05-01')
      )
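As an illustration of that cleanup, here is a sketch of what the query reduces to for the literal values shown in the question, once the constant comparisons (0 = 0, 1 = 0, '2019-05-01' = 'null', '' = 'null', '' = '') are folded away. The select list is shortened for brevity, which is an editorial assumption; the joins and remaining predicates are taken from the original query.

```sql
SELECT rg.reservationGuestId, rg.reservationId, rg.checkindate,
       rg.checkoutdate, tr.orderDate
FROM reservationguest rg
INNER JOIN guest g       ON rg.guestid = g.guestid
INNER JOIN address a     ON g.guestid = a.guestid
INNER JOIN phone p       ON g.guestid = p.guestid
LEFT JOIN orderdetail tr ON rg.reservationguestid = tr.reservationguestid
WHERE rg.checkindate >= '2019-05-01'      -- was: ('2019-05-01' = 'null') or (...)
  AND rg.checkindate <= '2019-09-08'      -- was: ('2019-09-08' = 'null') or (...)
  AND a.addressTypeCode = 'SHIPPING'      -- was: (1 = 0) or (... in ('SHIPPING'))
  AND p.phoneTypeCode = 'HOME'            -- was: (1 = 0) or (... in ('HOME'))
  AND (tr.orderDate IS NULL
       OR tr.orderDate = '2019-06-02 00:00:00')
ORDER BY rg.checkindate, rg.lastmodifieddate;
```

The remaining predicates match the Index Cond, Recheck Cond, and Filter lines in the posted plan, so the planner already folds these constants; the gain from the cleanup is mainly readability and fewer chances for a 'null' versus null mix-up.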

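cast(p.phoneId as char(36)) -- this is an example of over-normalization. The phoneId is bigger than the value it identifies!

What will the client do with "161500 records"? I hope you are not showing all of that to someone in a UI! Consider summarizing the data instead of dumping it.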

【Comments】:
