【Posted】: 2021-10-10 19:33:38
【Question】:
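I have a table with five relevant fields: id, source, iid, track_hash, alias. I want to group all rows by their common track_hash and then, for each row, save into its alias field the id of the row in its group that has the lowest source (ties broken in favor of the highest iid). To do this, I wrote the following query: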
with best as
(SELECT id as bid, track_hash FROM
(SELECT id, track_hash,
RANK () OVER (
PARTITION BY track_hash
ORDER BY source asc, iid DESC
) rank
from albums
)
where rank = 1
)
select bid, a.* from albums a inner join best
on a.track_hash = best.track_hash
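To make the tie-breaking rule concrete, here is a minimal sketch of the ranking step on a made-up five-row table. The column types, the sample rows, and the alias `rnk` (used in place of `rank`) are assumptions for illustration and not part of my real schema; it needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical miniature of the albums table; types and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE albums (
    id INTEGER PRIMARY KEY,
    source INTEGER,
    iid INTEGER,
    track_hash TEXT,
    alias INTEGER
);
INSERT INTO albums (id, source, iid, track_hash) VALUES
    (1, 2, 10, 'aaa'),
    (2, 1,  5, 'aaa'),
    (3, 1,  7, 'aaa'),
    (4, 3,  1, 'bbb'),
    (5, 3,  2, 'bbb');
""")

# Lowest source wins inside each track_hash group; highest iid breaks ties.
best = conn.execute("""
SELECT track_hash, id AS bid FROM (
    SELECT id, track_hash,
           RANK() OVER (
               PARTITION BY track_hash
               ORDER BY source ASC, iid DESC
           ) AS rnk
    FROM albums
)
WHERE rnk = 1
ORDER BY track_hash
""").fetchall()

print(best)  # [('aaa', 3), ('bbb', 5)]
```

In group 'aaa' the rows with ids 2 and 3 share the lowest source, and id 3 wins on the higher iid. If two rows could tie on both source and iid, RANK() would return both of them, so ROW_NUMBER() may be the safer choice in that case.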
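That SELECT takes a perfectly reasonable 2 seconds on 24k rows. Now, rather than simply viewing this id, I want to actually save it. For that I used the following, very similar query: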
with best as
(SELECT id as bid, track_hash FROM
(SELECT id, track_hash,
RANK () OVER (
PARTITION BY track_hash
ORDER BY source asc, iid DESC
) rank
from albums
)
where rank = 1
)
update albums
set alias = bid FROM albums a inner join best
on a.track_hash = best.track_hash
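However, this takes anywhere from 1 to 10 minutes, and I really don't understand why. Doesn't the engine have to match every row to its best.id/alias anyway, which is exactly what I am doing with the update? Why is this happening, and what am I doing wrong?
The query plan looks like this: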
MATERIALIZE 1
CO-ROUTINE 4
SCAN TABLE albums USING INDEX track_hash_idx
USE TEMP B-TREE FOR RIGHT PART OF ORDER BY
SCAN SUBQUERY 4
SCAN TABLE albums USING COVERING INDEX track_hash_idx
SEARCH SUBQUERY 1 USING AUTOMATIC PARTIAL COVERING INDEX (rank=?)
SEARCH TABLE albums AS a USING COVERING INDEX track_hash_idx (track_hash=?)
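One possible reading of this plan: in SQLite's UPDATE ... FROM, the table being updated already takes part in the join, so `FROM albums a` brings in a second copy of albums that nothing ties back to the row being updated. The plan scans albums once as the target and then searches `best` and `albums AS a` for each of those rows. The sketch below joins `best` directly to the target table instead. It assumes SQLite 3.33 or later (for UPDATE ... FROM) and uses a made-up five-row table; the column types, sample rows, and the alias `rnk` are assumptions for illustration.

```python
import sqlite3

# Hypothetical miniature of the albums table; types and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE albums (
    id INTEGER PRIMARY KEY,
    source INTEGER,
    iid INTEGER,
    track_hash TEXT,
    alias INTEGER
);
CREATE INDEX track_hash_idx ON albums(track_hash);
INSERT INTO albums (id, source, iid, track_hash) VALUES
    (1, 2, 10, 'aaa'),
    (2, 1,  5, 'aaa'),
    (3, 1,  7, 'aaa'),
    (4, 3,  1, 'bbb'),
    (5, 3,  2, 'bbb');
""")

# The target table is not repeated in FROM; the WHERE clause correlates
# each updated row with the single best row of its track_hash group.
conn.execute("""
WITH best AS (
    SELECT id AS bid, track_hash FROM (
        SELECT id, track_hash,
               RANK() OVER (
                   PARTITION BY track_hash
                   ORDER BY source ASC, iid DESC
               ) AS rnk
        FROM albums
    )
    WHERE rnk = 1
)
UPDATE albums
SET alias = best.bid
FROM best
WHERE albums.track_hash = best.track_hash
""")
conn.commit()

rows = conn.execute("SELECT id, alias FROM albums ORDER BY id").fetchall()
print(rows)  # [(1, 3), (2, 3), (3, 3), (4, 5), (5, 5)]
```

Whether this accounts for the whole slowdown on the real 24k-row table is untested here; running EXPLAIN QUERY PLAN on the rewritten statement would show whether the extra search of `albums AS a` is gone.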
【Discussion】:
Tags: sql sqlite join sql-update common-table-expression