【发布时间】:2022-03-28 06:05:36
【问题描述】:
我正在计算 PostGIS 数据库中两个表/图层的重叠。一组是六边形网格,我想为每个六边形计算与另一组多边形的重叠比例。多面体集也有一些重叠的多边形,所以我需要溶解/合并它们。在我在 FME 中执行此操作之前,但对于一些较大的多边形,它的内存一直不足。这就是我想在数据库中执行此操作的原因(PostGIS 应该非常有能力执行此操作)。
这是我目前所拥有的,它可以工作,而且内存现在没有用完:
EXPLAIN ANALYZE
WITH rh_union AS (
SELECT (ST_Dump(ST_Union(rh.geometry))).geom AS geometry
FROM relevant_habitats rh
WHERE rh.assessment_area_id=1
)
SELECT h.receptor_id,
SUM(ROUND((ST_Area(ST_Intersection(rhu.geometry, h.geometry)) / ST_Area(h.geometry))::numeric,3)) AS frac_overlap
FROM rh_union rhu, hexagons h
WHERE h.zoom_level=1 AND ST_Intersects(rhu.geometry, h.geometry)
GROUP BY h.receptor_id
所以我首先打破多面体并尽可能合并。然后计算六边形与多边形的叠加。然后计算(所有小块的总和)面积。
现在,我的问题是:
- 这是一种有效的方法吗?还是有更好的方法?
(还有一个附带问题:先 ST_Union 然后 ST_Dump 是否正确?)
--用 EXPLAIN ANALYZE 更新 单个区域的输出:
"QUERY PLAN"
"GroupAggregate (cost=1996736.77..15410052.20 rows=390140 width=36) (actual time=571.303..1137.657 rows=685 loops=1)"
" Group Key: h.receptor_id"
" -> Sort (cost=1996736.77..1998063.55 rows=530712 width=188) (actual time=571.090..620.379 rows=806 loops=1)"
" Sort Key: h.receptor_id"
" Sort Method: external merge Disk: 42152kB"
" -> Nested Loop (cost=55.53..1848314.51 rows=530712 width=188) (actual time=382.769..424.643 rows=806 loops=1)"
" -> Result (cost=55.25..1321.51 rows=1000 width=32) (actual time=382.550..383.696 rows=65 loops=1)"
" -> ProjectSet (cost=55.25..61.51 rows=1000 width=32) (actual time=382.544..383.652 rows=65 loops=1)"
" -> Aggregate (cost=55.25..55.26 rows=1 width=32) (actual time=381.323..381.325 rows=1 loops=1)"
" -> Index Scan using relevant_habitats_pkey on relevant_habitats rh (cost=0.28..28.75 rows=12 width=130244) (actual time=0.026..0.048 rows=12 loops=1)"
" Index Cond: (assessment_area_id = 94)"
" -> Index Scan using idx_hexagons_geometry_gist on hexagons h (cost=0.29..1846.45 rows=53 width=156) (actual time=0.315..0.624 rows=12 loops=65)"
" Index Cond: (geometry && (((st_dump((st_union(rh.geometry))))).geom))"
" Filter: (((zoom_level)::integer = 1) AND st_intersects((((st_dump((st_union(rh.geometry))))).geom), geometry))"
" Rows Removed by Filter: 19"
"Planning Time: 0.390 ms"
"Execution Time: 1372.443 ms"
更新 2:现在在第二次选择 (SELECT h.receptor_id~) 上输出 EXPLAIN ANALYZE,并将 CTE 替换为 (temp) 表:
"QUERY PLAN"
"GroupAggregate (cost=2691484.47..20931829.74 rows=390140 width=36) (actual time=29.455..927.945 rows=685 loops=1)"
" Group Key: h.receptor_id"
" -> Sort (cost=2691484.47..2693288.89 rows=721768 width=188) (actual time=28.382..31.514 rows=806 loops=1)"
" Sort Key: h.receptor_id"
" Sort Method: quicksort Memory: 336kB"
" -> Nested Loop (cost=0.29..2488035.20 rows=721768 width=188) (actual time=0.189..27.852 rows=806 loops=1)"
" -> Seq Scan on rh_union rhu (cost=0.00..23.60 rows=1360 width=32) (actual time=0.016..0.050 rows=65 loops=1)"
" -> Index Scan using idx_hexagons_geometry_gist on hexagons h (cost=0.29..1828.89 rows=53 width=156) (actual time=0.258..0.398 rows=12 loops=65)"
" Index Cond: (geometry && rhu.geometry)"
" Filter: (((zoom_level)::integer = 1) AND st_intersects(rhu.geometry, geometry))"
" Rows Removed by Filter: 19"
"Planning Time: 0.481 ms"
"Execution Time: 928.583 ms"
【问题讨论】:
-
是否应用了空间索引?
-
两者(
h和rh)都有关于几何的要点索引。 -
@romepa 可以添加查询计划吗:
explain analyse? -
@JimJones,我添加了解释分析。必须弄清楚实际放置的位置!这有用吗?
-
您能否创建一个表而不是 CTE 子句并再次为该函数计时?
标签: postgresql postgis psql overlap