从坐标参考系中的多边形边界获取栅格中的像素坐标答案

【问题标题】：Get pixel coordinates in raster from polygon boundary in coordinate reference system从坐标参考系中的多边形边界获取栅格中的像素坐标
【发布时间】：2021-02-19 16:50:27
【问题描述】：

我有一些相对较大的 GeoTiff 文件（10980 x 10980 像素），它们都对应于相同的地理区域（并且具有相同的坐标参考系），并且我有大量的多边形（100,000+）对应土地包裹，我想从每个图像文件中提取与每个多边形对应的像素。目前，我这样做的方式是使用 shapely Polygons 和 rasterio.mask.mask 函数，如下所示：

for filename in image_files:
    with rasterio.open(filename) as src:
        for shape in shapes:
            data, _ = rasterio.mask.mask(src, [shape], crop=True)

这在经验上相当缓慢。如果我预先计算了掩码索引，那么我只需要读取每个图像的整个数据一次，然后使用预先计算的索引来提取每个多边形的相关像素（我不需要它们在正确的 2-维度配置，我只需要值），这非常快。但我不知道是否有快速获取这些像素索引的方法。我知道我可以使用 rasterio 的 raster_geometry_mask 函数来获取整个图像大小的掩码，然后使用 numpy 来获取多边形内元素的索引，但是这样就不必要地构造一个 10980 x 10980 数组每个多边形来制作蒙版，这非常非常慢。

【问题讨论】：

标签： python gis raster rasterio

【解决方案1】：

我最终做的是，当我打开第一张图片，然后为每个多边形，

使用图像变换将多边形转换为像素坐标，并在整数像素坐标中找到包含多边形的矩形边界框。
要确定边界框中的哪些像素实际上在多边形中，请为每个像素构造 shapely Polygons 并使用 .intersects() 方法（如果您只想包含完全在多边形内的像素，您可以使用 @ 987654322@)。（我不确定这是否会很慢，但事实证明并非如此。）
保存每个多边形中所有像素的坐标对列表。

然后，对于您打开的每个新图像，您只需读取整个图像数据并索引每个多边形的部分，因为您已经有了像素索引。

代码大致如下：

import math
import numpy
import pyproj
import rasterio.mask
from shapely.geometry import Polygon

shape_pixels = None
for filename in image_files:
    with rasterio.open(filename) as src:
        if shape_pixels is None:
            projector = pyproj.Proj(src.crs)
            pixelcoord_shapes = [
                Polygon(zip(*(~src.transform * numpy.array(projector(*zip(*shape.boundary.coords))))))
                for shape in shapes
            ]

            pixels_per_shape = []
            for shape in shapes:
                xmin = max(0, math.floor(shape.bounds[0]))
                ymin = max(0, math.floor(shape.bounds[1]))
                xmax = math.ceil(shape.bounds[2])
                ymax = math.ceil(shape.bounds[3])
                pixel_squares = {}
                for j in range(xmin, xmax+1):
                    for i in range(ymin, ymax+1):
                        pixel_squares[(i, j)] = Polygon.from_bounds(j, i, j+1, i+1)
                pixels_per_shape.append([
                    coords for (coords, pixel) in pixel_squares.items()
                    if shape.intersects(pixel)
                ])

        whole_data = src.read()

        for pixels in pixels_per_shape:
            ivals, jvals = zip(*pixels)
            shape_data = whole_data[0, ivals, jvals]
            ...

【讨论】：