【Question title】: Read Shapefile from Google Cloud Storage using Dataflow + Beam + Python
【Posted】: 2018-11-26 10:05:42
【Question】:
How can I read a Shapefile from Google Cloud Storage using Dataflow + Beam + Python?
I have only found beam.io.ReadFromText, but the Python shapefile reader needs file-like objects, as in shp.Reader(shp=shp_file, dbf=dbf_file), or the path to a shapefile.
I am using Python 2.7.
【Comments】:
Tags:
python
google-cloud-storage
google-cloud-dataflow
apache-beam
shapefile
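【Solution 1】:
Here is one way to do it: open each part of the shapefile from GCS with GcsIO, which returns file-like objects that the shapefile reader accepts.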
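# Imports are not shown in the original answer; these are the assumed ones.
import apache_beam as beam
from apache_beam.io.gcp import gcsio  # loads beam.io.gcp.gcsio
import shapefile as shp  # pyshp, aliased as in the question
from osgeo import osr  # GDAL Python bindings

# filenamePRJ, filenameSHP and filenameDBF are the gs:// paths of the
# .prj, .shp and .dbf parts. Each open() returns a file-like object.
prj_file = beam.io.gcp.gcsio.GcsIO().open(
    filenamePRJ,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)
shp_file = beam.io.gcp.gcsio.GcsIO().open(
    filenameSHP,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)
dbf_file = beam.io.gcp.gcsio.GcsIO().open(
    filenameDBF,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)

# Pass the file-like objects to the shapefile reader.
sf = shp.Reader(shp=shp_file, dbf=dbf_file)

# Source spatial reference, read from the WKT stored in the .prj file.
euref = osr.SpatialReference()
euref.ImportFromWkt(str(prj_file.read()))

# Target spatial reference: WGS 84 (EPSG:4326).
wgs84 = osr.SpatialReference()
wgs84.ImportFromEPSG(4326)

# Transformation from the shapefile's reference system to WGS 84.
transformation = osr.CoordinateTransformation(euref, wgs84)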
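
The answer does not show where filenameSHP, filenameDBF and filenamePRJ come from. A minimal sketch of one way to derive them, assuming the three parts sit next to each other in the bucket, share a base name, and use lowercase extensions. The helper name shapefile_component_paths and the bucket and object names are hypothetical, used only for illustration:

```python
import posixpath


def shapefile_component_paths(shp_path):
    """Return the .shp, .dbf and .prj paths that share shp_path's base name.

    Assumes the sibling files use lowercase extensions; adjust if the
    objects in the bucket are named differently.
    """
    base, ext = posixpath.splitext(shp_path)
    if ext.lower() != '.shp':
        raise ValueError('expected a path ending in .shp, got %r' % shp_path)
    return {
        'shp': shp_path,
        'dbf': base + '.dbf',
        'prj': base + '.prj',
    }


# Hypothetical bucket and object names, for illustration only.
paths = shapefile_component_paths('gs://example-bucket/data/regions.shp')
filenameSHP = paths['shp']
filenameDBF = paths['dbf']
filenamePRJ = paths['prj']
```

The resulting names can then be passed to the GcsIO().open(...) calls in the answer. posixpath is used rather than os.path so that the gs:// paths are split on forward slashes regardless of the operating system.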