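[Posted]: 2017-01-31 12:29:40
[Question]:
I want to rasterize a very large vector file to 25 m, and I have had some success using the 'cluster' package, adapting the questions here and here, which worked nicely for that particular data.
However, I now have a larger vector file that needs rasterizing, and I have access to a cluster that uses snowfall. I'm not used to cluster functions and I'm just not sure how to set up sfLapply. I consistently get the following error when sfLapply is called on the cluster: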
Error in checkForRemoteErrors(val) :
one node produced an error: 'quote(96)' is not a function, character or symbol
Calls: sfLapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors
My full code:
library(snowfall)
library(rgeos)
library(maptools)
library(raster)
library(sp)
setwd("/home/dir/")
# Initialise the cluster...
hosts = as.character(read.table(Sys.getenv('PBS_NODEFILE'),header=FALSE)[,1]) # read the nodes to use
sfSetMaxCPUs(length(hosts)) # make sure the maximum allowed number of CPUs matches the number of hosts
sfInit(parallel=TRUE, type="SOCK", socketHosts=hosts, cpus=length(hosts), useRscript=TRUE) # initialise a socket cluster session with the named nodes
sfLibrary(snowfall)
# read in required data
shp <- readShapePoly("my_data.shp")
BNG <- "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs"
crs(shp) <- BNG
### rasterize the uniques to 25m and write (GB and clipped) ###
rw <- raster(res=c(25,25), xmn=0, xmx=600000, ymn=0, ymx=1000000, crs=BNG)
# Number of polygons features in SPDF
features <- 1:nrow(shp[,])
# Split features in n parts
n <- 96
parts <- split(features, cut(features, n))
rasFunction = function(X, shape, raster, nparts){
ras = rasterize(shape[nparts[[X]],], raster, 'CODE')
return(ras)
}
# Export everything in the workspace onto the cluster...
sfExportAll()
# Distribute calculation across the cluster nodes...
rDis = sfLapply(n, fun=rasFunction,X=n, shape=shp, raster=rw, nparts=parts) # equivalent of sapply
rMerge <- do.call(merge, rDis)
writeRaster(rMerge, filename="my_data_25m", format="GTiff", overwrite=TRUE)
# Stop the cluster...
sfStop()
I have tried quite a few things, changing the function and the sfLapply call, but I just can't get it to run. Thanks
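One plausible reading of the error, offered as a hedged note rather than a confirmed diagnosis: as far as I can tell, sfLapply hands its extra arguments on to lapply on each worker, so the named argument X=n binds to lapply's own first argument X, and the value 96 passed in first position then ends up in the FUN slot. The call also iterates over the single number n rather than over 1:n. The sketch below uses only base R to show the same kind of failure and a corrected call shape; pick_part is a hypothetical stand-in for rasFunction, and plain lapply stands in for sfLapply, so nothing here touches raster, sp, or a real cluster.

```r
# Hypothetical stand-in for rasFunction: returns the feature indices of one
# part instead of rasterizing, so the sketch runs without raster or sp.
pick_part <- function(X, nparts) nparts[[X]]

n <- 4                                   # small n for illustration (96 in the post)
features <- 1:20
parts <- split(features, cut(features, n))

# Same argument layout as the failing call: the named X = n is matched to
# lapply's own argument X, so the positional n lands in FUN and lapply
# fails before pick_part is ever called.
bad <- tryCatch(
  lapply(n, pick_part, X = n, nparts = parts),
  error = function(e) conditionMessage(e)
)

# Corrected layout: iterate over the part indices and do not pass X by name.
good <- lapply(seq_len(n), pick_part, nparts = parts)
```

Under the same assumption, the snowfall call would take the shape sfLapply(1:n, rasFunction, shape=shp, raster=rw, nparts=parts); this has not been run against the cluster described in the post.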
[Discussion]:
-
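If you are looking for speed when rasterizing (large) vector data, have a look at gdalUtils::gdal_rasterize. It is usually much faster than raster::rasterize.
-
OK, thanks, I'll take a look at that too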
-
I removed rasFunction and changed rDis to "rDis = sfLapply(1:n, fun=function(x) rasterize(shp[parts[[x]],], rw, 'CODE') )" but now I get: Error in checkForRemoteErrors(val): 96 nodes produced errors; first error: 'data' must be of a vector type, was 'NULL'. Stumped.
-
@joberlin Oh. I've been looking for a way to speed up vector -> raster operations...
-
My first impressions of gdalUtils::gdal_rasterize are good; I will update this week
Tags: r linux cluster-computing raster rasterize