Apportioning data from one geography to another: answers

【Question title】: Apportioning data from one geography to another
【Posted】: 2019-05-06 21:15:26
【Question description】:

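We have two geographies: census tracts and a regular grid of squares. The grid dataset only contains population counts, while for the census tracts we know the total income of each tract. What we would like to do is apportion this income data from the census tracts to the grid squares.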

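This is a very common problem in geographic analysis, and there are probably many ways to solve it. We want to do this taking into account not only the spatial overlap between census tracts and grid squares, but also the population of each square. This is mainly to avoid problems where a large census tract contains people living only in a small part of it.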

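Below we present a reproducible example (using R and the sf package) with the solution we have found so far, applied to a sample extracted from our geographies. We would like to see whether others have alternative (more efficient) solutions, and to check whether our results are correct.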

library(sf)
library(dplyr)
library(readr)

# Download and load the sample data (provides the `tract` and `square` objects used below)
download.file("https://github.com/ipeaGIT/acesso_oport/raw/master/test/shapes.RData", "shapes.RData")
load("shapes.RData")

# Calculate the area of each census tract
tract <- tract %>%
  mutate(area_tract = st_area(.))

# Calculate the area of each grid square
square <- square %>%
  mutate(area_square = st_area(.))


ui <-
  # Create spatial units for all intersections between the tracts and the squares (we're calling these "pieces")
  st_intersection(square, tract) %>%
  # Calculate the area of each piece
  mutate(area_piece = st_area(.)) %>%
  # Share of each tract's area that falls in the piece (not used further below)
  mutate(area_prop_tract = area_piece / area_tract) %>%
  # Share of each square's area that falls in the piece
  mutate(area_prop_square = area_piece / area_square) %>%
  # Estimate the population living in the piece, assuming people are spread evenly within each square
  mutate(pop_prop_square = square_pop * area_prop_square) %>%
  # Total estimated population of each tract (sum over its pieces)
  group_by(id_tract) %>%
  mutate(pop_tract = sum(pop_prop_square)) %>%
  ungroup() %>%
  # Share of the tract's estimated population that lives in the piece
  mutate(pop_prop_square_in_tract = pop_prop_square / pop_tract) %>%
  # Income apportioned to the piece
  mutate(income_piece = tract_incm * pop_prop_square_in_tract)

# Final aggregation by squares
ui_fim <- ui %>%
  # Group by square (keeping its population) and sum the income of its pieces
  group_by(id_square, square_pop) %>%
  summarise(square_income = sum(income_piece, na.rm = TRUE))
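
Written out, the pipeline above amounts to the following. This is only a sketch of the arithmetic; the symbols are introduced here for illustration and do not appear in the code. Let $P_s$ and $A_s$ be the population and area of square $s$, $I_t$ the income of tract $t$, and $A_{st}$ the area of the piece where square $s$ and tract $t$ overlap.

```latex
\begin{align}
\mathrm{pop}_{st} &= P_s \, \frac{A_{st}}{A_s} \\
\mathrm{income}_{st} &= I_t \, \frac{\mathrm{pop}_{st}}{\sum_{s'} \mathrm{pop}_{s't}} \\
\mathrm{income}_{s} &= \sum_{t} \mathrm{income}_{st}
\end{align}
```

For a fixed tract $t$, the weights $\mathrm{pop}_{st} / \sum_{s'} \mathrm{pop}_{s't}$ sum to one over its pieces, so each tract's income is preserved as long as the tract has a non-zero estimated population.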

Thanks!

【Comments on the question】:

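One big thing to consider in this problem is how households are distributed within the areas. In some areas households will be spread evenly; in others they will be concentrated in part of the area. The unit for income calculations is usually the household, not the number of people (population). If you can get household counts at the underlying census block group level, that would give you a better basis for apportioning income to your grid. ZIP+4 is another possible proxy, although it would take more work. Good luck!
For area-weighted interpolation, do you get the same answer as with sf::st_interpolate_aw?
@EdzerPebesma Yes! I didn't know about st_interpolate_aw, thanks.
@GeorgeDGirton That is a good point. We will definitely try it.
@GeorgeDGirton. We do have household and population counts at the regular grid level. So, as you said, we are trying to use this information to apportion the income data, taking into account both the area overlap and the population counts.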
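
To make the sf::st_interpolate_aw comparison from the comments concrete, here is a minimal sketch on made-up toy data: one tract covering two unit squares. The objects `tract_toy` and `square_toy`, the helper `make_rect`, and all numbers are hypothetical and are not taken from the shapes.RData sample. It contrasts plain area weighting with a population-weighted split of the kind described in the question.

```r
library(sf)

# Hypothetical helper: axis-aligned rectangle as an sf polygon
make_rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Toy data (assumed values): one tract with income 100, two squares with
# very different populations, each covering half of the tract
tract_toy <- st_sf(
  id_tract = 1,
  tract_incm = 100,
  geometry = st_sfc(make_rect(0, 0, 2, 1))
)
square_toy <- st_sf(
  id_square = c(1, 2),
  square_pop = c(90, 10),
  geometry = st_sfc(make_rect(0, 0, 1, 1), make_rect(1, 0, 2, 1))
)

# Area-weighted interpolation: income is split by overlapping area only
aw <- suppressWarnings(
  st_interpolate_aw(tract_toy["tract_incm"], square_toy, extensive = TRUE)
)

# Population-weighted split: weight each piece by its estimated population
pieces <- suppressWarnings(st_intersection(square_toy, tract_toy))
area_square <- as.numeric(st_area(square_toy))[match(pieces$id_square, square_toy$id_square)]
pieces$pop_piece <- pieces$square_pop * as.numeric(st_area(pieces)) / area_square
pieces$income_piece <- pieces$tract_incm * pieces$pop_piece /
  ave(pieces$pop_piece, pieces$id_tract, FUN = sum)
pw <- tapply(pieces$income_piece, pieces$id_square, sum)

print(aw$tract_incm)  # area-weighted income per square
print(pw)             # population-weighted income per square
```

On this toy layout the two methods should differ only because the squares have unequal populations; where population is roughly proportional to area, the two results would be expected to be close.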

Tags: r geospatial

【Solution 1】:

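Depending on the interpolation method you want to use, I may have a solution for you that I helped develop. The areal package implements areal weighted interpolation, and I use it in my own research to interpolate between US census geographies and grid squares. You can check out the package's website (and the associated vignettes) here. Hope this is useful!

【Comments】:

Thanks for the package! I wasn't aware of any package or function that does areal weighted interpolation in R. However, we would like to do this while taking other variables into account (in our case, the grid population).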
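
As a rough illustration of weighting by a variable other than area, the sketch below works on a plain table of intersection pieces in base R, so the weighting variable can be swapped (population, households, and so on). The function `apportion_by_weight`, the column `square_households`, and all numbers are hypothetical; the sketch assumes the weighting variable is spread evenly within each square.

```r
# Hypothetical table of pieces: one row per square/tract overlap
pieces <- data.frame(
  id_square = c(1, 2, 2, 3),
  id_tract = c("A", "A", "B", "B"),
  tract_incm = c(100, 100, 200, 200),
  area_prop_square = c(1, 0.5, 0.5, 1),  # share of the square's area in the piece
  square_pop = c(30, 20, 20, 0),
  square_households = c(10, 10, 10, 5)
)

# Apportion tract income to squares using any square-level weight column
apportion_by_weight <- function(pieces, weight_col) {
  # Weight of each piece: the square's weight scaled by the area share
  w_piece <- pieces[[weight_col]] * pieces$area_prop_square
  # Total weight per tract, repeated on each of its pieces
  w_tract <- ave(w_piece, pieces$id_tract, FUN = sum)
  # Tracts with zero total weight get no income assigned (their income is lost)
  income_piece <- ifelse(w_tract > 0, pieces$tract_incm * w_piece / w_tract, 0)
  tapply(income_piece, pieces$id_square, sum)
}

by_pop <- apportion_by_weight(pieces, "square_pop")
by_hh <- apportion_by_weight(pieces, "square_households")
print(by_pop)
print(by_hh)
```

Both calls should return totals that add up to the combined tract income (300 here), since every tract in this toy table has a non-zero weight; only the split across squares changes with the chosen weight.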