【Question title】: Pivot wider two columns with different values
【Posted】: 2022-01-05 03:13:13
【Question description】:

I have two data frames.

The first: AEZ_2

    head(AEZ_2)
     gid                           AEZ AEZ_area_sum
1 142976   Tropics, lowland; semi-arid     18.20585
2 142977   Tropics, lowland; semi-arid    924.40126
3 142978   Tropics, lowland; semi-arid    509.15215
4 142978   Tropics, lowland; sub-humid     11.47290
5 143696   Tropics, lowland; semi-arid    858.59022
6 143697 Dominantly hydromorphic soils    589.91021

dput(AEZ_2[1:6, c(1: 3)])
structure(list(gid = c(142976, 142977, 142978, 142978, 143696, 
143697), AEZ = c("Tropics, lowland; semi-arid", "Tropics, lowland; semi-arid", 
"Tropics, lowland; semi-arid", "Tropics, lowland; sub-humid", 
"Tropics, lowland; semi-arid", "Dominantly hydromorphic soils"
), AEZ_area_sum = c(18.2058489094, 924.401258895, 509.152149209, 
11.4728955973, 858.590216109, 589.9102080814)), row.names = c(NA, 
6L), class = "data.frame")

The second: farming systems (FS_2)

      gid                           Farming_system  FS_area_sum
1   142976 5. Cereal-root crop mixed farming system 1.820585e+01
2   142977 5. Cereal-root crop mixed farming system 9.244013e+02
3   142978 5. Cereal-root crop mixed farming system 5.206250e+02
4   143696           2. Agropastoral farming system 6.979757e+02
5   143696 5. Cereal-root crop mixed farming system 1.606145e+02
6   143697           2. Agropastoral farming system 2.107575e+03
dput(FS_2[1:5, c(1:3)])
structure(list(gid = c(142976, 142977, 142978, 143696, 143696
), Farming_system = structure(c(9L, 9L, 9L, 6L, 9L), .Label = c("1. Maize mixed farming system", 
"10. Forest-based farming system", "11. Large-scale irrigated farming system", 
"12. Perennial mixed farming system", "13. Arid pastoral oasis farming system", 
"2. Agropastoral farming system", "3. Highland perennial farming system", 
"4. Root and tuber crop farming system", "5. Cereal-root crop mixed farming system", 
"6. Highland mixed farming system", "7. Humid lowland tree crop farming system", 
"8. Pastoral farming system", "9. Fish-based farming system"), class = "factor"), 
    FS_area_sum = c(18.205849004, 924.40125911, 520.625044495, 
    697.975740616, 160.614476324)), row.names = c(NA, 5L), class = "data.frame")

I merged the two data frames into Final.27:

   head(Final.27)
    gid                     Farming_system FS_area_sum                                       AEZ AEZ_area_sum
1 62356                               <NA>          NA Land with severe soil/terrain limitations     334.9770
2 79599       9. Fish-based farming system 198.0554185   Sub-tropics, moderately cool; sub-humid      74.4029
3 79599       9. Fish-based farming system 198.0554185 Land with severe soil/terrain limitations     123.7758
4 79599 12. Perennial mixed farming system   0.1306899   Sub-tropics, moderately cool; sub-humid      74.4029
5 79599 12. Perennial mixed farming system   0.1306899 Land with severe soil/terrain limitations     123.7758
6 79600       9. Fish-based farming system 603.2818466 Land with severe soil/terrain limitations     141.4297
dput(Final.27[1:5, c(1:5)])
structure(list(gid = c(62356, 79599, 79599, 79599, 79599), Farming_system = structure(c(NA, 
13L, 13L, 4L, 4L), .Label = c("1. Maize mixed farming system", 
"10. Forest-based farming system", "11. Large-scale irrigated farming system", 
"12. Perennial mixed farming system", "13. Arid pastoral oasis farming system", 
"2. Agropastoral farming system", "3. Highland perennial farming system", 
"4. Root and tuber crop farming system", "5. Cereal-root crop mixed farming system", 
"6. Highland mixed farming system", "7. Humid lowland tree crop farming system", 
"8. Pastoral farming system", "9. Fish-based farming system"), class = "factor"), 
    FS_area_sum = c(NA, 198.0554184899, 198.0554184899, 0.130689938047, 
    0.130689938047), AEZ = c("Land with severe soil/terrain limitations", 
    "Sub-tropics, moderately cool; sub-humid", "Land with severe soil/terrain limitations", 
    "Sub-tropics, moderately cool; sub-humid", "Land with severe soil/terrain limitations"
    ), AEZ_area_sum = c(334.9769749362, 74.4028953581, 123.77575431, 
    74.4028953581, 123.77575431)), row.names = c(NA, 5L), class = "data.frame")

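They share the column "gid" (the identifier of a 0.5x0.5 cell). One gid can have several observations (= polygons), which is why some gids are repeated. In addition, a gid is sometimes covered by an AEZ but not by a farming system. Having an NA observation in that case (column 3) is fine with me.

I want a final table in which the AEZ column is pivoted and filled with AEZ_area_sum, and the Farming_system column is pivoted and filled with FS_area_sum. The goal is one row per gid holding all of its observations, rather than the gid being repeated once per observation.

I tried this: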

First.flanked <- Final.27 %>% 
  pivot_wider(names_from = AEZ, values_from = AEZ_area_sum)

Then once more for the farming system:

Final.flanked <- First.flanked %>% 
  pivot_wider(names_from = Farming_system, values_from = FS_area_sum)

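But I get:

Error: Failed to create output due to bad names.
  • Choose another strategy with names_repair
Run rlang::last_error() to see where the error occurred.
In addition: Warning message:
Values are not uniquely identified; output will contain list-cols.
  • Use values_fn = list to suppress this warning.
  • Use values_fn = length to identify where the duplicates arise
  • Use values_fn = {summary_fun} to summarise duplicates

I understand that it fails because the gids are duplicated. But how can I get the result I want, that is, one gid with its multiple observations in a single row?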
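The warning's hints can be tried on a small frame first. The sketch below uses a hypothetical toy data frame (`toy` is not the asker's data) in which one gid repeats the same AEZ twice. It lists the duplicated key combinations with `dplyr::count()`, then collapses them with `values_fn = sum`. Summing the areas is only an assumption about what a sensible summary would be, and passing a bare function to `values_fn` assumes tidyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy frame: gid 1 carries the same AEZ twice,
# so (gid, AEZ) does not uniquely identify a value.
toy <- data.frame(
  gid = c(1, 1, 2),
  AEZ = c("a", "a", "b"),
  AEZ_area_sum = c(10, 20, 30)
)

# List the key combinations that occur more than once.
dupes <- toy %>%
  count(gid, AEZ, name = "n_rows") %>%
  filter(n_rows > 1)

# Collapse duplicates while pivoting (assumption: areas may be summed).
toy_wide <- toy %>%
  pivot_wider(
    names_from = AEZ,
    values_from = AEZ_area_sum,
    values_fn = sum
  )
```

If `dupes` comes back empty on the real tables, the duplicates probably arise from the merge rather than from the source data.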

【Question discussion】:

  • Please edit your question so that the dput output includes all relevant columns, e.g. dput(Final.27[1:7, 1:5])
  • Sorry. I have just edited it!

Tags: r dataframe pivot tidyverse


【Solution 1】:
Final.flanked <- Final.27 %>% 
  pivot_wider(id_cols = gid, names_from = c(Farming_system, AEZ), values_from = c(AEZ_area_sum, FS_area_sum))

View(Final.flanked)

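I'm not sure whether you want all of these columns, but it is easy to drop the unwanted ones from here.

【Discussion】:

  • Thanks, Brian. I modified the script, because the farming system columns should be filled with FS_area_sum.
  • As stated in the post, I would like the AEZ and farming system columns kept separate.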
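As a rough illustration of dropping columns afterwards, the sketch below rebuilds the five sample rows of Final.27 from the question's dput output, applies the same pivot_wider call, and then removes every column whose name starts with "FS_area_sum". Two things are assumptions here: Farming_system is stored as character rather than factor to keep the example short, and dropping the FS_area_sum columns is only one possible choice of what counts as "unwanted".

```r
library(dplyr)
library(tidyr)

# First five rows of Final.27, taken from the dput in the question
# (Farming_system stored as character here, a simplification).
Final.27 <- data.frame(
  gid = c(62356, 79599, 79599, 79599, 79599),
  Farming_system = c(NA, "9. Fish-based farming system",
                     "9. Fish-based farming system",
                     "12. Perennial mixed farming system",
                     "12. Perennial mixed farming system"),
  FS_area_sum = c(NA, 198.0554184899, 198.0554184899,
                  0.130689938047, 0.130689938047),
  AEZ = c("Land with severe soil/terrain limitations",
          "Sub-tropics, moderately cool; sub-humid",
          "Land with severe soil/terrain limitations",
          "Sub-tropics, moderately cool; sub-humid",
          "Land with severe soil/terrain limitations"),
  AEZ_area_sum = c(334.9769749362, 74.4028953581, 123.77575431,
                   74.4028953581, 123.77575431)
)

Final.flanked <- Final.27 %>%
  pivot_wider(id_cols = gid,
              names_from = c(Farming_system, AEZ),
              values_from = c(AEZ_area_sum, FS_area_sum))

# Keep gid plus the AEZ_area_sum_* columns; drop the FS_area_sum_* ones.
Final.trimmed <- Final.flanked %>%
  select(-starts_with("FS_area_sum"))
```

The column names remain combinations of farming system and AEZ, so this trims the table but does not separate the two classifications.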
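One possible way to keep the AEZ and farming system columns separate, which is not part of the answer above, is to pivot each source table on its own and join the two wide tables by gid afterwards. The sketch below uses the sample rows from the question's dput output. It assumes that each (gid, AEZ) pair and each (gid, Farming_system) pair occurs only once, as in the sample, and that Farming_system can be stored as character. The "AEZ: " and "FS: " prefixes and the names `AEZ_wide`, `FS_wide` and `Final_wide` are illustrative choices.

```r
library(dplyr)
library(tidyr)

# Sample rows of the two source tables, from the question's dput output.
AEZ_2 <- data.frame(
  gid = c(142976, 142977, 142978, 142978, 143696, 143697),
  AEZ = c("Tropics, lowland; semi-arid", "Tropics, lowland; semi-arid",
          "Tropics, lowland; semi-arid", "Tropics, lowland; sub-humid",
          "Tropics, lowland; semi-arid", "Dominantly hydromorphic soils"),
  AEZ_area_sum = c(18.2058489094, 924.401258895, 509.152149209,
                   11.4728955973, 858.590216109, 589.9102080814)
)
FS_2 <- data.frame(
  gid = c(142976, 142977, 142978, 143696, 143696),
  Farming_system = c("5. Cereal-root crop mixed farming system",
                     "5. Cereal-root crop mixed farming system",
                     "5. Cereal-root crop mixed farming system",
                     "2. Agropastoral farming system",
                     "5. Cereal-root crop mixed farming system"),
  FS_area_sum = c(18.205849004, 924.40125911, 520.625044495,
                  697.975740616, 160.614476324)
)

# Pivot each table separately: one row per gid, one column per category.
AEZ_wide <- AEZ_2 %>%
  pivot_wider(names_from = AEZ, values_from = AEZ_area_sum,
              names_prefix = "AEZ: ")
FS_wide <- FS_2 %>%
  pivot_wider(names_from = Farming_system, values_from = FS_area_sum,
              names_prefix = "FS: ")

# A full join keeps gids that appear in only one of the two tables.
Final_wide <- full_join(AEZ_wide, FS_wide, by = "gid")
```

Pivoting before the merge avoids the row multiplication that the merged table shows, where every AEZ row of a gid is paired with every farming system row of the same gid. A gid with an AEZ but no farming system, such as 143697 in the sample, simply gets NA in the FS columns.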