[Posted]: 2020-01-02 17:23:29
[Question]:
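I've run into a problem where mutate() in dplyr returns its results in the wrong order. My call to mutate uses data from an existing column as input, but the results come back arranged as if the data had been sorted before the mutate.
My specific case uses the dataRetrieval package to fetch USGS/NWIS data from the web. In this example I look up station names by site ID. In the dataRetrieval package, site IDs are numeric codes stored as character strings.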
library(dataRetrieval)
library(dplyr)
Gauges <- tibble(
  Name = c("Twisp", "Chewuch", "Andrews", "Met@Winthrop", "Met@Twisp", "Met@Pateros", "Met@Goat"),
  ID   = c("12448998", "12448000", "12447390", "12448500", "12449500", "12449950", "12447383")
)
## This works correctly with each of the station numbers
readNWISsite(Gauges$ID[1])$station_nm
# [1] "TWISP RIVER NEAR TWISP, WA"
## This does not work correctly
## Order is not right! Station does not correspond with ID !!
Gauges %>%
  mutate(Station = readNWISsite(ID)$station_nm)
# # A tibble: 7 x 3
#   Name         ID       Station
#   <chr>        <chr>    <chr>
# 1 Twisp        12448998 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
# 2 Chewuch      12448000 ANDREWS CREEK NEAR MAZAMA, WA
# 3 Andrews      12447390 CHEWUCH RIVER AT WINTHROP, WA
# 4 Met@Winthrop 12448500 METHOW RIVER AT WINTHROP, WA
# 5 Met@Twisp    12449500 TWISP RIVER NEAR TWISP, WA
# 6 Met@Pateros  12449950 METHOW RIVER AT TWISP, WA
# 7 Met@Goat     12447383 METHOW RIVER NEAR PATEROS, WA
## This works, returning the correct site associated with the gauge number
Gauges %>%
  arrange(ID) %>%
  mutate(Station = readNWISsite(ID)$station_nm)
# # A tibble: 7 x 3
#   Name         ID       Station
#   <chr>        <chr>    <chr>
# 1 Met@Goat     12447383 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
# 2 Andrews      12447390 ANDREWS CREEK NEAR MAZAMA, WA
# 3 Chewuch      12448000 CHEWUCH RIVER AT WINTHROP, WA
# 4 Met@Winthrop 12448500 METHOW RIVER AT WINTHROP, WA
# 5 Twisp        12448998 TWISP RIVER NEAR TWISP, WA
# 6 Met@Twisp    12449500 METHOW RIVER AT TWISP, WA
# 7 Met@Pateros  12449950 METHOW RIVER NEAR PATEROS, WA
Why does mutate rearrange the data partway through the process? Or, what is going on here?
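One reading of the two outputs above, offered as a guess rather than a confirmed diagnosis: the Station column is identical in both runs, and it is in ascending ID order each time. That would be consistent with the lookup returning its rows sorted by site number regardless of the order of the IDs passed in, with mutate simply assigning that vector by position. The sketch below reproduces the effect offline. `fake_lookup` is a hypothetical stand-in for readNWISsite, and the `site_no` column name is an assumption about what the real function returns. It then shows a key-based join that does not depend on row order.

```r
library(dplyr)

# Hypothetical stand-in for readNWISsite(): returns one row per site,
# sorted by site number instead of the order the IDs were supplied in.
# The site_no column name is an assumption for illustration.
fake_lookup <- function(ids) {
  sites <- tibble(
    site_no    = c("12447383", "12447390", "12448000"),
    station_nm = c("METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA",
                   "ANDREWS CREEK NEAR MAZAMA, WA",
                   "CHEWUCH RIVER AT WINTHROP, WA")
  )
  sites %>%
    filter(site_no %in% ids) %>%
    arrange(site_no)
}

# A three-row subset of the Gauges table, in its original (unsorted) order
gauges <- tibble(
  Name = c("Chewuch", "Andrews", "Met@Goat"),
  ID   = c("12448000", "12447390", "12447383")
)

# Positional assignment: the sorted result is pasted onto unsorted rows
misaligned <- gauges %>%
  mutate(Station = fake_lookup(ID)$station_nm)

# Key-based join: each station name is matched to its own ID
aligned <- gauges %>%
  left_join(fake_lookup(gauges$ID), by = c("ID" = "site_no")) %>%
  rename(Station = station_nm)

print(misaligned)
print(aligned)
```

If the real function behaves like the stand-in, the same join against the data frame it returns (rather than against only its station_nm column) should keep names and IDs paired without needing arrange() first.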
[Comments]: