【Question title】: How to apply a for loop with the ddply function?
【Posted】: 2020-03-28 02:35:49
【Question】:

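I want to count, for every month and for every column, the number of days with rainfall >= 2.5 mm. With help from this post, I was able to compute it for a single column: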

require(seas)
library (zoo)
data(mscdata)
dat.int <- (mksub(mscdata, id=1108447))

dat.int$yearmon <- as.yearmon(dat.int$date, "%b %y")
require(plyr)
rainydays_by_yearmon <- ddply(dat.int, .(yearmon), summarize, rainy_days=sum(rain >= 2.5) )
print.data.frame(rainydays_by_yearmon)

Now I want to apply it to all columns. I tried the code below:

for(i in 1:length(dat.int)){
  y1 <- dat.int[[i]]
  rainydays <- ddply(dat.int, .(yearmon), summarize, rainy_days=sum(y1 >= 2.5))
  if(i==1){
    m1 <- rainydays
  }
  else{
    m1 <- cbind(rainydays, m1)
  }
  print(i)
}
m1

But I cannot get the expected result. Please help!

【Question comments】:

    Tags: r for-loop plyr threshold


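    【Solution 1】:

    I would switch to dplyr and tidyr from the tidyverse instead. pivot_longer puts the data into long format, which is easier to work with; pivot_wider then makes it wide again (depending on your next step, that may not be needed).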

    library(seas)
    library(tidyverse)
    library(zoo)
    data(mscdata)
    dat.int <- (mksub(mscdata, id=1108447))
    
    dat.int %>% 
      as_tibble() %>% # for easier viewing 
      mutate(yearmon = as.yearmon(date)) %>% # refer to the column directly rather than dat.int$date
      select(-date, -year, -yday) %>% # write dplyr::select if another loaded package masks select
      pivot_longer(cols = -yearmon, names_to = "variable", values_to = "value") %>% 
      group_by(yearmon, variable) %>% 
      summarise(rainy_days = sum(value >= 2.5)) %>% # >= matches the threshold stated in the question
      pivot_wider(names_from = "variable", values_from = "rainy_days")
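
For readers who would rather stay with plyr, as in the question, the per-column loop can probably be avoided with plyr's colwise helper, which turns a vector function into one that runs on each selected column of a data frame. This is only a minimal sketch: the toy data frame and the count_wet helper are hypothetical stand-ins for dat.int and the thresholding step, and treating missing values as "not a wet day" (na.rm = TRUE) is an assumption.

```r
library(plyr)

# Hypothetical toy data standing in for dat.int (yearmon kept as plain text here)
toy <- data.frame(
  yearmon = c("Jan 1975", "Jan 1975", "Jan 1975", "Feb 1975", "Feb 1975"),
  rain    = c(0.0, 3.1, 2.5, 5.0, 1.0),
  snow    = c(4.0, 0.0, 2.6, 0.0, 0.0),
  precip  = c(4.0, 3.1, 5.1, 5.0, 1.0)
)

# Hypothetical helper: number of values at or above the threshold
count_wet <- function(x, threshold = 2.5) sum(x >= threshold, na.rm = TRUE)

# colwise() applies count_wet to each named column within every yearmon group
wet_days <- ddply(toy, .(yearmon),
                  colwise(count_wet, c("rain", "snow", "precip")))
print(wet_days)
```

The loop in the question passes y1, a full-length vector taken from outside the group, so every month would see the same whole-column values; colwise instead receives only the rows of the current group.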
    

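    【Comments】:

    • Running the code gives the following error: Error in select(., -date, -year, -yday) : unused arguments (-date, -year, -yday)
    • I was able to resolve this error. When the seas library is loaded, the select function gets masked, so I used dplyr::select and that fixed it.
    • There is no select in seas, but there is one in MASS and in raster. Consider using the conflicted package to handle conflicts.
    【Solution 2】:

    If you don't mind using the data.table library, see the solution below.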
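
As a rough illustration of the last comment, the conflicted package lets you declare once which package should win for an ambiguous name. The sketch below assumes conflicted and dplyr are installed and uses the built-in mtcars data purely as a placeholder.

```r
library(conflicted)
library(dplyr)

# Declare once that a bare `select` should resolve to dplyr's version
conflict_prefer("select", "dplyr")

# From here on, select() means dplyr::select even if MASS or raster is attached later
picked <- select(mtcars, mpg, cyl)
head(picked)
```

Without such a declaration, conflicted raises an error whenever an ambiguous name is used, which makes the masking visible instead of silently picking one package.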

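    library('data.table')
    library('seas')
    data(mscdata)   # load the dataset into the workspace so setDT can convert it in place
    setDT(mscdata)
    # rows are filtered first, so months with no qualifying day are absent from the result
    mscdata[id == 1108447 & rain >= 2.5, .(rain_ge_2.5mm = .N), 
            by = .(year, month = format(date, "%m"))]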
    

    Output

    #    year month rain_ge_2.5mm
    # 1: 1975    01            12
    # 2: 1975    02             8
    # 3: 1975    03            10
    # 4: 1975    04             2
    # 5: 1975    05             4
    # ---                         
    # 350: 2004    07           2
    # 351: 2004    08           5
    # 352: 2004    10          10
    # 353: 2004    11          14
    # 354: 2004    12          14
    

    If you want to process all ids, you can group the data by id as shown below.

    Rain only:

    mscdata[, .(rain_ge_2.5mm = sum(rain >= 2.5)),
            by = .(id, year, month = format(date, "%m"))]
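
The two data.table forms used in this answer are not quite equivalent: filtering on rain >= 2.5 and counting rows with .N drops any month that has no qualifying day, while summing the logical vector keeps that month with a count of 0. A small sketch on made-up data (the toy table is hypothetical) shows the difference:

```r
library(data.table)

# Hypothetical toy data: month "02" has no day at or above 2.5 mm
toy <- data.table(
  month = c("01", "01", "02", "02", "03"),
  rain  = c(3.0, 0.0, 1.0, 2.0, 2.5)
)

# Filter first, then count rows: month "02" disappears from the result
filtered <- toy[rain >= 2.5, .(rain_ge_2.5mm = .N), by = month]

# Sum a logical vector per group: month "02" is kept with a count of 0
summed <- toy[, .(rain_ge_2.5mm = sum(rain >= 2.5)), by = month]

print(filtered)
print(summed)
```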
    

    For rain, snow, and precipitation:

    mscdata[, .(rain_ge_2.5mm = sum(rain >= 2.5), 
                snow_ge_2 = sum(snow >= 2.0), 
                precip_ge_2 = sum(precip >= 2.0)),
            by = .(id, year, month = format(date, "%m"))]
    
    #         id year month rain_ge_2.5mm snow_ge_2 precip_ge_2
    # 1: 1096450 1975    01             1        10           9
    # 2: 1096450 1975    02             0         5           3
    # 3: 1096450 1975    03             1         9           9
    # 4: 1096450 1975    04             1         2           3
    # 5: 1096450 1975    05             5         1           6
    # ---                                                       
    # 862: 2100630 2000    07            NA        NA           3
    # 863: 2100630 2000    08            NA        NA           8
    # 864: 2100630 2000    09            NA        NA           6
    # 865: 2100630 2000    11            NA        NA          NA
    # 866: 2100630 2001    01            NA        NA          NA
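
If the same threshold should apply to every measurement column, typing one sum() per column can likely be replaced by lapply over .SD with .SDcols. This is a sketch on hypothetical toy data, not a drop-in replacement for the code above: it uses a single 2.5 threshold for all columns, and it sets na.rm = TRUE, so groups with missing values give a count instead of the NA shown in the output above.

```r
library(data.table)

# Hypothetical toy data with the same kind of columns as mscdata
toy <- data.table(
  id     = c(1, 1, 1, 2, 2),
  month  = c("01", "01", "02", "01", "01"),
  rain   = c(3.0, 0.0, 2.5, 1.0, 4.0),
  snow   = c(0.0, 5.0, 0.0, 2.5, 0.0),
  precip = c(3.0, 5.0, 2.5, 3.5, 4.0)
)

# Columns to threshold; extend this vector to cover more columns
value_cols <- c("rain", "snow", "precip")

# .SD holds only the columns named in .SDcols for the current group
wet_days <- toy[, lapply(.SD, function(x) sum(x >= 2.5, na.rm = TRUE)),
                by = .(id, month), .SDcols = value_cols]
print(wet_days)
```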
    

    【Comments】:
