【Title】: In R: Add rows with annual date intervals to data frame
【Posted】: 2020-10-20 09:33:33
【Question】:

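For each sample point, I want to add rows that split its from-to period into annual intervals. In the added rows, only the contents of the "from" and "to" columns should change; all other information should be carried over from the original row.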

What I have now:

 > sample_points
point       from         to     label
    1 2004-05-01 2007-05-01  cropland
    2 2009-05-01 2012-05-01 grassland
    3 2014-05-01 2016-05-01    forest

What I need:

 > sample_points
point       from         to     label
    1 2004-05-01 2005-05-01  cropland
    1 2005-05-01 2006-05-01  cropland
    1 2006-05-01 2007-05-01  cropland
    2 2009-05-01 2010-05-01 grassland
    2 2010-05-01 2011-05-01 grassland
    2 2011-05-01 2012-05-01 grassland
    3 2014-05-01 2015-05-01    forest
    3 2015-05-01 2016-05-01    forest  

Here is the sample data frame:

point <- c("1", "2", "3")
from <- as.Date(c("2004-05-01", "2009-05-01", "2014-05-01"))
to <- as.Date(c("2007-05-01", "2012-05-01", "2016-05-01"))
label <- c("cropland", "grassland", "forest")

sample_points <- data.frame(point, from, to, label)

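I am new to R and this is my first question here, so please forgive me if the question is not well phrased, is missing something, or if I overlooked a similar question that already solves my problem. Thanks for any hints!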

【Comments】:

    Tags: r dataframe date


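    【Solution 1】:

    Here is a tidyverse option:

    We create an annual sequence running from the from column to the to column, then rebuild the to column as the next from value within each point.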

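    library(tidyverse)
    
    sample_points %>%
      # one list element per row: the annual sequence from `from` to `to`
      mutate(from = map2(from, to, seq, by = 'year')) %>%
      # expand the list column into one row per date
      unnest(from) %>%
      group_by(point) %>%
      # the end of each interval is the next `from` within the same point
      mutate(to = lead(from)) %>%
      # the last date of each point has no successor, so drop that row
      filter(!is.na(to))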
    
    #  point from       to         label    
    #  <chr> <date>     <date>     <chr>    
    #1 1     2004-05-01 2005-05-01 cropland 
    #2 1     2005-05-01 2006-05-01 cropland 
    #3 1     2006-05-01 2007-05-01 cropland 
    #4 2     2009-05-01 2010-05-01 grassland
    #5 2     2010-05-01 2011-05-01 grassland
    #6 2     2011-05-01 2012-05-01 grassland
    #7 3     2014-05-01 2015-05-01 forest   
    #8 3     2015-05-01 2016-05-01 forest   
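
To see why the final `filter()` step is needed, the same idea can be traced on a single row. This is a minimal base R sketch, not part of the answer above: it uses `c(breaks[-1], NA)` as a stand-in for `dplyr::lead()`, and the names `breaks` and `pairs` are chosen here for illustration only.

```r
# First row of the question's data
from <- as.Date("2004-05-01")
to   <- as.Date("2007-05-01")

# Annual sequence: 2004-05-01, 2005-05-01, 2006-05-01, 2007-05-01
breaks <- seq(from, to, by = "year")

# Pair each date with the one after it; the last date has no successor,
# so its partner is NA (this mimics what lead() returns)
pairs <- data.frame(from = breaks, to = c(breaks[-1], NA))

# Drop the incomplete last pair, leaving three one-year intervals
pairs <- pairs[!is.na(pairs$to), ]
pairs
```

Four dates yield three intervals, which is why one row per point has to be dropped after the sequence is unnested.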
    

    【Comments】:

      【Solution 2】:

      You can create the yearly sequences row by row, repeat each value twice to build a matrix, and then remove the rows that are not needed.

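      res <- do.call(rbind, lapply(seq_len(nrow(sample_points)), function(m) {
        cc <- c("from", "to")
        # yearly dates from `from` to `to` for row m, as character
        dc <- as.character(do.call(seq, as.list(c(sample_points[m, cc], by="year"))))
        if (length(dc) == 2) {
          # the row already covers exactly one annual interval: keep it as is
          o <- sample_points[m, ]
        } else {
          # duplicate each date and drop the first copy so consecutive dates pair up
          # row-wise; the odd length causes a recycling warning, which is suppressed
          dm <- suppressWarnings(matrix(rep(dc, each=2)[-1], ncol=2, byrow=TRUE))
          # keep a single-row matrix; otherwise drop the last row (recycled value)
          dm <- if (nrow(dm) == 1) dm else dm[-nrow(dm), ]
          o <- setNames(data.frame(sample_points[m, "point"], dm, 
                                   sample_points[m, "label"]),names(sample_points))
          o[cc] <- lapply(o[cc], as.Date)
        }
        o
      }))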
      

      This gives

      res
      #   point       from         to     label
      # 1     1 2004-05-01 2005-05-01  cropland
      # 2     1 2005-05-01 2006-05-01  cropland
      # 3     1 2006-05-01 2007-05-01  cropland
      # 4     2 2009-05-01 2010-05-01 grassland
      # 5     2 2010-05-01 2011-05-01 grassland
      # 6     2 2011-05-01 2012-05-01 grassland
      # 7     3 2014-05-01 2015-05-01    forest
      # 8     3 2015-05-01 2016-05-01    forest
      

      where

      str(res)
      # 'data.frame': 8 obs. of  4 variables:
      # $ point: chr  "1" "1" "1" "2" ...
      # $ from : Date, format: "2004-05-01" ...
      # $ to   : Date, format: "2005-05-01" ...
      # $ label: chr  "cropland" "cropland" "cropland" "grassland" ...
      

      Data:

      sample_points <- structure(list(point = c("1", "2", "3"), from = structure(c(12539, 
      14365, 16191), class = "Date"), to = structure(c(13634, 15461, 
      16922), class = "Date"), label = c("cropland", "grassland", "forest"
      )), class = "data.frame", row.names = c(NA, -3L))
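
As a possible variant of the row-wise approach, the consecutive pairs can also be built with `head()` and `tail()` instead of the repeat-and-reshape matrix step. The sketch below is not the answerer's code; `expand_row` and `res2` are hypothetical names introduced here, and it assumes every `to` falls on an exact yearly step after `from`, as in the sample data.

```r
# Sample data from the question
sample_points <- data.frame(
  point = c("1", "2", "3"),
  from  = as.Date(c("2004-05-01", "2009-05-01", "2014-05-01")),
  to    = as.Date(c("2007-05-01", "2012-05-01", "2016-05-01")),
  label = c("cropland", "grassland", "forest")
)

# expand_row is a hypothetical helper: it splits row i into annual intervals
expand_row <- function(i) {
  row    <- sample_points[i, ]
  breaks <- seq(row$from, row$to, by = "year")
  n      <- length(breaks) - 1       # number of annual intervals
  out    <- row[rep(1, n), ]         # repeat the row once per interval
  out$from <- head(breaks, -1)       # every date except the last
  out$to   <- tail(breaks, -1)       # every date except the first
  out
}

res2 <- do.call(rbind, lapply(seq_len(nrow(sample_points)), expand_row))
rownames(res2) <- NULL
res2
```

Because the dates never leave the `Date` class here, no conversion to character and back is needed, and no warning has to be suppressed.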
      

      【Comments】:
