【Question title】: Add Date-Stamped Observations by Group in R
【Posted】: 2018-05-25 21:16:40
【Question】:

I have a data frame that looks like this:

> head(dsidata3)
# A tibble: 6 x 28
  Date      `Day of week` Holiday Name   `Time entered` Work  Travel Exercise Sleep
  <chr>     <chr>         <chr>   <chr>  <time>         <chr> <chr>  <chr>    <dbl>
1 28/3/2018 Wednesday     NA      Dave   21:10          6.0   0.4    -         7.00
2 28/3/2018 Wednesday     NA      Mercu… 22:00          8.0   1.5    -         6.00
3 28/3/2018 Wednesday     NA      Mars   23:56          11.0  1.0    -         4.00
4 28/3/2018 Wednesday     NA      Venus  22:35          8.5   4.0    -         7.50
5 29/3/2018 Thursday      NA      Dave   22:00          -     -      -         6.00
6 29/3/2018 Thursday      NA      Mercu…    NA          8.5   0.8    1.0      10.0

For each date there are four observations (one per $Name: 'Dave', 'Mars', etc.).

I also have a separate data frame that looks like this:

> head(windspeeds)
# A tibble: 6 x 2
  Date       `km/h`
  <chr>       <int>
1 28/03/2018      2
2 29/03/2018      1
3 30/03/2018      0
4 31/03/2018      2
5 1/04/2018       1
6 2/04/2018       7

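I want to add my wind speed data to my first data frame, but that data frame has four observations per date, while the wind speed data frame has only one observation per date.

I'm sure this has something to do with nesting and apply, but I can't figure it out. Any help would be much appreciated!

As requested, here are all the observations for these variables: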

> dput(dsidata3$Date)
c("28/3/2018", "28/3/2018", "28/3/2018", "28/3/2018", "29/3/2018", 
"29/3/2018", "29/3/2018", "29/3/2018", "30/3/2018", "30/3/2018", 
"30/3/2018", "30/3/2018", "31/3/2018", "31/3/2018", "31/3/2018", 
"31/3/2018", "1/4/2018", "1/4/2018", "1/4/2018", "1/4/2018", 
"2/4/2018", "2/4/2018", "2/4/2018", "2/4/2018", "3/4/2018", "3/4/2018", 
"3/4/2018", "3/4/2018", "4/4/2018", "4/4/2018", "4/4/2018", "4/4/2018", 
"5/4/2018", "5/4/2018", "5/4/2018", "5/4/2018", "6/4/2018", "6/4/2018", 
"6/4/2018", "6/4/2018", "7/4/2018", "7/4/2018", "7/4/2018", "7/4/2018", 
"8/4/2018", "8/4/2018", "8/4/2018", "8/4/2018", "9/4/2018", "9/4/2018", 
"9/4/2018", "9/4/2018", "10/4/2018", "10/4/2018", "10/4/2018", 
"10/4/2018", "11/4/2018", "11/4/2018", "11/4/2018", "11/4/2018", 
"12/4/2018", "12/4/2018", "12/4/2018", "12/4/2018", "13/4/2018", 
"13/4/2018", "13/4/2018", "13/4/2018", "14/4/2018", "14/4/2018", 
"14/4/2018", "14/4/2018", "15/4/2018", "15/4/2018", "15/4/2018", 
"15/4/2018", "16/4/2018", "16/4/2018", "16/4/2018", "16/4/2018", 
"17/4/2018", "17/4/2018", "17/4/2018", "17/4/2018", "18/4/2018", 
"18/4/2018", "18/4/2018", "18/4/2018", "19/4/2018", "19/4/2018", 
"19/4/2018", "19/4/2018", "20/4/2018", "20/4/2018", "20/4/2018", 
"20/4/2018", "21/4/2018", "21/4/2018", "21/4/2018", "21/4/2018", 
"22/4/2018", "22/4/2018", "22/4/2018", "22/4/2018", "23/4/2018", 
"23/4/2018", "23/4/2018", "23/4/2018", "24/4/2018", "24/4/2018", 
"24/4/2018", "24/4/2018", "25/4/2018", "25/4/2018", "25/4/2018", 
"25/4/2018", "26/4/2018", "26/4/2018", "26/4/2018", "26/4/2018", 
"27/4/2018", "27/4/2018", "27/4/2018", "27/4/2018", "28/4/2018", 
"28/4/2018", "28/4/2018", "28/4/2018", "29/4/2018", "29/4/2018", 
"29/4/2018", "29/4/2018", "30/4/2018", "30/4/2018", "30/4/2018", 
"30/4/2018", "1/5/2018", "1/5/2018", "1/5/2018", "1/5/2018", 
"2/5/2018", "2/5/2018", "2/5/2018", "2/5/2018", "3/5/2018", "3/5/2018", 
"3/5/2018", "3/5/2018", "4/5/2018", "4/5/2018", "4/5/2018", "4/5/2018", 
"5/5/2018", "5/5/2018", "5/5/2018", "5/5/2018", "6/5/2018", "6/5/2018", 
"6/5/2018", "6/5/2018", "7/5/2018", "7/5/2018", "7/5/2018", "7/5/2018", 
"8/5/2018", "8/5/2018", "8/5/2018", "8/5/2018")

And the wind speeds:

> dput(windspeeds)
structure(list(Date = c("28/03/2018", "29/03/2018", "30/03/2018", 
"31/03/2018", "1/04/2018", "2/04/2018", "3/04/2018", "4/04/2018", 
"5/04/2018", "6/04/2018", "7/04/2018", "8/04/2018", "9/04/2018", 
"10/04/2018", "11/04/2018", "12/04/2018", "13/04/2018", "14/04/2018", 
"15/04/2018", "16/04/2018", "17/04/2018", "18/04/2018", "19/04/2018", 
"20/04/2018", "21/04/2018", "22/04/2018", "23/04/2018", "24/04/2018", 
"25/04/2018", "26/04/2018", "27/04/2018", "28/04/2018", "29/04/2018", 
"30/04/2018", "1/05/2018", "2/05/2018", "3/05/2018", "4/05/2018", 
"5/05/2018", "6/05/2018", "7/05/2018", "8/05/2018"), `km/h` = c(2L, 
1L, 0L, 2L, 1L, 7L, 7L, 6L, 1L, 7L, 5L, 5L, 1L, 5L, 0L, 0L, 1L, 
3L, 6L, 1L, 6L, 6L, 6L, 3L, 3L, 1L, 1L, 1L, 7L, 7L, 5L, 7L, 3L, 
4L, 2L, 7L, 1L, 5L, 0L, 0L, 0L, 7L)), .Names = c("Date", "km/h"
), row.names = c(NA, -42L), class = c("tbl_df", "tbl", "data.frame"))

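【Comments】:

  • What is your expected output? Do you want every row for a given date to have the same wind speed?
  • Yes, I want each date's wind speed to appear on every row for that date.
  • Could you run dput() on your input data frames and post the output? It would help us recreate your scenario.
  • @Aramis7d Done!

Tags: r dplyr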


【Solution 1】:

Consider the following input:

x1 <- 'A B
1 x
1 y
1 z
2 r
2 t
2 5'

x2 <- 'A D
1 x1
2 r1'

df1 <- read.table(text = x1, sep =" ", header = TRUE, stringsAsFactors = FALSE)
df2 <- read.table(text = x2, sep =" ", header = TRUE, stringsAsFactors = FALSE)

You can try a tidyverse function such as left_join:

library(dplyr)
df1 %>%
  left_join(df2, by = "A")  # join on the shared key column A

which gives:

  A B  D
1 1 x x1
2 1 y x1
3 1 z x1
4 2 r r1
5 2 t r1
6 2 5 r1

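【Comments】:

  • Thanks again for your help, but this returns all NA values.
  • The problem is most likely caused by the different formats in the Date columns: the first shows dd/m/yyyy, while the second shows dd/mm/yyyy.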
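
The format mismatch noted in the last comment can be worked around by parsing both Date columns into real dates before joining. The following is a minimal sketch rather than the answerer's own code: the two small tibbles are illustrative stand-ins for dsidata3 and windspeeds (only a few rows copied from the question), and the result name joined is hypothetical.

```r
library(dplyr)

# Illustrative stand-ins for the question's data (a few rows only)
dsidata3 <- tibble(
  Date = c("28/3/2018", "28/3/2018", "29/3/2018", "1/4/2018"),
  Name = c("Dave", "Mars", "Dave", "Dave")
)
windspeeds <- tibble(
  Date   = c("28/03/2018", "29/03/2018", "1/04/2018"),
  `km/h` = c(2L, 1L, 1L)
)

# Parse both columns to Date class so "28/3/2018" and "28/03/2018" compare equal
dsidata3   <- mutate(dsidata3,   Date = as.Date(Date, format = "%d/%m/%Y"))
windspeeds <- mutate(windspeeds, Date = as.Date(Date, format = "%d/%m/%Y"))

# Every row of dsidata3 picks up the wind speed recorded for its date
joined <- left_join(dsidata3, windspeeds, by = "Date")
```

Because the join key is now a Date rather than a string, zero-padding differences no longer matter, and the join does not depend on row order or on there being exactly four rows per date.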
【解决方案2】:

You can use the function rep with the argument each = 4 to repeat each wind speed entry 4 times, and then add the result to your data frame.

temp <- as.array(windspeeds["km/h"])
dsidata3["ws"]<- rep(temp, each = 4)

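【Comments】:

  • Thanks for your input! Unfortunately, I didn't get the result I wanted. > dsidata3["ws"] `[<-.data.frame`(`*tmp*`, "ws", value = list(`km/h` = c(2L, 1L, : provided 4 variables to replace 1 variables > head(dsidata3$ws) [1] 2 1 0 2 1 7
  • The function rep doesn't seem to have any effect on a tibble column. However, by first converting windspeeds["km/h"] to an array, it should work.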
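
One way to sidestep the single-bracket issue discussed above is to pull the column out as a plain vector with [[ ]] before calling rep. This is a hedged sketch, not the answerer's code: the data frames are small illustrative stand-ins, the variable ws is hypothetical, and the approach assumes exactly four rows per date with both data frames sorted in the same date order.

```r
# Illustrative stand-ins: 3 dates, 4 rows per date, same date order in both
dsidata3 <- data.frame(
  Date = rep(c("28/3/2018", "29/3/2018", "30/3/2018"), each = 4),
  stringsAsFactors = FALSE
)
windspeeds <- data.frame(
  Date   = c("28/03/2018", "29/03/2018", "30/03/2018"),
  `km/h` = c(2L, 1L, 0L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# [[ ]] returns the column as an integer vector; [ ] would return a one-column data frame
ws <- windspeeds[["km/h"]]

# Guard the assumption that every date has exactly 4 rows
stopifnot(nrow(dsidata3) == 4 * length(ws))

# Repeat each daily value 4 times and attach it as a new column
dsidata3$ws <- rep(ws, each = 4)
```

This relies purely on position, so a missing or extra row for any date would silently misalign the values; a join on a parsed date column avoids that dependency.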