【Question title】: Average of different subsets of variables with condition in R
【Posted】: 2021-04-19 23:59:21
【Question description】:

Sample data:

ID  month1  month2  month3  month4  month5  month6  month7  month8  month9  month10  b1  b2
--------------------------------------------------------------------------------------------
 1      12      14      15      45      12      12      11      12      78       28   3   9
 2      14      15      45      14      15      45      14      19      22       27   4   8
 3      14      13      25      74      25      45      14      19      22       27   5  10
 .
 .
70       …       …       …       …       …       …       …       …       …        …   1   8

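I want to take the average of the monthly values based on b1 (the month of interview 1) and b2 (the month of interview 2), so the average is computed row by row.

For example, ID = 1 was first interviewed in month 3 and interviewed again in month 9, so the average would be (month3 + month4 + month5 + month6 + month7 + month8 + month9)/7, i.e. (15 + 45 + 12 + 12 + 11 + 12 + 78)/7 = 185/7 ≈ 26.43.

For ID = 2, the average would be (month4 + month5 + month6 + month7 + month8)/5.

And so on.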
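
The two worked examples above can be checked by hand in R. This is only a small sketch: the vectors `id1` and `id2` are illustrative names holding the month values copied from the sample data, not objects from the question.

```r
# Month values for ID 1 and ID 2, copied from the sample data above
id1 <- c(12, 14, 15, 45, 12, 12, 11, 12, 78, 28)
id2 <- c(14, 15, 45, 14, 15, 45, 14, 19, 22, 27)

# Average over the window from b1 to b2 (inclusive) for each ID
avg1 <- mean(id1[3:9])  # b1 = 3, b2 = 9
avg2 <- mean(id2[4:8])  # b1 = 4, b2 = 8
print(c(avg1, avg2))
```

These are the values any row-wise solution should reproduce for the first two rows.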

I am working in RStudio, so I would prefer code written in R. Thanks in advance!

【Question discussion】:

  • Sample data: df <- data.frame(ID = c("1","2","3"), month1 = c("12","14","14"), month2 = c("14","15","13"), month3 = c("15","45","25"), month4 = c("45","14","74"), month5 = c("12","15","25"), month6 = c("12","45","45"), month7 = c("11","14","14"), month8 = c("12","19","19"), month9 = c("78","22","22"), month10 = c("28","27","27"), b1 = c("3","4","5"), b2 = c("9","8","10"))
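
Note that the data frame in the comment above stores every column as character strings, so arithmetic such as `b1 + 1` would fail on it. A minimal sketch of converting such columns to numeric first, using a hypothetical cut-down frame (`df_chr` and `df_num` are illustrative names):

```r
# Cut-down version of the comment's data: every column is a character vector
df_chr <- data.frame(
  ID     = c("1", "2"),
  month1 = c("12", "14"),
  b1     = c("3", "4"),
  stringsAsFactors = FALSE
)

# Convert every column to numeric while keeping the data frame structure
df_num <- df_chr
df_num[] <- lapply(df_chr, as.numeric)
str(df_num)
```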

Tags: r subset average


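【Solution 1】:

This solution works only as long as the order of the variables does not change. Note that in the output shown below, the values for rows 2 and 3 (24.9 and 32) equal the mean of month3 through month9, which is the range for row 1, rather than the per-row ranges described in the question (21.4 for ID = 2).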

library(dplyr)

df %>%
  rowwise() %>%
  mutate(avg = mean(c_across((b1+1):(b2+1)), na.rm =TRUE)) %>%
  select(-ID)

# Rowwise: 
  month1 month2 month3 month4 month5 month6 month7 month8 month9 month10    b1    b2   avg
   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <dbl> <dbl> <dbl>
1     12     14     15     45     12     12     11     12     78      28     3     9  26.4
2     14     15     45     14     15     45     14     19     22      27     4     8  24.9
3     14     13     25     74     25     45     14     19     22      27     5    10  32 

Sample data:

df <- tribble(
  ~ID, ~month1, ~month2, ~month3, ~month4, ~month5, ~month6, ~month7, ~month8, ~month9, ~month10, ~b1, ~b2,
    1,      12,      14,      15,      45,      12,      12,      11,      12,      78,       28,   3,   9,
    2,      14,      15,      45,      14,      15,      45,      14,      19,      22,       27,   4,   8,
    3,      14,      13,      25,      74,      25,      45,      14,      19,      22,       27,   5,  10,
)
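
The per-row window can also be taken by first collecting all month columns by name and then indexing the resulting vector with `b1:b2`. The following is a minimal sketch of that variant, not the answer author's code; it assumes the month columns are named month1 to month10 in order and that b1 and b2 are numeric. The name `res` is illustrative.

```r
library(dplyr)

df <- tribble(
  ~ID, ~month1, ~month2, ~month3, ~month4, ~month5, ~month6, ~month7, ~month8, ~month9, ~month10, ~b1, ~b2,
    1,      12,      14,      15,      45,      12,      12,      11,      12,      78,       28,   3,   9,
    2,      14,      15,      45,      14,      15,      45,      14,      19,      22,       27,   4,   8,
    3,      14,      13,      25,      74,      25,      45,      14,      19,      22,       27,   5,  10
)

res <- df %>%
  rowwise() %>%
  # c_across() returns the current row's month values as a vector;
  # b1:b2 is then evaluated with that row's own b1 and b2
  mutate(avg = mean(c_across(starts_with("month"))[b1:b2], na.rm = TRUE)) %>%
  ungroup()

print(res$avg)
```

For the sample data this should give roughly 26.43, 21.40 and 25.33, in line with the hand calculation in the question. Because the months are selected by name, the result does not depend on where the ID column sits.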

【Discussion】:

  • You can omit select(-ID)

【Solution 2】:

Base R option using mapply:

cols <- grep('month', names(df), value = TRUE)
df$result <- mapply(function(x, y, z) mean(unlist(df[x,cols[y:z]]),na.rm = TRUE),
                     seq(nrow(df)), df$b1, df$b2)
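
A loop-free alternative in base R is to blank out the months outside each row's window and then call `rowMeans`. This is a sketch under the assumption that the month columns appear in order and that b1 and b2 are numeric; `m`, `outside` and `result` are illustrative names.

```r
df <- data.frame(
  ID = 1:3,
  month1 = c(12, 14, 14), month2 = c(14, 15, 13), month3 = c(15, 45, 25),
  month4 = c(45, 14, 74), month5 = c(12, 15, 25), month6 = c(12, 45, 45),
  month7 = c(11, 14, 14), month8 = c(12, 19, 19), month9 = c(78, 22, 22),
  month10 = c(28, 27, 27), b1 = c(3, 4, 5), b2 = c(9, 8, 10)
)

cols <- grep("month", names(df), value = TRUE)
m <- as.matrix(df[cols])

# col(m) holds each cell's column index; b1 and b2 are recycled down the rows,
# so every cell is compared against its own row's window
outside <- col(m) < df$b1 | col(m) > df$b2
m[outside] <- NA

result <- rowMeans(m, na.rm = TRUE)
print(result)
```

The mask-based version avoids a per-row function call, which may matter for data with many rows.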

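【Discussion】:

【Solution 3】:

You can use apply row by row, subset each row vector, and compute the mean. After dropping the ID column, positions 11 and 12 of each row hold b1 and b2:

apply(df[-1], 1, function(x) mean(as.numeric(x[x[11]:x[12]])))
#[1] 26.42857 21.40000 25.33333

【Discussion】: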
