How to do rowSums on a selected set of columns whose names contain a string and a number in R? Answers

【Question title】: How to do rowsums on a select set of columns containing a string and a number in R?
【Posted】: 2021-11-16 18:44:07
【Question description】:

I have a list of column names that looks like this...

               colnames(dat)
1                    subject
2                     e.type
3                      group
4                     boxnum
5                      edate
6                  file.name
7                         fr
8                     active
9                   inactive
10                    reward
11   latency.to.first.active
12 latency.to.first.inactive
13                  act0.600
14               act600.1200
15              act1200.1800
16              act1800.2400
17              act2400.3000
18              act3000.3600
19                inact0.600
20             inact600.1200
21            inact1200.1800
22            inact1800.2400
23            inact2400.3000
24            inact3000.3600
25                  rew0.600
26               rew600.1200
27              rew1200.1800
28              rew1800.2400
29              rew2400.3000
30              rew3000.3600

I want to get the row sums over the columns named act#, inact#, and rew# (the reward intervals).

This works...

for (row in 1:nrow(dat)) {
  dat[row, "active"] = rowSums(dat[row, c(13:18)])
  dat[row, "inactive"] = rowSums(dat[row, c(19:24)])
  dat[row, "reward"] = rowSums(dat[row, c(25:30)])
}

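But I don't want to hard-code it, because the number of columns in each of these 3 sections may change. How can I do this without hard-coding the column indices?

Also, as an example, I tried searching for columns with "act" in the name, but that also picks up the "active" column.

【Question discussion】:

Does this answer your question? create a new column which is the sum of specific columns (selected by their names) in dplyr

Tags: r

【Solution 1】: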

sub_dat <- dat[, 13:30]
result <- sapply(split.default(sub_dat, substr(names(sub_dat), 1, 3)), rowSums)
dat[, c('active', 'inactive', 'reward')]  <- result
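
The 13:30 range can also be avoided altogether by picking the interval columns by name rather than by position. The following is only a minimal sketch, assuming the interval columns always start with act, inact, or rew followed directly by a digit. The toy data frame and the helper name sum_by_prefix are made up for illustration and are not part of the original answer:

```r
# Toy data standing in for dat (hypothetical values)
dat <- data.frame(
  subject = c("s1", "s2"),
  active = NA, inactive = NA, reward = NA,
  act0.600 = c(1, 2), act600.1200 = c(3, 4),
  inact0.600 = c(5, 6), inact600.1200 = c(7, 8),
  rew0.600 = c(9, 10), rew600.1200 = c(11, 12)
)

# Hypothetical helper: row sums over all columns whose name starts with
# `prefix` followed immediately by a digit
sum_by_prefix <- function(df, prefix) {
  cols <- grep(paste0("^", prefix, "[0-9]"), names(df))
  rowSums(df[, cols, drop = FALSE])
}

dat$active   <- sum_by_prefix(dat, "act")
dat$inactive <- sum_by_prefix(dat, "inact")
dat$reward   <- sum_by_prefix(dat, "rew")
```

Because the columns are found by name, adding or removing intervals in any of the three sections should not require changing the code, as long as the naming scheme stays the same.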

【Discussion】:

This is great! Since my number of columns will change, I changed the "dat[, 13:30]" line to "dat[, 13:length(dat)]".

【Solution 2】:

This is easy with select and matches from the tidyverse.

library(tidyverse)

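# "^" anchors each pattern at the start of the column name, so that
# "act" does not also match the "inact..." columns
dat %>%
    mutate(
        sum_act = rowSums(select(., matches("^act[0-9]"))),
        sum_inact = rowSums(select(., matches("^inact[0-9]"))),
        sum_rew = rowSums(select(., matches("^rew[0-9]")))
    )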
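
Because matches() treats its argument as a regular expression, a pattern like act[0-9] without a leading ^ also matches the inact columns, since "inact0.600" contains "act0". A small base R check with grep() illustrates the difference, using a made-up subset of the column names from the question:

```r
# Made-up subset of the column names from the question
cols <- c("active", "inactive", "act0.600", "act600.1200",
          "inact0.600", "inact600.1200")

# Unanchored: "act[0-9]" is also found inside "inact0.600"
unanchored <- grep("act[0-9]", cols, value = TRUE)

# Anchored: "^" requires the name to start with "act"
anchored <- grep("^act[0-9]", cols, value = TRUE)

print(unanchored)
print(anchored)
```

Neither pattern matches "active" or "inactive", because the digit requirement already excludes them. Only the anchor separates act from inact.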

【Discussion】:

【Solution 3】:

Here is an example:

t <- data.frame(c(1,2,3),c("a","b","c"))
colnames(t) <- c("num","char")

# append() combines the indices of the rows that satisfy either condition
whichRows <- append(which(t$char == "a"),which(t$char == "b"))
sum(t$num[whichRows])

Or, if I misunderstood you and you want a separate sum for each value:

sum(t$num[which(t$char == "a")])
sum(t$num[which(t$char == "b")])
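
The per-value sums above could also be computed in one call with tapply(), which avoids repeating the condition for each value. This is a minimal sketch on the same toy data, offered as an alternative rather than as part of the original answer. Note that it sums within groups of rows, which differs from the per-row sums across columns asked about in the question:

```r
# Same toy data as above
t <- data.frame(num = c(1, 2, 3), char = c("a", "b", "c"))

# Sum of num for each distinct value of char
sums <- tapply(t$num, t$char, sum)

sums[["a"]]
sums[["b"]]
```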

【Discussion】: