How to add columns to a dataframe based on indexes in R? (See example)

【Question title】: How to add columns to a dataframe based on indexes in R? (See example)
【Posted】: 2019-06-03 23:26:36
【Question】:

I'm using a self-made infix function that simply computes the percentage growth between observations in adjacent columns.

options(digits=3)

`%grow%` <- function(x,y) {
    (y-x) / x * 100
}

test <- data.frame(a=c(101,202,301), b=c(123,214,199), h=c(134, 217, 205))

Then I use lapply on my toy data frame to add two new columns.

test[,4:5] <- lapply(1:(ncol(test)-1), function(i) test[,i] %grow% test[,(i+1)])
test

#Output
    a   b   h     V4   V5
1 101 123 134  21.78 8.94
2 202 214 217   5.94 1.40
3 301 199 205 -33.89 3.02

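That is easy, given that I have only three columns and can write test[,4:5]. Now, more generally: if we have n columns, how can this be done using column indexes? I mean I want to create n-1 new columns for a given data frame, starting right after the last existing one. Something like: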

test[,(last_current_column+1):(last_column_created_using_function)]

Based on what I have read in other posts, and using my example, the first part, test[,(last_current_column+1):, can be written as:

test[,(ncol(test)+1):]

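But the second part is still missing, and I don't know how to write it.

I hope I have made myself clear. I would greatly appreciate any comments or suggestions.

Happy 2019 :)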

【Comments】:

Tags: r database dataframe functional-programming

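【Solution 1】:

You will always have ncol(test) - 1 new columns. Using this logic, there are several ways to do it.

One way is to construct a character vector of new column names with some prefix.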

test[paste0("new_col", seq_len(ncol(test) - 1))] <- lapply(1:(ncol(test)-1),
                    function(i) test[,i] %grow% test[,(i+1)])


test
#    a   b   h   new_col1 new_col2
#1 101 123 134  21.782178 8.943089
#2 202 214 217   5.940594 1.401869
#3 301 199 205 -33.887043 3.015075
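
The purely index-based form the question asks for can be sketched as follows. With n original columns, the n - 1 new columns occupy positions n + 1 through 2n - 1. The names `n` and `new_idx` are illustrative and do not come from the original post; this is a minimal sketch, assuming the data frame has at least two columns.

```r
`%grow%` <- function(x, y) {
  (y - x) / x * 100
}

test <- data.frame(a = c(101, 202, 301), b = c(123, 214, 199), h = c(134, 217, 205))

n <- ncol(test)                 # number of original columns (assumed >= 2)
new_idx <- n + seq_len(n - 1)   # same positions as (n + 1):(2 * n - 1)

# Growth from column i to column i + 1, for every adjacent pair of original columns
test[, new_idx] <- lapply(seq_len(n - 1), function(i) test[, i] %grow% test[, i + 1])
names(test)
```

Storing `n` before the assignment keeps the index arithmetic readable; the new columns get default names here, unlike the prefixed names above.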

Another option uses mapply and transform on two subsets of the data frame (all columns but the last, and all columns but the first):

transform(test,
   new_col = mapply(`%grow%`, test[1:(ncol(test)- 1)], test[2:ncol(test)]))


#    a   b   h  new_col.a new_col.b
#1 101 123 134  21.782178  8.943089
#2 202 214 217   5.940594  1.401869
#3 301 199 205 -33.887043  3.015075
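
Note that transform() returns a modified copy, so the result has to be assigned to be kept. A variant of the same pairing idea, sketched here with Map and with column names that record which two columns were compared, might look like this. The `_to_` naming scheme and the names `growth` and `result` are assumptions made for illustration.

```r
`%grow%` <- function(x, y) {
  (y - x) / x * 100
}

test <- data.frame(a = c(101, 202, 301), b = c(123, 214, 199), h = c(134, 217, 205))

n <- ncol(test)

# Pair every column except the last with its right-hand neighbour
growth <- Map(`%grow%`, test[-n], test[-1])

# Hypothetical naming scheme: "<left column>_to_<right column>"
names(growth) <- paste0(names(test)[-n], "_to_", names(test)[-1])

result <- cbind(test, as.data.frame(growth))
names(result)
```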

【Comments】:

【Solution 2】:

Another approach would be:

#options(digits=3)

`%grow%` <- function(x,y) {
  (y-x) / x * 100
}

test <- data.frame(a=c(101,202,301), 
                   b=c(123,214,199),
                   h=c(134, 217, 205),
                   d=c(156,234,235))
#     a   b   h   d
# 1 101 123 134 156
# 2 202 214 217 234
# 3 301 199 205 235


seqcols <- seq_along(test) # saved just to improve readability
test[,seqcols[-length(seqcols)] + max(seqcols)] <- lapply(seqcols[-length(seqcols)], 
                     function(i) test[,i] %grow% test[,(i+1)])
test
#     a   b   h   d     V5   V6    V7
# 1 101 123 134 156  21.78 8.94 16.42
# 2 202 214 217 234   5.94 1.40  7.83
# 3 301 199 205 235 -33.89 3.02 14.63
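
The index arithmetic on the left-hand side of that assignment may be easier to follow in isolation. The sketch below uses a plain integer vector in place of seq_along(test) for a four-column data frame; `left` and `new_pos` are illustrative names, not part of the answer.

```r
seqcols <- seq_len(4)               # what seq_along(test) yields for four columns

left <- seqcols[-length(seqcols)]   # every column position except the last
new_pos <- left + max(seqcols)      # shift by the column count to land past the end

left
new_pos
```

Here `left` holds the positions 1 to 3 and `new_pos` holds 5 to 7, which matches the V5, V6 and V7 columns in the output above.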

Similar to @Ronak Shah's second solution, just using map2_df from purrr:

cbind(test,
      new=purrr::map2_df(test[seqcols[-length(seqcols)]], test[seqcols[-1]], `%grow%`),
      deparse.level=1)
#     a   b   h   d  new.a new.b new.h
# 1 101 123 134 156  21.78  8.94 16.42
# 2 202 214 217 234   5.94  1.40  7.83
# 3 301 199 205 235 -33.89  3.02 14.63

【Comments】: