R: differing behaviour of cbind and bind_cols

【Question title】: R differing behaviour of cbind and bind_cols
【Posted】: 2022-07-07 19:18:02
【Question】:

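When combining a data frame and a vector with different numbers of rows/lengths, bind_cols throws an error, whereas cbind repeats rows - why is this?

(Is it really wise to have that as the default behaviour of cbind?)

Please see the example data below.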


library(tibble)  # provides tibble()

# Example data
x10 <- c(1:10)
y10 <- c(1:10)
xy10 <- tibble(x10, y10)

z10 <- c(1:10)
z20 <- c(1:20)

# Binding xy and z
xyz10 <- cbind(xy10, z10)
xyz10

# bind_cols() gives an error here
xyz20 <- dplyr::bind_cols(xy10, z20)

# But why is cbind repeating rows of xy10 to suit z20?
xyz20 <- cbind(xy10, z20)
xyz20

【Comments】:

From the documentation of bind_cols: "When column-binding, rows are matched by position, so all data frames must have the same number of rows. To match by value, not position ..."
But cbind will recycle as long as the vector's length is a multiple of the length of argument 1.
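
The recycling rule mentioned in this comment can be checked with base R alone. This is a minimal sketch that assumes a plain data.frame in place of the tibble from the question; the names df, ok and bad are illustrative only.

```r
# Illustrative data: a 10-row data frame (stand-in for xy10)
df <- data.frame(x = 1:10, y = 1:10)

# 20 is a multiple of 10, so the rows of df are recycled twice
ok <- cbind(df, z = 1:20)
nrow(ok)

# 15 is not a multiple of 10, so the call fails instead of recycling;
# the error message is captured as a string rather than stopping the script
bad <- tryCatch(cbind(df, z = 1:15), error = function(e) conditionMessage(e))
bad
```

So the data frame method only recycles silently when the lengths divide evenly; any other mismatch is reported as an error.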

Tags: r

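【Answer 1】:

base::cbind is a generic function. Its behaviour differs between matrices and data frames.

For matrices, it issues a warning if the objects have different numbers of rows. In the example below the result keeps the 10 rows of the matrix, and z20 is cut down to its first 10 elements.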

cbind(as.matrix(xy10), z20)
#      x10 y10 z20
# [1,]   1   1   1
# [2,]   2   2   2
# [3,]   3   3   3
# [4,]   4   4   4
# [5,]   5   5   5
# [6,]   6   6   6
# [7,]   7   7   7
# [8,]   8   8   8
# [9,]   9   9   9
#[10,]  10  10  10
#Warning message:
#In cbind(as.matrix(xy10), z20) :
#  number of rows of result is not a multiple of vector length (arg 2)
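
To inspect what the matrix method does with a vector of the wrong length, the warning can be captured instead of printed. This is a small base R sketch; m, res and msgs are illustrative names that do not appear in the original answer.

```r
m <- cbind(x = 1:10, y = 1:10)   # 10 x 2 integer matrix

msgs <- character(0)
res <- withCallingHandlers(
  cbind(m, z = 1:20),            # the vector is longer than nrow(m)
  warning = function(w) {
    # record the warning text, then carry on without printing it
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

dim(res)      # the matrix determines the number of rows
res[, "z"]    # only the first 10 elements of the vector are kept
msgs          # the captured warning text
```

Here the matrix fixes the row count and the vector is adapted to it, which is the opposite of what happens with a data frame.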

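But for data frames, it effectively builds a new data frame from scratch. So the following two calls are equivalent, and both give a data frame with 20 rows: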

cbind(xy10, z20)

## in this way, R's recycling rule steps in
## xy10 is a tibble, so [[ ]] is used to extract plain vectors
data.frame(x10 = xy10[[1]], y10 = xy10[[2]], z20 = z20)

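From ?cbind:

The cbind data frame method is just a wrapper for data.frame(..., check.names = FALSE). This means that it will split matrix columns in data frame arguments, and convert character columns to factors unless stringsAsFactors = FALSE is specified.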
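
The wrapper relationship described in the help page can be checked directly. The sketch below uses a plain data.frame and base R only; the names df, z, via_cbind and via_df are made up for the illustration.

```r
df <- data.frame(x = 1:10, y = 1:10)
z  <- 1:20

via_cbind <- cbind(df, z)
via_df    <- data.frame(df, z, check.names = FALSE)

# Both calls end up in data.frame(), so the results should match
all.equal(via_cbind, via_df)
dim(via_cbind)
```

Because the work is done by data.frame(), its recycling of shorter arguments is what produces the 20 rows.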
