How to change the type of the target column when doing := by group in a data.table in R? Answers

【Question title】: How to change type of target column when doing := by group in a data.table in R?
【Posted】: 2015-04-15 07:13:55
【Question】:

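I am trying to do := by group on an existing column of type 'integer', where the new values are of type 'double', and it fails.

My scenario is converting a column that represents time to POSIXct, based on values in other columns. I could modify the creation of the data.table as a workaround, but I am still interested in how to actually change the type of the column, as the error message suggests.

Here is a simple toy example of my problem: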

library(data.table)
db = data.table(id=rep(1:2, each=5), x=1:10, y=runif(10))
db
    id  x          y
 1:  1  1 0.47154470
 2:  1  2 0.03325867
 3:  1  3 0.56784494
 4:  1  4 0.47936031
 5:  1  5 0.96318208
 6:  2  6 0.83257416
 7:  2  7 0.10659533
 8:  2  8 0.23103810
 9:  2  9 0.02900567
10:  2 10 0.38346531

db[, x:=mean(y), by=id]   

Error in `[.data.table`(db, , `:=`(x, mean(y)), by = id) : 
Type of RHS ('double') must match LHS ('integer'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)

【Comments】:

Tags: r types data.table

【Solution 1】:

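Since the class of 'x' is 'integer', we can convert the 'x' column to 'numeric' before assigning 'mean(y)' to it. This is also useful when 'x' is to be replaced with the mean of any other numeric variable (including 'x' itself), because the original values of 'x' stay available until the grouped assignment runs.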

db[, x:= as.numeric(x)][, x:= mean(y), by=id][]
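The conversion step can be checked by looking at the column class before and after. This is only a minimal sketch: the seed is an arbitrary choice added for reproducibility and is not part of the original example.

```r
library(data.table)

set.seed(1)  # arbitrary seed (assumption), only so that reruns give the same y
db <- data.table(id = rep(1:2, each = 5), x = 1:10, y = runif(10))

class_before <- class(db$x)    # x starts out as an integer column

# Assigning a full-length vector without `by` replaces the whole column,
# so the column type is allowed to change here.
db[, x := as.numeric(x)]
class_after <- class(db$x)     # x is now a double column

# With x stored as double, the grouped assignment has matching types.
db[, x := mean(y), by = id]
print(c(before = class_before, after = class_after))
```

The type change is done in a separate, ungrouped := because that is the step where the whole column is swapped out; the grouped := afterwards only fills values into the existing double column.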

Or assign to a new column, drop the old one, and then rename the new column

setnames(db[, x1:= mean(y),by=id][,x:=NULL],'x1', 'x')

Or we can assign NULL to 'x' to remove it, and then create 'x' as the mean of 'y' (@David Arenburg's suggestion)

db[, x:=NULL][, x:= mean(y), by= id][]
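One side effect of removing and recreating the column is that 'x' ends up as the last column. If the original order matters, it can be restored with setcolorder. The sketch below uses a fixed y instead of runif (an assumption, so that the group means are predictable).

```r
library(data.table)

# Deterministic y (assumption) so the group means are easy to check by hand
db <- data.table(id = rep(1:2, each = 5), x = 1:10, y = (1:10) / 10)

# Drop the integer column, then recreate it as a double column by group
db[, x := NULL][, x := mean(y), by = id]
order_after_recreate <- names(db)   # x has moved to the end

# Put the columns back in their original order, by reference
setcolorder(db, c("id", "x", "y"))
print(db)
```

Like :=, setcolorder modifies the table by reference, so no copy of db is made.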

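【Discussion】:

Love the quick response! I totally forgot about changing the type first... Regarding the second suggestion, wouldn't it be neater to use 'x1' in setnames? I.e. setnames(db[, x1:= mean(y),by=id][,x:=NULL],'x1', 'x')
Haha, I will - just have to wait a few more minutes ;-)
@DavidArenburg Thanks, that makes sense. I converted x to numeric in case the OP wants to change x to the mean of that variable itself. I didn't think of using db[,x:= NULL][, x:= mean(y), by =id].
@hallvig Yes, it is neater. I updated the post.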