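【Posted】: 2015-11-12 13:27:52
【Question】:
I want to load and process a CSV file with seven variables: one is a grouping variable/factor (data$hashtag), and six are categories (data$support and the others), each marked with "X" or "x" (or left blank).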
data <- read.csv("maet_coded_tweets.csv", stringsAsFactors = FALSE)
names(data) <- c("hashtag", "support", "contributeConversation", "otherCommunities", "buildCommunity", "engageConversation", "unclear")
str(data)
'data.frame': 854 obs. of 7 variables:
$ hashtag : chr "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" ...
$ support : chr "x" "x" "x" "x" ...
$ contributeConversation: chr "" "" "" "" ...
$ otherCommunities : chr "" "" "" "" ...
$ buildCommunity : chr "" "" "" "" ...
$ engageConversation : chr "" "" "" "" ...
$ unclear : chr "" "" "" "" ...
When I use a function to recode "X" or "x" as 1 and "" (blank) as 0, the columns come out as character type rather than the numeric type I expected.
recode <- function(x) {
x[x=="x"] <- 1
x[x=="X"] <- 1
x[x==""] <- 0
x
}
data[] <- lapply(data, recode)
str(data)
'data.frame': 854 obs. of 7 variables:
$ hashtag : chr "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" ...
$ support : chr "1" "1" "1" "1" ...
$ contributeConversation: chr "0" "0" "0" "0" ...
$ otherCommunities : chr "0" "0" "0" "0" ...
$ buildCommunity : chr "0" "0" "0" "0" ...
$ engageConversation : chr "0" "0" "0" "0" ...
$ unclear : chr "0" "0" "0" "0" ...
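When I try to coerce the values with as.numeric() inside the function, it still does not work. What gives: why are the variables treated as character, and how can I convert them from character to numeric?
【Comments】: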
-
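A vector can hold only one data type. So if you replace some elements of a character vector with numeric values, those values are coerced to character. How exactly did you use as.numeric() in the function? -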
recode <- function(x) { x[x=="x"] <- as.numeric(1); x[x=="X"] <- as.numeric(1); x[x==""] <- as.numeric(0); x } -
You could try return(as.numeric(x)). As I said in my earlier comment, the way you are doing it still forces a conversion to character. Or you could do res <- ifelse(x %in% c("x","X"),1,0)
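To make the comments above concrete, here is a minimal sketch of the coercion and one possible fix. The small data frame is a made-up stand-in for maet_coded_tweets.csv (which is not available here), with only two of the six category columns, and code_cols is a name introduced just for this illustration.

```r
# Hypothetical stand-in for the real CSV, which is not available here.
data <- data.frame(
  hashtag = c("#capstoneisfun", "#capstoneisfun", "#other"),
  support = c("x", "X", ""),
  unclear = c("", "x", ""),
  stringsAsFactors = FALSE
)

# Assigning a number into a character vector coerces the number to a string,
# which is why the original recode() returned character columns.
v <- c("x", "")
v[v == "x"] <- 1
class(v)

# Build a new numeric vector instead of assigning into the character one:
# TRUE/FALSE from %in% becomes 1/0 under as.numeric().
recode <- function(x) as.numeric(x %in% c("x", "X"))

# Recode only the category columns so that hashtag stays character.
code_cols <- setdiff(names(data), "hashtag")
data[code_cols] <- lapply(data[code_cols], recode)
str(data)
```

Restricting the recode to the category columns is a choice made for this sketch: applying the function to every column, as in data[] <- lapply(data, recode), would also turn hashtag into a column of zeros.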
Tags: r