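[Posted]: 2015-08-31 23:19:56
[Question]:
I am trying to create a function that generates a new variable based on conditional values. I have a survey dataset with more than 100 columns, which will be collapsed accordingly. I read this, but it did not help.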
'data.frame': 117 obs. of 7 variables:
$ fin_partner: Factor w/ 4 levels "","9","No","Yes": 2 2 4 3 2 2 2 2 4 4 ...
$ fin_parent : Factor w/ 4 levels "","9","No","Yes": 2 2 2 2 2 2 4 3 2 2 ...
$ fin_kids : Factor w/ 4 levels "","9","No","Yes": 4 2 2 2 2 2 2 2 2 2 ...
$ fin_othkids: Factor w/ 4 levels "","9","No","Yes": 2 2 2 2 2 2 3 2 2 2 ...
$ fin_fam : Factor w/ 4 levels "","9","No","Yes": 2 2 2 2 2 2 4 3 2 2 ...
$ fin_friend : Factor w/ 4 levels "","9","No","Yes": 2 2 3 3 2 2 2 2 4 2 ...
$ fin_oth : Factor w/ 4 levels "","9","No","Yes": 2 2 2 2 2 2 2 2 4 2 ...
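I want to be able to subset the dataset by column and then pass the subset to the function. Currently, the values include "Yes", "No", and "999" (meaning missing).
My goal is to state, for each row, whether any column contains "Yes"; if so, the new column is populated with "Yes". I am sure there is a simpler way than the code below, so I am open to suggestions.
My current code: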
trial <- df[, 23:29]
trial.test <- as.data.frame(trial)
composite_score <- function(x){
  # Convert to numeric values
  change_to_number <- function(j) {
    for (i in 1:length(j)){
      if(i == "Yes"){
        i <- 1
      }
      else{
        i <- 0
      }
    }
  }
  x <- change_to_number(x)
  new_col_var <- function(k){
    if(rowSums(k) > 0){
      k$newvar <- 1
    }
    else {
      k$newvar <- 0
    }
  }
  x <- new_col_var(x)
}
composite_score(trial.test)
The code produces the following error:
Error in rowSums(k) : 'x' must be an array of at least two dimensions
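One likely reading of this error, offered as a sketch rather than a confirmed diagnosis: in R a `for` loop evaluates to `NULL`, so a function whose body is only a loop returns `NULL`, and `rowSums(NULL)` then fails with the message above. The helper name `loop_only` is hypothetical and only mirrors the shape of `change_to_number()`.

```r
# Hypothetical helper with the same shape as change_to_number():
# the body is only a for loop, so the function returns NULL.
loop_only <- function(j) {
  for (i in seq_along(j)) {
    i <- 0  # reassigns the loop counter; j itself is never modified
  }
}

result <- loop_only(c("Yes", "No"))
print(is.null(result))

# rowSums() needs a matrix or data frame, so passing NULL raises an error.
err <- tryCatch(rowSums(result), error = function(e) conditionMessage(e))
print(err)
```

Note also that inside the loop `i` is an index, not an element of `j`, so the comparison `i == "Yes"` never looks at the data.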
Data:
> dput(head(trial.test))
structure(list(fin_partner = structure(c(2L, 2L, 4L, 3L, 2L,
2L), .Label = c("", "9", "No", "Yes"), class = "factor"), fin_parent = structure(c(2L,
2L, 2L, 2L, 2L, 2L), .Label = c("", "9", "No", "Yes"), class = "factor"),
fin_kids = structure(c(4L, 2L, 2L, 2L, 2L, 2L), .Label = c("",
"9", "No", "Yes"), class = "factor"), fin_othkids = structure(c(2L,
2L, 2L, 2L, 2L, 2L), .Label = c("", "9", "No", "Yes"), class = "factor"),
fin_fam = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("",
"9", "No", "Yes"), class = "factor"), fin_friend = structure(c(2L,
2L, 3L, 3L, 2L, 2L), .Label = c("", "9", "No", "Yes"), class = "factor"),
fin_oth = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("",
"9", "No", "Yes"), class = "factor")), .Names = c("fin_partner",
"fin_parent", "fin_kids", "fin_othkids", "fin_fam", "fin_friend",
"fin_oth"), row.names = c(NA, 6L), class = "data.frame")
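With data of this shape, the row-wise "any Yes" flag can be sketched without loops. This is a minimal example, not the poster's method: it assumes a reduced stand-in for `trial.test` built from three of the posted columns (first six rows), and it uses the column name `newvar` from the question.

```r
# Reduced stand-in for trial.test: three of the posted columns,
# first six rows, same factor levels as in the dput() output.
lv <- c("", "9", "No", "Yes")
trial.test <- data.frame(
  fin_partner = factor(c("9", "9", "Yes", "No", "9", "9"), levels = lv),
  fin_kids    = factor(c("Yes", "9", "9", "9", "9", "9"), levels = lv),
  fin_friend  = factor(c("9", "9", "No", "No", "9", "9"), levels = lv)
)

# Comparing the data frame with "Yes" gives a logical matrix;
# rowSums() then counts the TRUE values in each row.
yes_count <- rowSums(trial.test == "Yes")

# Flag rows that contain at least one "Yes".
trial.test$newvar <- ifelse(yes_count > 0, "Yes", "No")
print(trial.test)
```

The count is computed before `newvar` is added, so the new column does not take part in the comparison.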
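[Comments]:
- Try rowSums(1:5) versus rowSums(matrix(1:5)). Also, what do you expect rowSums(k) > 0 to do? You will get multiple TRUE/FALSE values, not just one.
- Can you add some sample data for people to work with?
- @rawr I want rowSums to count the flags, and if the sum is not 0, the new column will be 1.
- Something like apply(df, MARGIN=1, FUN=function(row) ifelse(any("Yes" %in% row), "Yes", "No")) should work. If you want working answers, please provide data! For example, post the output of dput(head(trial.test)).
- This is great, @antoine-sac. Thank you for your help. @gung I added the dput in an edit.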
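The two suggestions in the comments can be tried in a small sketch. The data frame `toy` and its columns `a` and `b` are hypothetical stand-ins for the survey columns, and the row-wise function is one possible reading of the suggestion rather than the commenter's exact code.

```r
# rowSums() rejects a plain vector but accepts a matrix.
vec_result <- tryCatch(rowSums(1:5), error = function(e) "error")
mat_result <- rowSums(matrix(1:5))  # 5 x 1 matrix, so one sum per row

# Hypothetical toy data standing in for the survey columns.
toy <- data.frame(
  a = c("Yes", "No", "9"),
  b = c("No", "No", "Yes"),
  stringsAsFactors = TRUE
)

# apply() with MARGIN = 1 converts the data frame to a character matrix
# and passes each row to the function as a character vector.
flag <- apply(toy, MARGIN = 1, FUN = function(row) {
  if (any(row == "Yes")) "Yes" else "No"
})
print(flag)
```

Because the condition is evaluated once per row, a plain `if`/`else` yields a single value per row, which avoids the multiple TRUE/FALSE problem raised in the first comment.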
Tags: r