【Question Title】:Assign value to a column from another column based on condition
【Posted】:2014-07-25 08:11:45
【Question】:

Suppose I have a vector like this:

> desired <- c("10001", "10004")

And a sample data frame like this:

> desired_sample_df <- data.frame(geo = rep("other", 30), zip = c(rep(10001:10010, 2), 10011:10020), cbsa = c(rep("NY", 20), rep("CA", 10)))
> desired_sample_df
     geo   zip cbsa
1  other 10001   NY
2  other 10002   NY
3  other 10003   NY
4  other 10004   NY
5  other 10005   NY
6  other 10006   NY
7  other 10007   NY
8  other 10008   NY
9  other 10009   NY
10 other 10010   NY
11 other 10001   NY
12 other 10002   NY
13 other 10003   NY
14 other 10004   NY
15 other 10005   NY
16 other 10006   NY
17 other 10007   NY
18 other 10008   NY
19 other 10009   NY
20 other 10010   NY
21 other 10011   CA
22 other 10012   CA
23 other 10013   CA
24 other 10014   CA
25 other 10015   CA
26 other 10016   CA
27 other 10017   CA
28 other 10018   CA
29 other 10019   CA
30 other 10020   CA

I want to overwrite the geo column with the value from zip, but only in rows where the zip value appears in the desired vector defined at the start.


This is what I tried:

> desired_sample_df$geo[desired_sample_df$zip %in% desired] <- desired_sample_df$zip[which(desired_sample_df$zip %in% desired)]
Warning message:
In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA,  :
  invalid factor level, NA generated


> desired_sample_df$geo[desired_sample_df$zip %in% desired] <- desired_sample_df$zip
Warning messages:
1: In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA,  :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA,  :
  number of items to replace is not a multiple of replacement length

【Question Comments】:

    Tags: r


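    【Solution 1】:

    One problem is that data.frame() converts strings to factors by default (in R versions before 4.0.0), so geo only accepts its existing level "other". Try this: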

    desired <- c("10001", "10004")
    df <- data.frame(geo = rep("other", 30), zip = c(rep(10001:10010, 2), 10011:10020), cbsa = c(rep("NY", 20), rep("CA", 10)), stringsAsFactors=FALSE)
    
    idx <- df$zip %in% desired
    

    Now you can change just the elements you want:

    df[idx, ]$geo <- df[idx, ]$zip
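
If the data frame already exists with geo stored as a factor, rebuilding it is not the only option. The following is a minimal sketch, assuming it is acceptable to convert geo to character in place; the `stringsAsFactors = TRUE` argument is only there to reproduce the situation from the question on any R version.

```r
desired <- c("10001", "10004")

# Rebuild the sample data with geo as a factor, as in the question
df <- data.frame(
  geo  = rep("other", 30),
  zip  = c(rep(10001:10010, 2), 10011:10020),
  cbsa = c(rep("NY", 20), rep("CA", 10)),
  stringsAsFactors = TRUE
)

# Convert the factor to character so new values are not limited to existing levels
df$geo <- as.character(df$geo)

# Logical index of the rows whose zip is in desired
idx <- df$zip %in% desired

# Both sides are subset with idx, so the lengths match;
# the integer zips are coerced to character on assignment
df$geo[idx] <- df$zip[idx]
```

Subsetting both sides with the same index also avoids the "number of items to replace is not a multiple of replacement length" warning from the second attempt in the question, where all 30 zip values were assigned into only 4 positions.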
    

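    【Comments】:

    • I added the stringsAsFactors=FALSE part because that is what was giving you the error.
    • Ah, that makes sense. I wasn't sure what was causing the error. I don't know how to rank the answers, but for brevity I gave it to @jhoward...
    • You can only accept one answer; sadly there is no silver medal ;-)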
    【Solution 2】:

    Like this?

    df$geo <- ifelse(df$zip %in% desired, df$zip, df$geo)
    

    I renamed your desired_sample_df to just df.
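
The one-liner above relies on geo being a character column. `ifelse()` strips attributes, so with a factor geo the result would likely contain the factor's integer codes rather than its labels. A hedged variant, assuming you want a character result whether or not geo is a factor, converts both branches explicitly:

```r
desired <- c("10001", "10004")

# Sample data with geo deliberately stored as a factor
df <- data.frame(
  geo  = rep("other", 30),
  zip  = c(rep(10001:10010, 2), 10011:10020),
  cbsa = c(rep("NY", 20), rep("CA", 10)),
  stringsAsFactors = TRUE
)

# Convert both branches to character so ifelse() returns labels, not factor codes
df$geo <- ifelse(df$zip %in% desired,
                 as.character(df$zip),
                 as.character(df$geo))
```

The explicit `as.character()` calls are harmless when geo is already character, so the same line works for either column type.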

    【Comments】:
