【Question Title】: Mutate in R according to the row index?
【Posted】: 2021-08-15 16:38:11
【Question】:

So I merged two data frames in R:

The first one:

Ministry    Meeting    Perso_or_Public
Ministry A  Meeting 1  Personnal
Ministry A  Meeting 2  Public
Ministry A  Meeting 3  Public
Ministry B  Meeting 1  Personnal
Ministry B  Meeting 2  Personnal
NA          Meeting 2  Personnal

The second one:

Ministry    Meeting    Guest      Minister_Gender
Ministry A  Meeting 1  Alexander  MAN
Ministry A  Meeting 2  Jane       MAN
Ministry A  Meeting 3  Antonio    MAN
Ministry B  Meeting 1  Jessica    WOMAN
Ministry B  Meeting 2  Camilla    WOMAN
NA          Meeting 2  NA         NA

Output:

dfA <- merge(df1, df2, by=c("Ministry","Meeting"), all.x=TRUE)
Ministry    Meeting    Perso_or_Public  Guest      Minister_Gender
Ministry A  Meeting 1  Personnal        Alexander  NA
Ministry A  Meeting 2  Public           Jane       NA
Ministry A  Meeting 3  Public           Antonio    NA
Ministry B  Meeting 1  Personnal        Jessica    WOMAN
Ministry B  Meeting 2  Personnal        Camilla    WOMAN
NA          Meeting 2  Personnal        NA         NA

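As you can see, there is a problem with the minister's gender for "Ministry A", and I really don't understand why, since there is no typo or any other issue (I checked everything -> no extra spaces, etc.). I tried the following: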

dfA <- dfA %>% mutate(Minister_Gender=ifelse(Ministry=='Ministry A', "MAN", Minister_Gender))
#doesn't work 

dfA$Minister_Gender <- dfA$Minister_Gender[1:3] <- "MAN"
#tried to mutate by row index
#writes MAN in all the Minister_Gender column

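I can't use a mutate command with is.na(), because the NAs in the Minister_Gender column also concern other ministries. So I was wondering whether any of you know how to mutate according to the row number, in a better way than what I tried, or any other approach that works.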
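For reference, the second attempt above overwrites the whole column because R evaluates chained assignments from right to left: the inner indexed assignment returns "MAN", and the outer assignment then recycles that single value over every row. The sketch below contrasts the two forms on a toy data frame; the values are copied from the output shown in the question, and the names `chained` and `fixed` are only illustrative.

```r
# Toy stand-in for the merged result; values copied from the question's output.
dfA <- data.frame(
  Ministry = c("Ministry A", "Ministry A", "Ministry A",
               "Ministry B", "Ministry B", NA),
  Minister_Gender = c(NA, NA, NA, "WOMAN", "WOMAN", NA),
  stringsAsFactors = FALSE
)

# Chained form from the question: the inner assignment returns "MAN",
# which the outer assignment then recycles over the entire column.
chained <- dfA
chained$Minister_Gender <- chained$Minister_Gender[1:3] <- "MAN"

# Single indexed assignment: only rows 1 to 3 change.
fixed <- dfA
fixed$Minister_Gender[1:3] <- "MAN"
print(fixed)
```

Assigning by row position does patch the column, but it does not explain why the merge returned NA for those rows in the first place.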

Update

dfB <- subset(dfA, Ministry=="Ministry A")
#0 obs
dfC <- subset(df1, Ministry=="Ministry A")
#0 obs

【Comments】:

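  • Could you share a sample of your data rather than just a screenshot/table? stackoverflow.com/help/minimal-reproducible-example
  • Please share dput(dfA[1:6, ]) and dput(df1[1:6, ]) (or whatever your input data frames are named). Although you checked for extra spaces, issues of that kind are the only thing that makes sense to me given the results you are seeing.
  • I ran dput and, indeed, the names displayed in the dataset are not the "real" ones!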
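The check suggested in the comments can be sketched as follows. The key values here are hypothetical, since the real data was never posted: one has a trailing space and one contains a non-breaking space (U+00A0), both of which print like "Ministry A" yet fail an equality test. That would also be consistent with subset() returning 0 observations.

```r
# Hypothetical key values (assumed for illustration, not the asker's data):
# a trailing space, a non-breaking space (U+00A0), and a clean value.
keys <- c("Ministry A ", "Ministry\u00a0A", "Ministry A")

# Only the clean value compares equal, although all three look alike when printed.
matches <- keys == "Ministry A"
print(matches)

# dput() quotes the string, which makes a trailing space visible.
dput(keys[1])

# Character counts differ when extra characters are present.
print(nchar(keys))

# Code points reveal characters that merely look like a space (160 instead of 32).
code_points <- lapply(keys, utf8ToInt)
has_nbsp <- vapply(code_points, function(cp) 160L %in% cp, logical(1))
print(has_nbsp)
```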

Tags: r dataframe merge na dplyr


【Solution 1】:

You can use left_join from the dplyr package instead:

df1 <- read.table(header=TRUE, sep=",", text="
Ministry, Meeting, Perso_or_Public
Ministry A, Meeting 1, Personnal
Ministry A, Meeting 2, Public
Ministry A, Meeting 3, Public
Ministry B, Meeting 1, Personnal
Ministry B, Meeting 2, Personnal
NA, Meeting 2, Personnal")

df2 <- read.table(header=TRUE, sep=",", text="
Ministry, Meeting, Guest, Minister_Gender
Ministry A, Meeting 1, Alexander, MAN
Ministry A, Meeting 2, Jane, MAN
Ministry A, Meeting 3, Antonio, MAN
Ministry B, Meeting 1, Jessica, WOMAN
Ministry B, Meeting 2, Camilla, WOMAN
NA, Meeting 2, NA, NA")

library(dplyr)
left_join(df1, df2, by=c("Ministry","Meeting"))

# Ministry    Meeting Perso_or_Public      Guest Minister_Gender
# 1 Ministry A  Meeting 1       Personnal  Alexander             MAN
# 2 Ministry A  Meeting 2          Public       Jane             MAN
# 3 Ministry A  Meeting 3          Public    Antonio             MAN
# 4 Ministry B  Meeting 1       Personnal    Jessica           WOMAN
# 5 Ministry B  Meeting 2       Personnal    Camilla           WOMAN
# 6       <NA>  Meeting 2       Personnal         NA              NA

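By the way, the result is the same as that of merge(df1, df2, by=c("Ministry","Meeting"), all.x=TRUE), so there seems to be a problem with your data. Please provide your data in a reproducible way, either as inline text that can be read with read.table or (even simpler) with dput.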
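If the keys do turn out to contain stray whitespace, one possible remedy is to normalise the key columns in both data frames before merging. This is a minimal sketch under that assumption, not part of the original answer: `clean_key` is a hypothetical helper, and the two small data frames are invented so that their keys deliberately fail to match.

```r
# Hypothetical helper (not from the original answer): normalise whitespace in a join key.
clean_key <- function(x) {
  x <- gsub("\u00a0", " ", x, fixed = TRUE)  # non-breaking space -> ordinary space
  x <- gsub("[[:space:]]+", " ", x)          # collapse runs of whitespace
  trimws(x)                                  # drop leading and trailing whitespace
}

# Invented data: the keys in df1 look right when printed but carry hidden characters.
df1 <- data.frame(
  Ministry = c("Ministry\u00a0A", "Ministry B "),
  Meeting = c("Meeting 1", "Meeting 1"),
  Perso_or_Public = c("Personnal", "Personnal"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Ministry = c("Ministry A", "Ministry B"),
  Meeting = c("Meeting 1", "Meeting 1"),
  Minister_Gender = c("MAN", "WOMAN"),
  stringsAsFactors = FALSE
)

# Without cleaning, no key matches, so Minister_Gender comes back as NA.
before <- merge(df1, df2, by = c("Ministry", "Meeting"), all.x = TRUE)

# Clean the key columns on both sides, then merge again.
for (col in c("Ministry", "Meeting")) {
  df1[[col]] <- clean_key(df1[[col]])
  df2[[col]] <- clean_key(df2[[col]])
}
after <- merge(df1, df2, by = c("Ministry", "Meeting"), all.x = TRUE)
print(after)
```

Cleaning the keys addresses the cause of the mismatch, whereas patching rows by position would have to be redone whenever the data changes.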
