[Posted]: 2019-07-30 13:31:57
[Question]:
I have a data frame that looks like this.
> dput(head(wp_data_ensembl))
structure(list(wpid = c("WP3633", "WP3633", "WP3633", "WP694",
"WP694", "WP694"), gene = c("ENSG00000156006", "ENSG00000156006",
"ENSG00000156006", "ENSG00000156006", "ENSG00000156006", "ENSG00000156006"
), wpid = c("WP702", "WP694", "WP3633", "WP702", "WP694", "WP3633"
), name = c("Metapathway biotransformation Phase I and II", "Arylamine metabolism",
"Caffeine and Theobromine metabolism", "Metapathway biotransformation Phase I and II",
"Arylamine metabolism", "Caffeine and Theobromine metabolism"
)), row.names = c(NA, 6L), class = "data.frame")
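The data frame contains two columns that are both named wpid. I want to keep only the rows in which both of these columns hold the same string.
Take the following rows as an example.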
    wpid            gene   wpid                                         name
1 WP3633 ENSG00000156006  WP702 Metapathway biotransformation Phase I and II
2 WP3633 ENSG00000156006  WP694                         Arylamine metabolism
3 WP3633 ENSG00000156006 WP3633          Caffeine and Theobromine metabolism
Only row 3 should remain in the new data frame.
Any help is welcome.
[Comments]:
- Using the same name for more than one column is not recommended.
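
One possible approach, sketched here in base R and not taken from the original thread: because `$wpid` and `[["wpid"]]` only reach the first column with that name, the two columns can be compared by position. The positions 1 and 3 and the variable names `same_id` and `matched` are assumptions based on the `dput()` output in the question.

```r
# Rebuild the six example rows from the question's dput() output.
wp_data_ensembl <- structure(
  list(
    wpid = c("WP3633", "WP3633", "WP3633", "WP694", "WP694", "WP694"),
    gene = rep("ENSG00000156006", 6),
    wpid = c("WP702", "WP694", "WP3633", "WP702", "WP694", "WP3633"),
    name = c("Metapathway biotransformation Phase I and II",
             "Arylamine metabolism",
             "Caffeine and Theobromine metabolism",
             "Metapathway biotransformation Phase I and II",
             "Arylamine metabolism",
             "Caffeine and Theobromine metabolism")
  ),
  row.names = c(NA, 6L), class = "data.frame"
)

# Name-based access only returns the first "wpid" column, so compare the
# two columns by position instead (assumed here to be columns 1 and 3).
same_id <- wp_data_ensembl[[1]] == wp_data_ensembl[[3]]

# Keep only the rows where both identifiers agree.
matched <- wp_data_ensembl[same_id, ]

# Give the columns unique names so later code can address them by name.
names(matched) <- make.unique(names(matched), sep = "_")
matched
```

On the six example rows this keeps rows 3 and 5; within the three rows shown in the question, only row 3 survives. Renaming the columns before filtering would work equally well and also addresses the comment about duplicate names.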