【Question title】: R filter in dataframe by columns
【Posted】: 2011-08-28 03:19:03
【Question description】:

I have a question about the following data frame:

genes <- matrix(c("chr1","chr2","chr2","chr2","chr2","chr2",
              "uc001upw.2","uc001upw.2","uc001upw.2","uc001upx.1","uc001upy.1","uc001upz.1",
              "188001308","188001308","188001308","188037202","188037202","188037202",
              "188021266","188021266","188021266","188086618","188127464","188127464",
              "-","-","-","-","-","-",
              "CARCRL","CALCRL","CALCRL","TFPI","TFPI","TFPI", 
              "uc001upx.1","uc00upy.1","uc001upz.1","uc001upw.2","uc001upw.2","uc001upw.2",
              "188037202","188037202","188037202","188001308","188001308","188001308",
              "188086618","188127464","188127464","188021266","188021266","188021266",
              "-","-","-","-","-","-",
              "TFPI","TFPI","TFPI","CALCRL","CALCRL","CALCRL",
              "35894","35894","35894","35894","35894","35894"), nrow=6)

colnames(genes)<- c("chr","names.x","start.x","stop.x","strand.x","alias.x","name.y","start.y","stop.y","strand.y", "alias.y", "distance_startsite")
genes<-as.data.frame(genes)

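In the data frame, you can see that the first three rows have unique combinations of names.x and name.y. Rows 4, 5 and 6 are not unique: they are the same pairs, just shown the other way around. My question is: is there a way to filter these out?

Thanks! Samantha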

【Question comments】:

  • Please generalize this question so that it serves a larger population than n = 1, where n = you.

Tags: r filter dataframe subset


【Solution 1】:

I'm sure this isn't the prettiest way to do it, but it gets the job done:

genes[!duplicated(t(apply(genes[,c('names.x','name.y')],1,sort))),]
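
The one-liner above can be unpacked into three steps. The following is a minimal sketch that uses only the two ID columns from the question, copied as posted; the names `pairs`, `sorted_pairs`, `is_dup`, `kept`, `key` and `kept_alt` are illustrative and not part of the original answer. The `pmin()`/`pmax()` variant at the end is an alternative way to build an order-independent key, not the answerer's method, and it assumes the columns are character vectors rather than factors.

```r
# The two ID columns from the question, copied as posted.
# stringsAsFactors = FALSE keeps them as character vectors.
pairs <- data.frame(
  names.x = c("uc001upw.2", "uc001upw.2", "uc001upw.2",
              "uc001upx.1", "uc001upy.1", "uc001upz.1"),
  name.y  = c("uc001upx.1", "uc00upy.1", "uc001upz.1",
              "uc001upw.2", "uc001upw.2", "uc001upw.2"),
  stringsAsFactors = FALSE
)

# Step 1: sort the two IDs within each row, so (a, b) and (b, a) become equal.
# apply() returns one column per input row, hence the t().
sorted_pairs <- t(apply(pairs, 1, sort))

# Step 2: duplicated() on a matrix compares whole rows and flags repeats.
is_dup <- duplicated(sorted_pairs)

# Step 3: keep only the first occurrence of each unordered pair.
kept <- pairs[!is_dup, ]

# Alternative key (assumes character columns): element-wise min and max
# give the same order-independent pair without a row-wise apply().
key <- paste(pmin(pairs$names.x, pairs$name.y),
             pmax(pairs$names.x, pairs$name.y))
kept_alt <- pairs[!duplicated(key), ]
```

On a full data frame such as `genes`, the same logical vector can be used as a row index, for example `genes[!is_dup, ]`, as long as it was computed from that data frame's own ID columns.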

【Comments】:

  • Thanks for your answer, but when I run the code I get a data frame with 4 rows. Rows 2 and 4 are the same:
  • chr names.x start.x stop.x strand.x alias.x name.y start.y 1 chr1 uc001upw.2 188001308 188021266 - CARCRL uc001upx.1 188037202 2 chr2 uc001upw.2 188001308 188 - stop.y strand.y alias.y distance_startsite 1 188037202 188037202 188037202 TFPI 35894 3 188127464 - TFPI 35894 5 188021266 - CALCRL 35894
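
A possible explanation for the 4 rows, going only by the data as posted in the question: row 2 has name.y "uc00upy.1", while row 5 has names.x "uc001upy.1". The two IDs differ by one character, so rows 2 and 5 are not actually the same pair and both survive the filter. The sketch below checks this. Treating "uc00upy.1" as a typo for "uc001upy.1" is an assumption, and `x`, `y`, `y_fixed`, `fixed` and `res` are illustrative names.

```r
# ID columns copied from the question as posted.
x <- c("uc001upw.2", "uc001upw.2", "uc001upw.2",
       "uc001upx.1", "uc001upy.1", "uc001upz.1")   # names.x
y <- c("uc001upx.1", "uc00upy.1", "uc001upz.1",
       "uc001upw.2", "uc001upw.2", "uc001upw.2")   # name.y

# Rows 2 and 5 look like mirror images, but the IDs differ by one character.
y[2] == x[5]           # FALSE
nchar(c(y[2], x[5]))   # 9 10

# IDs that occur in only one of the two columns point at the mismatch.
setdiff(union(x, y), intersect(x, y))   # "uc001upy.1" "uc00upy.1"

# Assumption: "uc00upy.1" was meant to be "uc001upy.1".
y_fixed <- y
y_fixed[2] <- "uc001upy.1"
fixed <- data.frame(names.x = x, name.y = y_fixed, stringsAsFactors = FALSE)

# With that correction, the same filter leaves the 3 expected rows.
res <- fixed[!duplicated(t(apply(fixed, 1, sort))), ]
nrow(res)   # 3
```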