[Posted]: 2016-09-15 21:06:20
[Question]:
I am trying to filter a DataFrame based on one or more values. Here is a sample CSV:
AlignmentId,TranscriptId,classifier,value
ENSMUST00000025010-1,ENSMUST00000025010,AlnCoverage,0.99612
ENSMUST00000025010-1,ENSMUST00000025010,AlnIdentity,0.93553
ENSMUST00000025010-1,ENSMUST00000025010,Badness,0.06749
ENSMUST00000025014-1,ENSMUST00000025014,AlnCoverage,1.0
ENSMUST00000025014-1,ENSMUST00000025014,AlnIdentity,0.96382
ENSMUST00000025014-1,ENSMUST00000025014,Badness,0.03618
When loaded:
>>> df = pd.read_csv('tmp.csv', index_col=['AlignmentId', 'TranscriptId'])
>>> df
                                          classifier    value
AlignmentId          TranscriptId
ENSMUST00000025010-1 ENSMUST00000025010  AlnCoverage  0.99612
                     ENSMUST00000025010  AlnIdentity  0.93553
                     ENSMUST00000025010      Badness  0.06749
ENSMUST00000025014-1 ENSMUST00000025014  AlnCoverage  1.00000
                     ENSMUST00000025014  AlnIdentity  0.96382
                     ENSMUST00000025014      Badness  0.03618
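I want to remove every AlignmentId group that fails any one of a set of classifiers. For this example, suppose I want to remove ENSMUST00000025010 because its AlnCoverage is < 1.0. I therefore want to end up with this DataFrame: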
ENSMUST00000025014-1 ENSMUST00000025014  AlnCoverage  1.00000
                     ENSMUST00000025014  AlnIdentity  0.96382
                     ENSMUST00000025014      Badness  0.03618
How can I do this?
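To make the goal concrete, here is a minimal sketch of one way the filtering could be done. It is not necessarily the best approach. It assumes each classifier of interest has a minimum acceptable value; the `min_required` dict and the `CSV` string are hypothetical names introduced only for this illustration, and the sample data is inlined so the snippet runs without `tmp.csv`.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the snippet is self-contained.
CSV = """AlignmentId,TranscriptId,classifier,value
ENSMUST00000025010-1,ENSMUST00000025010,AlnCoverage,0.99612
ENSMUST00000025010-1,ENSMUST00000025010,AlnIdentity,0.93553
ENSMUST00000025010-1,ENSMUST00000025010,Badness,0.06749
ENSMUST00000025014-1,ENSMUST00000025014,AlnCoverage,1.0
ENSMUST00000025014-1,ENSMUST00000025014,AlnIdentity,0.96382
ENSMUST00000025014-1,ENSMUST00000025014,Badness,0.03618
"""

df = pd.read_csv(io.StringIO(CSV), index_col=['AlignmentId', 'TranscriptId'])

# Hypothetical thresholds (an assumption, not from the question):
# the minimum acceptable value for each classifier we care about.
min_required = {'AlnCoverage': 1.0}

# Look up the threshold for every row; classifiers without one become NaN,
# and any comparison against NaN is False, so those rows never count as failures.
thresholds = df['classifier'].map(min_required)
failed_rows = df['value'] < thresholds

# Collect the AlignmentIds that have at least one failing row ...
alignment_ids = df.index.get_level_values('AlignmentId')
failed_ids = alignment_ids[failed_rows.to_numpy()].unique()

# ... and keep only the rows whose AlignmentId is not among them.
filtered = df[~alignment_ids.isin(failed_ids)]
print(filtered)
```

More classifiers can be checked by adding entries to `min_required`. A classifier such as Badness, where lower is presumably better, would need the comparison reversed.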
[Discussion]:
Tags: python pandas dataframe multi-index