【Question Title】:Remove entire rows from df if the word occurs
【Posted】:2022-11-30 22:45:18
【Question Description】:
Stop word list:
stop_w = ["in", "&", "the", "|", "and", "is", "of", "a", "an", "as", "for", "was" ]
df:
| words | frequency |
| the company | 10 |
| green energy | 9 |
| founded in | 8 |
| gases for | 8 |
| electricity | 5 |
I want to delete the entire row if it contains any of the given stop words. For this example the output should be:
| words | frequency |
| green energy | 9 |
| electricity | 5 |
【Discussion】:
Tags:
python
pandas
dataframe
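【Solution 1】:
The | character has a special meaning: in a regular expression it means "or" (alternation), so you need to escape it before it can be used as a literal entry in the stop word list. You escape it with a backslash (see more here).
That said, you can do this: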
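# "|" is escaped as r"\|"; left unescaped, the joined pattern contains an empty alternative that matches every row
stop_w = ["in", "&", "the", r"\|", "and", "is", "of", "a", "an", "as", "for", "was"]
df.loc[~df['words'].str.contains('|'.join(stop_w))]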
Output:
          words  frequency
1  green energy          9
4   electricity          5
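One caveat: `str.contains` does substring matching, so a short stop word such as "a" or "in" would also drop a row like "solar panel" or "wind power". If whole-word matching is what you want, a minimal sketch is to split each entry into tokens and compare them against a set, which also avoids any regex escaping. This assumes the words are separated by whitespace and that matching should be case-insensitive; the names `stop_set`, `has_stop_word` and `result` are only illustrative.

```python
import pandas as pd

stop_w = ["in", "&", "the", "|", "and", "is", "of", "a", "an", "as", "for", "was"]
df = pd.DataFrame({
    "words": ["the company", "green energy", "founded in", "gases for", "electricity"],
    "frequency": [10, 9, 8, 8, 5],
})

# A set gives fast membership tests and needs no regex escaping for "|" or "&".
stop_set = set(stop_w)

# Lowercase, split on whitespace, then flag rows where any token is a stop word.
has_stop_word = (
    df["words"]
    .str.lower()
    .str.split()
    .apply(lambda tokens: not stop_set.isdisjoint(tokens))
)

# Keep only the rows without a stop word.
result = df.loc[~has_stop_word]
print(result)
```

On the sample data this keeps the same two rows as the regex version ("green energy" and "electricity"), but it would not remove an entry merely because a stop word appears inside a longer word.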