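【Posted】: 2020-05-20 19:27:35
【Question】:
I have a pandas DataFrame named df. It has a column named article. The article column contains 600 strings, each of which is one news article.
I want to keep only the articles whose first four sentences contain the keyword "COVID-19" AND ("China" OR "Chinese"), but I haven't been able to find a way to do this on my own.
(Within each string, sentences are separated by \n. A sample article looks like this:)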
\nChina may be past the worst of the COVID-19 pandemic, but they aren’t taking any chances.\nWorkers in Wuhan in service-related jobs would have to take a coronavirus test this week, the government announced, proving they had a clean bill of health before they could leave the city, Reuters reported.\nThe order will affect workers in security, nursing, education and other fields that come with high exposure to the general public, according to the edict, which came down from the country’s National Health Commission.\ .......
【Comments】:
-
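Do you mean you want to drop every row whose value in that column does not contain these words? From this question, I assume you would first reduce the article column to just the first three or four sentences before filtering?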
-
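Yes, I want to drop every row whose article does not contain these words, but I don't want to reduce the article column to just the first three or four sentences. I'd like the full text to be kept after filtering. :)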
Tags: python pandas filter keyword-search
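One possible way to do what the question asks is sketched below. It is not a tested answer from the thread, and it rests on a few assumptions: the separator is a real newline character (not a literal backslash followed by n), matching is case-sensitive, and "China"/"Chinese" should match as whole words. The helper names first_sentences and filter_articles are made up for this illustration. The idea is to build a temporary Series holding only the first four sentences, test the keywords against that, and use the resulting boolean mask on the original df so the full article text is kept.

```python
import pandas as pd


def first_sentences(text: str, n: int = 4) -> str:
    """Return the first n non-empty newline-separated sentences of text."""
    # The sample article starts with "\n", so empty pieces are dropped first.
    parts = [s for s in text.split("\n") if s.strip()]
    return "\n".join(parts[:n])


def filter_articles(df: pd.DataFrame, column: str = "article", n: int = 4) -> pd.DataFrame:
    """Keep rows whose first n sentences mention COVID-19 and China/Chinese."""
    # Temporary Series with only the leading sentences; df itself is not modified.
    head = df[column].fillna("").map(lambda t: first_sentences(t, n))
    has_covid = head.str.contains("COVID-19", regex=False)
    # Non-capturing group: matches "China" or "Chinese" as whole words.
    has_china = head.str.contains(r"\bChin(?:a|ese)\b", regex=True)
    # The mask is aligned with df's index, so the full text of each kept row survives.
    return df[has_covid & has_china]


if __name__ == "__main__":
    # Tiny made-up frame, only to show the call; the real df has 600 rows.
    demo = pd.DataFrame({"article": [
        "\nChina may be past the worst of the COVID-19 pandemic.\nTwo.\nThree.\nFour.\nFive.",
        "\nOne.\nTwo.\nThree.\nFour.\nCOVID-19 cases in China.",
    ]})
    print(filter_articles(demo))
```

Called as df = filter_articles(df), this would leave only the matching rows, each with its complete article string. If the keywords should match regardless of case, passing case=False to str.contains is one option.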