Filtering content by whether a field contains a value

【Question title】: Filtering content by whether field contains a value
【Posted】: 2014-04-06 12:30:11
【Question description】:

In my original code for processing a CSV file, I skipped rows where a field was exactly equal to a specific value:

df = df[df["ORGANIZATION"]!="Org1"]

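Now I need to skip rows where the field contains that value anywhere. The following checks whether the field contains the value...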

df = df[df["ORGANIZATION"].str.contains("Org1")]

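But how do I negate it so that those rows are hidden? Some values may be “Org1 - Dave” or “Org1 - Lisa”. How can I skip rows whose value contains “Org1” somewhere?

I have been searching, but I could not phrase my question well enough to find the right answer.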

【Question discussion】:

Tags: python csv pandas filtering

【Solution 1】:

You can use ~ to negate your boolean Series:

>>> df = pd.DataFrame({"ORGANIZATION": ["Org1", "Org1 - Dave", "Org1 - Lisa", "Org2 - Bob", "Org3 - Sally"]})
>>> df
   ORGANIZATION
0          Org1
1   Org1 - Dave
2   Org1 - Lisa
3    Org2 - Bob
4  Org3 - Sally

[5 rows x 1 columns]
>>> df[df["ORGANIZATION"].str.contains("Org1")]
  ORGANIZATION
0         Org1
1  Org1 - Dave
2  Org1 - Lisa

[3 rows x 1 columns]
>>> df[~df["ORGANIZATION"].str.contains("Org1")]
   ORGANIZATION
3    Org2 - Bob
4  Org3 - Sally

[2 rows x 1 columns]
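
One caveat worth sketching, under assumptions that are not part of the original question: if the column has missing values, str.contains returns a missing result for those rows, and negating such a mask with ~ may fail or behave unexpectedly. Passing na=False treats missing values as non-matches, and regex=False matches "Org1" as literal text rather than as a regular expression. The sample data with a None entry is hypothetical.

```python
import pandas as pd

# Hypothetical data: the same column as above, plus one missing value.
df = pd.DataFrame({"ORGANIZATION": ["Org1", "Org1 - Dave", None, "Org2 - Bob"]})

# na=False makes missing values count as "does not contain",
# so the mask is purely boolean and safe to negate with ~.
# regex=False treats "Org1" as a plain substring.
mask = df["ORGANIZATION"].str.contains("Org1", na=False, regex=False)

# Rows that do NOT contain "Org1"; the row with the missing value is kept.
kept = df[~mask]
print(kept)
```

If rows with missing values should be dropped instead, na=True would mark them as matches so that the negated mask excludes them.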

Note that you can also use groupby to split the frame:

>>> gg = df.groupby(df["ORGANIZATION"].str.contains("Org1"))
>>> for k, g in gg:
...     print(k)
...     print(g)
...     
False
   ORGANIZATION
3    Org2 - Bob
4  Org3 - Sally

[2 rows x 1 columns]
True
  ORGANIZATION
0         Org1
1  Org1 - Dave
2  Org1 - Lisa

[3 rows x 1 columns]
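
As a minimal sketch of how the two groups could be used afterwards, the groupby result can be collected into a dict keyed by True and False. The names contains_org1, parts, with_org1 and without_org1 are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({"ORGANIZATION": ["Org1", "Org1 - Dave", "Org1 - Lisa",
                                    "Org2 - Bob", "Org3 - Sally"]})

# Boolean Series: True where the value contains "Org1".
contains_org1 = df["ORGANIZATION"].str.contains("Org1")

# Iterating a groupby yields (key, sub-frame) pairs; collect them in a dict
# so each part can be looked up by its True/False key.
parts = {key: group for key, group in df.groupby(contains_org1)}

with_org1 = parts[True]      # rows containing "Org1"
without_org1 = parts[False]  # rows not containing "Org1"
print(without_org1)
```

When only one of the two parts is needed, plain boolean indexing with ~ is simpler; groupby is mainly convenient when both parts are processed.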

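【Discussion】:

Thanks! I just realized that the values are case-sensitive. Can I convert the df["ORGANIZATION"] column to upper case and then search for the value?
If you want to make the column upper case, you can use df["ORGANIZATION"] = df["ORGANIZATION"].str.upper(). If you only want to do a case-insensitive contains search, you can use .str.contains("Org1", case=False).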
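
The two options from the reply above can be sketched side by side. The mixed-case sample data is hypothetical, and the second option compares against an upper-cased copy instead of overwriting the column, which is a small variation on the reply.

```python
import pandas as pd

# Hypothetical mixed-case data.
df = pd.DataFrame({"ORGANIZATION": ["ORG1", "org1 - Dave", "Org1 - Lisa", "Org2 - Bob"]})

# Option 1: case-insensitive search; the column itself is left unchanged.
kept = df[~df["ORGANIZATION"].str.contains("Org1", case=False)]

# Option 2: upper-case a temporary copy and search for the upper-case value.
kept_upper = df[~df["ORGANIZATION"].str.upper().str.contains("ORG1")]

print(kept)
```

Both options keep only the "Org2 - Bob" row here; the first avoids creating an intermediate upper-cased Series.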