【Question title】:Filter a mapped value
【Posted】:2022-01-20 16:57:22
【Question】:

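Sorry, I'm very new to this.

I am reading a parquet file with pandas. One column in my dataset is a map. I want to filter on the mapped value and return only the rows that match my criteria.

My data looks like this: columns: [UUID, UUID_c, Rating, approvalTimestamp]. The Rating column looks like this (and has object dtype):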

[('US', 'IB'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]

I want to filter for US values of "IA" or "IB".

This returns all instances of "US" in the map:

df2 = df[df['Rating'].str.contains("US")]

This returns an empty dataframe:

df2 = df[df['Rating'].str.contains("IA")]

How can I return the instances where the value assigned to US is "IA" or "IB"?

The dataframe looks like:

UUID  |  UUID_c  |   Rating   |  approvalTimeStamp|
---------------------------------------------------
037a9db2-c91f-4e93-a36e-3b6e7adb885f   |   ['8b2c409b-6c01-0100-2d32-670000010368','1fdfa790-a001-0100-5efe-b90000060013'] | [('US', 'IB'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]   |   2022-01-06T19:10:46.304734Z
037a9db2-c91f-4e93-a36e-3b6e7adb885f   |   ['8b2c409b-6c01-0100-2d32-670000010368','691aa282-e1ec-4904-b6c3-18a20ba3cda2'] | [('US', 'IIC'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]   |   2022-01-06T19:10:46.304734Z
037a9db2-c91f-4e93-a36e-3b6e7adb885f   |   ['8b2c409b-6c01-0100-2d32-670000010368','eb8d409b-6c01-0100-0f90-bd0000410011'] | [('US', 'IA'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]   |   2022-01-06T19:10:46.304734Z

I want to return this (with the US, IIC row filtered out):

 UUID  |  UUID_c  |   Rating   |  approvalTimeStamp|
    ---------------------------------------------------
    037a9db2-c91f-4e93-a36e-3b6e7adb885f   |   ['8b2c409b-6c01-0100-2d32-670000010368','1fdfa790-a001-0100-5efe-b90000060013'] | [('US', 'IB'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]   |   2022-01-06T19:10:46.304734Z
    037a9db2-c91f-4e93-a36e-3b6e7adb885f   |   ['8b2c409b-6c01-0100-2d32-670000010368','eb8d409b-6c01-0100-0f90-bd0000410011'] | [('US', 'IA'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]   |   2022-01-06T19:10:46.304734Z

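【Question comments】:

  • It's unclear what the column's data format is. I believe you need to provide a short example, including the dataframe and the expected output, to make it clearer what you need.

Tags: python pandas dataframe filter parquet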
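
One way to settle the format question raised in the comment is to inspect what each cell of the column actually holds. The snippet below is only a sketch: it builds a small stand-in dataframe (the two rows are illustrative and assume the cells are lists of tuples) rather than reading the real parquet file, and `cell_types` and `first_cell` are names introduced here for illustration.

```python
import pandas as pd

# Stand-in data; assumes each Rating cell is a list of (region, rating) tuples.
df = pd.DataFrame({
    'Rating': [
        [('US', 'IB'), ('EU', 'IA')],
        [('US', 'IIC'), ('EU', 'IA')],
    ],
})

# Count the Python type of each cell in the column (list, str, numpy.ndarray, ...).
cell_types = df['Rating'].map(type).value_counts()
print(cell_types)

# Look at the type of the first element inside the first cell.
first_cell = df['Rating'].iloc[0]
print(type(first_cell[0]))
```

If the cells turn out to be plain strings rather than lists, string methods such as `.str.contains` apply; if they are lists or arrays, a per-row check is needed instead.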


【Solution 1】:

This might help:

import pandas as pd

# Rebuild the three sample rows; each Rating cell is a list of (region, rating) tuples
df = pd.DataFrame({'id': ['037a9db2-c91f-4e93-a36e-3b6e7adb885f','037a9db2-c91f-4e93-a36e-3b6e7adb885f','037a9db2-c91f-4e93-a36e-3b6e7adb885f'],
                   'Rating' : [[('US', 'IB'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')],[('US', 'IIC'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')],[('US', 'IA'), ('EU', 'IA'), ('CA', 'IIC'), ('CH', 'III'), ('UK', 'IA'), ('AU', 'IB'), ('TW', 'III'), ('TK', 'IV')]]})
df

    id                                      Rating
0   037a9db2-c91f-4e93-a36e-3b6e7adb885f    [(US, IB), (EU, IA), (CA, IIC), (CH, III), (UK...
1   037a9db2-c91f-4e93-a36e-3b6e7adb885f    [(US, IIC), (EU, IA), (CA, IIC), (CH, III), (U...
2   037a9db2-c91f-4e93-a36e-3b6e7adb885f    [(US, IA), (EU, IA), (CA, IIC), (CH, III), (UK...

Filter with a lambda function:

# Keep the rows whose Rating list does not contain the pair ('US', 'IIC')
mask = df['Rating'].apply(lambda pairs: ('US', 'IIC') not in pairs)
df[mask]

Result:

    id                                      Rating
0   037a9db2-c91f-4e93-a36e-3b6e7adb885f    [(US, IB), (EU, IA), (CA, IIC), (CH, III), (UK...
2   037a9db2-c91f-4e93-a36e-3b6e7adb885f    [(US, IA), (EU, IA), (CA, IIC), (CH, III), (UK...
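
The filter above drops one specific pair, whereas the question asks for rows where the US rating is "IA" or "IB". A possible variant, assuming each cell is a list of (region, rating) tuples as in the sample data, is to turn each cell into a dict and test the US entry against an allow-list. The shortened ids and the names `allowed` and `us_rating_allowed` are hypothetical, introduced only for this sketch.

```python
import pandas as pd

# Shortened stand-in for the sample data (ids are hypothetical placeholders).
df = pd.DataFrame({
    'id': ['row-0', 'row-1', 'row-2'],
    'Rating': [
        [('US', 'IB'), ('EU', 'IA'), ('CA', 'IIC')],
        [('US', 'IIC'), ('EU', 'IA'), ('CA', 'IIC')],
        [('US', 'IA'), ('EU', 'IA'), ('CA', 'IIC')],
    ],
})

allowed = {'IA', 'IB'}

def us_rating_allowed(pairs):
    # dict() turns the (region, rating) pairs into a lookup table.
    # .get('US') returns None when a row has no US entry, so that row is dropped.
    return dict(pairs).get('US') in allowed

mask = df['Rating'].apply(us_rating_allowed)
filtered = df[mask]
print(filtered)
```

Compared with excluding `('US', 'IIC')`, this keeps working if other US ratings (for example "III") appear later, because it names the values to keep rather than the values to drop.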

【Comments】:
