【Posted】: 2022-07-19 19:58:40
【Question】:
I have 2 DataFrames:
df1 = pd.DataFrame({'Item': ["Bag room","Bag Scan", "Bag Screening Equipment"],'CC': ["AAA","BBB", "CCC"]})
df2 = pd.DataFrame({'Item': ["SIN_SATS LTD_DOC-Bag Scan :Aug","SIN_SATS LTD_DOC-Bag room :Aug","EDI_EDINBURGH AIRPORT LTD_DOC-Bag Screening Equipment :Sep"]})
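I am using the code below to find which substring from df1 appears in each string in df2, and then return the matching value from the CC column. It works well for the example above: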
pat = '|'.join(df1['Item'].values)
df2['Item_Description'] = df2['Item'].str.extract(f"({pat})")
df2['CC'] = df2['Item_Description'].map(df1.set_index('Item')['CC'])
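However, when I add parentheses to the item, so that df1 contains Bag Screening (Equipment) and df2 contains EDI_EDINBURGH AIRPORT LTD_DOC-Bag Screening (Equipment) :Sep, and run the same code to extract the substring, I get the following error: ValueError: Wrong number of items passed 2, placement implies 1
Is there a way to handle this, or do I have to remove the parentheses from the items before running the code?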
df1 = pd.DataFrame({'Item': ["Bag room","Bag Scan", "Bag Screening (Equipment)"],'CC': ["AAA","BBB", "CCC"]})
df2 = pd.DataFrame({'Item': ["SIN_SATS LTD_DOC-Bag Scan :Aug","SIN_SATS LTD_DOC-Bag room :Aug","EDI_EDINBURGH AIRPORT LTD_DOC-Bag Screening (Equipment) :Sep"]})
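A likely cause is that the parentheses in "Bag Screening (Equipment)" are read as a second regex capture group, so str.extract returns two columns where one is expected. The sketch below is one possible way to handle it without removing the parentheses, assuming the items are meant to match literally: each item is passed through re.escape before joining. The expand=False argument is an addition not in the original code; it makes str.extract return a Series when there is a single group.

```python
import re
import pandas as pd

df1 = pd.DataFrame({
    'Item': ["Bag room", "Bag Scan", "Bag Screening (Equipment)"],
    'CC': ["AAA", "BBB", "CCC"],
})
df2 = pd.DataFrame({'Item': [
    "SIN_SATS LTD_DOC-Bag Scan :Aug",
    "SIN_SATS LTD_DOC-Bag room :Aug",
    "EDI_EDINBURGH AIRPORT LTD_DOC-Bag Screening (Equipment) :Sep",
]})

# Escape regex metacharacters so "(" and ")" are matched literally
# instead of being interpreted as an extra capture group.
pat = '|'.join(re.escape(item) for item in df1['Item'])

# The pattern now has exactly one capture group (the outer parentheses).
df2['Item_Description'] = df2['Item'].str.extract(f"({pat})", expand=False)

# Look up the CC value for each extracted item.
df2['CC'] = df2['Item_Description'].map(df1.set_index('Item')['CC'])
```

With this change the extracted Item_Description keeps the original text, parentheses included, so the lookup against df1 still works on the unmodified item names.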
【Comments】: