您可以先为特定列创建一个掩码(我稍微编辑了您的第二行),然后创建一个新的“匹配”列,其中所有值都初始化为“不匹配”,最后,将值更改为您想要的应用掩码后返回的行的格式(“列||值”)。我在以下示例代码中实现了这一点:
def match_column(df, column_name):
column_mask = df.astype(str).applymap(lambda x: any(y in x for y in ['Chann','Midm']))[column_name]
df['Match'] = 'No Match'
df.loc[column_mask, 'Match'] = column_name + ' || ' + df[column_name]
return df
df = {
'Segment': ['Government', 'Government', 'Midmarket', 'Midmarket', 'Government', 'Channel Partners'],
'Country': ['Canada', 'Germany', 'France', 'Canada', 'France', 'France']
}
df = pd.DataFrame(df)
display(df)
df = match_column(df, 'Segment')
display(df)
输出:
但是,这仅适用于单个列。我不知道在多列中有匹配项的情况下您想要什么输出(如果可以,请指定)。
更新:
如果您想使用列列表作为输入并与第一个实例匹配,您可以改用它:
def match_first_column(df, column_list):
df['Match'] = 'No Match'
# iterate over rows
for index, row in df.iterrows():
# iterate over column names
for column_name in column_list:
column_value = row[column_name]
substrings = ['Chann', 'Midm', 'Fran']
# if a match is found
if any(x in column_value for x in substrings):
# add match string
df.loc[index, 'Match'] = column_name + ' || ' + column_value
# stop iterating and move to next row
break
return df
df = {
'Segment': ['Government', 'Government', 'Midmarket', 'Midmarket', 'Government', 'Channel Partners'],
'Country': ['Canada', 'Germany', 'France', 'Canada', 'France', 'France']
}
df = pd.DataFrame(df)
display(df)
column_list= df.columns.tolist()
match_first_column(df, column_list)
输出: