【Posted】: 2022-01-01 11:36:08
【Question】:
I have this example df:
import pandas as pd

data = pd.DataFrame({
    'id': [1, 2, 3],
    'question': ['first country visited?', 'first city visited?', 'two cities we love?'],
    'answer1': ['UK', 'Paris', 'CA'],
    'answer2': ['US', 'New York', 'Paris'],
    'answer3': ['CA', 'London', 'London'],
    'answer4': ['JP', 'Toronto', 'Los Angeles'],
    'correct': [['UK'], ['London'], ['London', 'Paris']]
})
which gives:
   id                question  answer1   answer2  answer3      answer4          correct
0   1  first country visited?       UK        US       CA           JP             [UK]
1   2     first city visited?    Paris  New York   London      Toronto         [London]
2   3     two cities we love?       CA     Paris   London  Los Angeles  [London, Paris]
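For each row, I am trying to identify the name of the column (answer1, answer2, etc.) whose value appears in that row's data['correct'] list, and store it in a new column named data['correct_column'].
What I have done so far: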
data['correct_column'] = data.loc[:,'answer1':'answer4'].isin(data['correct']).idxmax(1)
I get the same result for every row: data['correct_column'] contains only the value answer1, and I don't know why.
Desired output:
   id                question  answer1   answer2  answer3      answer4          correct   correct_column
0   1  first country visited?       UK        US       CA           JP             [UK]          answer1
1   2     first city visited?    Paris  New York   London      Toronto         [London]          answer3
2   3     two cities we love?       CA     Paris   London  Los Angeles  [London, Paris]  answer3,answer2
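For reference, here is a minimal sketch of one way the desired output could be produced. It is not necessarily the best or fastest approach. It assumes that `isin` with a Series aligns on the index and compares each cell to the whole list rather than to its elements, which would make every comparison False and would explain why `idxmax` falls back to the first column. The helper name `find_correct_columns` and the list `answer_cols` are illustrative names, not part of the original post.

```python
import pandas as pd

data = pd.DataFrame({
    'id': [1, 2, 3],
    'question': ['first country visited?', 'first city visited?', 'two cities we love?'],
    'answer1': ['UK', 'Paris', 'CA'],
    'answer2': ['US', 'New York', 'Paris'],
    'answer3': ['CA', 'London', 'London'],
    'answer4': ['JP', 'Toronto', 'Los Angeles'],
    'correct': [['UK'], ['London'], ['London', 'Paris']],
})

# Hypothetical helper list: the answer columns to search, in order.
answer_cols = ['answer1', 'answer2', 'answer3', 'answer4']

def find_correct_columns(row):
    # Walk the correct values in their listed order, so that
    # ['London', 'Paris'] yields answer3 before answer2.
    matches = [
        col
        for value in row['correct']
        for col in answer_cols
        if row[col] == value
    ]
    return ','.join(matches)

# Row-wise apply: each row is compared against its own list of correct values.
data['correct_column'] = data.apply(find_correct_columns, axis=1)
print(data[['id', 'correct', 'correct_column']])
```

A row-wise `apply` is slow on large frames, but it keeps the per-row comparison explicit and preserves the order of the values in `correct`.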
【Comments】:
Tags: python python-3.x pandas dataframe