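【Posted】:2019-08-26 03:51:34
【Question】:
I need to compare two columns of a Pandas DataFrame and fuzzy-match them.
If the fuzzy match score is above a certain percentage (e.g. 85), I need to return either that percentage or the string "Partial Match".
If the values match exactly, return "Full Match".
If they do not match, return "No Match".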
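The three rules above can be sketched as a plain function on two strings. This is only a sketch under assumptions: `similarity` and `classify` are hypothetical names, the standard library's `difflib.SequenceMatcher` stands in for `fuzz.ratio` so the snippet runs without fuzzywuzzy installed, and a score equal to the threshold is treated as a partial match.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Similarity of two strings on a 0-100 scale (stand-in for fuzz.ratio)."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def classify(a: str, b: str, threshold: int = 85) -> str:
    """Label one pair of strings according to the rules in the question."""
    if a == b:
        # Identical strings are a full match regardless of the score.
        return "Full Match"
    if similarity(a, b) >= threshold:
        # Assumption: a score exactly at the threshold counts as partial.
        return "Partial Match"
    return "No Match"


print(classify("mango", "mango"))
print(classify("banana", "bannana"))
print(classify("kiwi", "dragonfruit"))
```

To return the percentage instead of the label, the second branch could return `similarity(a, b)` rather than the string.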
Solutions I have tried:
Attempt #1
conditions = [
(df['one'] == df['two']),fuzz.ratio((df['one'],df['two'])) > 80,
fuzz.ratio((df['one'],df['two'])) <= 80]
choices = ["FULL Match", fuzz.ratio((df['one'],df['two'])),"NO MATCH"]
df['result'] = np.select(condition,choices, default = np.nan)
======================================================================
Attempt #2
df['result'] = np.where(fuzz.ratio(df['one'], df['two']) >= 85, "Partial Match", 'No Match')
import pandas as pd
import numpy as np
from fuzzywuzzy import fuzz
import os
df = pd.read_csv('data.csv')
x = fuzz.ratio(df['one'], df['two']) >= 85
df['result'] = np.where(x, "Match", 'No Match')
Expected result
one two result
0 apple Apple Partial Match
1 banana bannana Partial Match
2 kiwi dragonfruit No Match
3 mango mango Full Match
======================================================================
Error messages:
Attempt #1
IndexError: tuple index out of range
Attempt #2
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
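Both errors look consistent with `fuzz.ratio` being a scalar function that expects two plain strings: Attempt #1 hands it a single tuple, and Attempt #2 hands it two whole Series. One possible way around this, sketched below, is to score each row pair separately and only then pass boolean Series to `np.select`. The `ratio` helper is a hypothetical stand-in built on `difflib.SequenceMatcher` so the snippet runs without fuzzywuzzy, and the sample rows are copied from the expected-result table rather than from `data.csv`.

```python
from difflib import SequenceMatcher

import numpy as np
import pandas as pd


def ratio(a: str, b: str) -> int:
    """Stand-in for fuzz.ratio: similarity of two strings on a 0-100 scale."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


# Sample rows copied from the expected-result table in the question.
df = pd.DataFrame({
    "one": ["apple", "banana", "kiwi", "mango"],
    "two": ["Apple", "bannana", "dragonfruit", "mango"],
})

# Score each row pair individually; the scorer only accepts two strings.
df["score"] = [ratio(a, b) for a, b in zip(df["one"], df["two"])]

# The conditions are now boolean Series, checked in order by np.select.
conditions = [df["one"] == df["two"], df["score"] >= 85]
choices = ["Full Match", "Partial Match"]
df["result"] = np.select(conditions, choices, default="No Match")

print(df)
```

With this case-sensitive stand-in, "apple" vs "Apple" scores 80 and falls below the 85 cutoff, so that row comes out as "No Match" rather than the "Partial Match" shown in the expected table. Getting the expected label for that row would need a lower threshold or a different scorer.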
【Discussion】:
-
Could you post a sample of your data?
Tags: python pandas numpy dataframe