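【Posted】: 2021-07-13 22:46:19
【Question】:
I have been following the example record linkage code from the Python Record Linkage Toolkit package, and it runs fine with the "jarowinkler" string matching method. However, when I run it with method = "qgram" or "cosine", it raises a numpy error. Any ideas about what might be causing it?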
File "C:\ProgramData\Anaconda3\lib\site-packages\recordlinkage\compare.py", line 153, in _compute_vectorized c = c.where((c
AttributeError: 'numpy.ndarray' object has no attribute 'where'
Reference code:
import recordlinkage
from recordlinkage.datasets import load_febrl1
##### Functions Correctly
dfA = load_febrl1()
# Indexation step
indexer = recordlinkage.Index()
indexer.block(left_on='given_name')
candidate_links = indexer.index(dfA)
compare_cl = recordlinkage.Compare()
compare_cl.string('surname', 'surname', method='jaro', threshold=0.1, label='surname')
features = compare_cl.compute(candidate_links, dfA)
matches = features[features.sum(axis=1) > 0]
print(len(matches))
##### Fails with:
# AttributeError: 'numpy.ndarray' object has no attribute 'where'
dfA = load_febrl1()
# Indexation step
indexer = recordlinkage.Index()
indexer.block(left_on='given_name')
candidate_links = indexer.index(dfA)
compare_cl = recordlinkage.Compare()
compare_cl.string('surname', 'surname', method='qgram', threshold=0.1, label='surname')
features = compare_cl.compute(candidate_links, dfA)
matches = features[features.sum(axis=1) > 0]
print(len(matches))
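A note on the error message itself: `numpy.ndarray` has no `.where` method, while `pandas.Series` does. The traceback therefore suggests that, for these methods, the similarity scores reach the thresholding step as a plain NumPy array rather than a Series. That is an inference from the message, not a confirmed reading of the library source. The sketch below only illustrates the difference. The scores are made up, and `apply_threshold` is a hypothetical helper, not part of recordlinkage.

```python
import numpy as np
import pandas as pd

# Made-up similarity scores; not output from recordlinkage.
scores = np.array([0.05, 0.40, np.nan, 0.90])
threshold = 0.1

# ndarray has no .where method, which is what the AttributeError reports.
print(hasattr(scores, "where"))             # False
print(hasattr(pd.Series(scores), "where"))  # True


def apply_threshold(values, threshold):
    """Hypothetical helper: binarize scores at `threshold`, keeping NaN.

    Works for an ndarray or a Series because the input is wrapped
    in a Series before .where is called.
    """
    s = pd.Series(values, dtype="float64")
    # Series.where keeps entries where the condition holds and
    # replaces the others with `other`.
    s = s.where((s < threshold) | s.isna(), other=1.0)
    s = s.where((s >= threshold) | s.isna(), other=0.0)
    return s


binary = apply_threshold(scores, threshold)
print(binary.tolist())  # [0.0, 1.0, nan, 1.0]
```

If this is indeed the cause, one thing to try is omitting `threshold` in `compare_cl.string(...)` (assuming it is optional in the installed version) and thresholding the returned feature column afterwards. Upgrading the package may also help, though which versions are affected is not verified here.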
【Discussion】:
Tags: python numpy attributeerror record-linkage