Word Mover's Distance in Python: Answers

【Question title】: Word Mover's Distance in Python
【Posted】: 2017-09-12 15:28:59
【Question】:

I am trying to compute the similarity of two texts using Word Mover's Distance (WMD). I tried the following code with gensim in Python 3:

import gensim

# Load the pretrained Google News word2vec vectors (binary format).
word2vec_model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
word2vec_model.init_sims(replace=True)  # L2-normalizes the vectors in place
distance = word2vec_model.wmdistance("string 1", "string 2")  # Compute WMD as normal.

However, I don't think this gives me the correct value. How should I do this in Python?

【Comments】:

Keep in mind that the smaller the distance, the higher the similarity.
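To make that comment concrete, here is a minimal sketch of turning a distance into a similarity score. The mapping 1 / (1 + d) is only one common convention and is my assumption here, not something the question specifies; `distance_to_similarity` is a hypothetical helper name.

```python
def distance_to_similarity(distance):
    """Map a non-negative distance to a score in (0, 1].

    Assumption: 1 / (1 + d) is used as the conversion; other
    monotonically decreasing mappings would work just as well.
    """
    return 1.0 / (1.0 + distance)


# Smaller distance gives a higher similarity score.
close = distance_to_similarity(0.25)
far = distance_to_similarity(3.0)
print(close > far)
```

A distance of 0 maps to a similarity of 1, and the score shrinks toward 0 as the distance grows.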

Tags: python python-3.x text nlp information-retrieval

【Solution 1】:

Split the strings into words first:

distance = word2vec_model.wmdistance("string 1".split(), "string 2".split())
>>> 0.4114476676950455

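The arguments need to be lists of strings (one token per element), not raw strings. A raw string is iterated character by character, so the distance would be computed over single letters rather than words.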
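The difference between the two call styles can be seen without loading any model. The sketch below only shows what Python yields when iterating over a raw string versus a split one; `tokenize` is a hypothetical helper, and lowercasing is my own assumption rather than something the answer requires.

```python
def tokenize(text):
    """Hypothetical helper: lowercase the text and split it on whitespace."""
    return text.lower().split()


raw = "Obama speaks to the media"

# Iterating over a raw string yields single characters...
chars_seen = [tok for tok in raw]

# ...while a split string yields whole words, which is what the answer passes in.
tokens_seen = tokenize(raw)

print(chars_seen[:5])  # ['O', 'b', 'a', 'm', 'a']
print(tokens_seen)     # ['obama', 'speaks', 'to', 'the', 'media']
```

With a loaded model, the token lists would then be passed along the lines of `word2vec_model.wmdistance(tokenize(text1), tokenize(text2))`. Whether to lowercase depends on the vocabulary of the vectors in use.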

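【Discussion】:

Sometimes the problem really is that simple! Thank you. By the way, do you know of a better way to measure whether two texts are related?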