【Question title】: AttributeError: 'Word2Vec' object has no attribute 'wmdistance'
【Posted】: 2021-04-26 11:43:41
【Question description】:

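When I run my .py file containing the code below, I get the following error:

    Traceback (most recent call last):
      File "Checking.py", line 34, in <module>
        distance = model.wmdistance(sentance_a,sentance_b)
    AttributeError: 'Word2Vec' object has no attribute 'wmdistance'

The code: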

    from time import time
    from nltk.tokenize import sent_tokenize,word_tokenize
    from nltk.corpus import stopwords
    start_nb = time()
    
    data = 'The different Modi TV host in prime minister  chat Jim Corbett meet the'
    sentences = [word_tokenize(s) for s in sent_tokenize(data.lower())]  # one token list per sentence
    #sentences = [[w for w in sentence if w not in stopwords.words("english")] for x in sentence]
    
    sentance_a = 'Modi has a chat with Bear Grylls and Jim Corbett'
    sentance_b = 'The prime minister meet the TV host in a National Park'
    sentance_a = sentance_a.lower().split()
    sentance_b = sentance_b.lower().split()
    
    from nltk.corpus import stopwords
    from nltk import download
    download('stopwords')
    
    stop_words = stopwords.words('english')
    sentance_a = [w for w in sentance_a if w not in stop_words]
    sentance_b = [w for w in sentance_b if w not in stop_words]
    
    start = time()
    import os
    from gensim import models as gsm
    
    from gensim.models import Word2Vec
    bigram = gsm.phrases.Phrases(sentences)
    bigram = gsm.phrases.Phraser(bigram) 
    trigram = gsm.phrases.Phrases(bigram[sentences])
    trigram = gsm.phrases.Phraser(trigram)
        
    model = gsm.Word2Vec(trigram[bigram[sentences]], min_count=2, workers=3, sg=1)
    distance = model.wmdistance(sentance_a, sentance_b)  # line 34 of the script: raises the AttributeError
    print("It took: %.4f"%(time()-start))
    
    print(distance)

【Question discussion】:

  • Which version of gensim are you using?
  • I am using gensim version 4.0.1
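
The version matters here because, as far as I know, Gensim 4.0 removed the methods on the Word2Vec model that had been deprecated in favour of model.wv, and wmdistance is one of them. A minimal sketch for checking this before calling it; the helper name model_wmdistance_removed is hypothetical (not part of Gensim), and the "major version 4 or later" cut-off is my assumption based on the 4.0 migration notes:

```python
def model_wmdistance_removed(gensim_version: str) -> bool:
    """Return True if Word2Vec.wmdistance is assumed gone in this Gensim version.

    Hypothetical helper: assumes the method was removed starting with 4.0.
    """
    major = int(gensim_version.split(".")[0])
    return major >= 4


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("gensim")  # version string of the installed package
    except PackageNotFoundError:
        print("gensim is not installed")
    else:
        print(installed, "-> call model.wv.wmdistance:", model_wmdistance_removed(installed))
```

With the 4.0.1 reported above, this check returns True, so the call has to go through model.wv.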

Tags: python machine-learning nlp artificial-intelligence word-embedding


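【Solution 1】:

With the latest version of Gensim, you have to use KeyedVectors; wmdistance is a method of that class, not of the Word2Vec model. For example, to load vectors that were saved in the binary word2vec format: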

from gensim import models
w2vec_model = models.KeyedVectors.load_word2vec_format('model', binary=True)

Source reference: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/keyedvectors.py

【Discussion】:

【Solution 2】:

As Rivers pointed out, in recent Gensim versions you have to use KeyedVectors. A trained Word2Vec model exposes them as model.wv, so the fix comes down to:

    distance = model.wv.wmdistance(sentance_a,sentance_b)
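
If the same script has to handle both a full Word2Vec model and a bare KeyedVectors object (as loaded in Solution 1), one option is a small wrapper that looks for the wv attribute first. This is only a sketch: wmdistance_compat is a hypothetical helper, not a Gensim function, and FakeVectors / FakeModel are stand-in classes so that the example runs without Gensim or a trained model.

```python
def wmdistance_compat(model_or_vectors, doc_a, doc_b):
    """Call wmdistance on the word vectors of a model, or on the vectors directly.

    Hypothetical helper: uses `.wv` when the object has it (a full model),
    otherwise treats the object itself as the vectors.
    """
    vectors = getattr(model_or_vectors, "wv", model_or_vectors)
    return vectors.wmdistance(doc_a, doc_b)


# Stand-ins for gensim's KeyedVectors and Word2Vec, for illustration only.
class FakeVectors:
    def wmdistance(self, doc_a, doc_b):
        # Dummy "distance": count of tokens that appear in only one document.
        return float(len(set(doc_a) ^ set(doc_b)))


class FakeModel:
    def __init__(self):
        self.wv = FakeVectors()  # mirrors how a trained model exposes its vectors


print(wmdistance_compat(FakeModel(), ["modi", "chat"], ["prime", "chat"]))
print(wmdistance_compat(FakeVectors(), ["modi", "chat"], ["prime", "chat"]))
```

With real Gensim objects you would pass the trained model or the loaded KeyedVectors in place of the stand-ins; the tokenized sentance_a and sentance_b from the question go in as doc_a and doc_b.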
    

【Discussion】:
