【Question title】: How to calculate maximum similarity between synsets in NLTK? - Python
【Posted】: 2013-03-31 07:37:23
【Question】:

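I have to calculate the synset similarity between the items of list1 and list2. For each word in list1, I want to keep only the maximum similarity value. How can I do that? I want my output to be: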

apple.n.01, pear.n.01: 0.909090909091
honey.n.01, pear.n.01: 0.333333333333

My code:

>>> from nltk.corpus import wordnet
>>> import itertools as IT
>>> list1 = ["apple", "honey"]
>>> list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]
>>> for word1, word2 in IT.product(list1, list2):
...     wordFromList1 = wordnet.synsets(word1)[0]  # first listed synset of each word
...     wordFromList2 = wordnet.synsets(word2)[0]
...     s = wordFromList1.wup_similarity(wordFromList2)
...     # In NLTK 3+, name() is a method; NLTK 2.x exposed it as the attribute .name
...     print('{w1}, {w2}: {s}'.format(w1=wordFromList1.name(), w2=wordFromList2.name(), s=s))
...


apple.n.01, pear.n.01: 0.909090909091
apple.n.01, shell.n.01: 0.4
apple.n.01, movie.n.01: 0.421052631579
apple.n.01, fire.n.01: 0.142857142857
apple.n.01, tree.n.01: 0.380952380952
apple.n.01, candle.n.01: 0.380952380952
honey.n.01, pear.n.01: 0.333333333333
honey.n.01, shell.n.01: 0.210526315789
honey.n.01, movie.n.01: 0.222222222222
honey.n.01, fire.n.01: 0.125
honey.n.01, tree.n.01: 0.2
honey.n.01, candle.n.01: 0.2

【Comments on the question】:

    Tags: python-2.7 nltk wordnet


    【Solution 1】:

    Try this:

    from nltk.corpus import wordnet
    import itertools as IT
    list1 = ["apple", "honey"]
    list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]
    def f(word1, word2):
        # Compare the first listed synset of each word
        wordFromList1 = wordnet.synsets(word1)[0]
        wordFromList2 = wordnet.synsets(word2)[0]
        s = wordFromList1.wup_similarity(wordFromList2)
        # In NLTK 3+, name() is a method; NLTK 2.x exposed it as the attribute .name
        return (wordFromList1.name(), wordFromList2.name(), s)

    for word1 in list1:
        # Lazily score word1 against every word in list2, then keep the best pair
        similarities = (f(word1, word2) for word2 in list2)
        print(max(similarities, key=lambda x: x[2]))
    

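    For each word in list1, this creates a generator that yields the two synset names together with their similarity, then prints the tuple whose third element (the similarity) is the largest.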
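The same "generator plus max with a key" pattern can be sketched without WordNet installed. In the sketch below, `SCORES`, `lookup_similarity`, and `best_matches` are hypothetical names introduced only for illustration; the scores are copied from the output shown in the question, and the `None` check is there on the assumption that a similarity call may return no score for some pairs (as `wup_similarity` can, for example when two synsets share no common ancestor).

```python
# Scores copied from the question's output (first synset of each word).
SCORES = {
    ("apple", "pear"): 0.909090909091,
    ("apple", "shell"): 0.4,
    ("apple", "movie"): 0.421052631579,
    ("apple", "fire"): 0.142857142857,
    ("apple", "tree"): 0.380952380952,
    ("apple", "candle"): 0.380952380952,
    ("honey", "pear"): 0.333333333333,
    ("honey", "shell"): 0.210526315789,
    ("honey", "movie"): 0.222222222222,
    ("honey", "fire"): 0.125,
    ("honey", "tree"): 0.2,
    ("honey", "candle"): 0.2,
}


def lookup_similarity(word1, word2):
    """Hypothetical stand-in for a WordNet similarity call; returns None if unknown."""
    return SCORES.get((word1, word2))


def best_matches(words1, words2, similarity):
    """For each word in words1, return (word1, best_word2, score) for its top-scoring partner."""
    results = []
    for w1 in words1:
        # Score w1 against every candidate, dropping pairs that have no score
        scored = [(w1, w2, similarity(w1, w2)) for w2 in words2]
        scored = [t for t in scored if t[2] is not None]
        if scored:
            # Keep the tuple whose third element (the score) is largest
            results.append(max(scored, key=lambda t: t[2]))
    return results


list1 = ["apple", "honey"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]
for w1, w2, score in best_matches(list1, list2, lookup_similarity):
    print("{}, {}: {}".format(w1, w2, score))
```

To use it with NLTK, one would pass a function that looks up the synsets and calls `wup_similarity` in place of `lookup_similarity`. Unpacking the tuple in the `print` call gives the `word, word: score` layout the question asks for rather than a raw tuple.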

    【Comments】:
