【Question title】: How to get a pair of tokens linked by a dependency relation in a sentence using spaCy?
【Posted】: 2021-06-11 20:28:42
【Question description】:

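I'm using spaCy to get dependency relations, and that works well. But I'm having trouble getting the pair of tokens that share a specific dependency relation (other than the conj relation).

With .dep_ I can get the dependency label of each individual token. However, I'd like to get the pair of tokens for a specific dependency relation. For example, the code below gives the result shown.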

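import spacy

# Load the medium English pipeline (it must already be installed).
nlp = spacy.load("en_core_web_md")
sentence = 'The Marlins were stymied by Austin Gomber and the Rockies in their 4-3 loss'
doc = nlp(sentence)
# Print each token together with its dependency label.
for token in doc:
    print(token, token.dep_)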

Current output:

The det
Marlins nsubjpass
were auxpass
stymied ROOT
by agent
Austin compound
Gomber pobj
and cc
the det
Rockies conj
in prep
their poss
4 nummod
- punct
3 prep
loss pobj

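But what I'd like to get is the following (please ignore the output style; I just want the pair of tokens that share a specific dependency relation, for example pobj here):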

'Gomber' is a 'pobj' of 'by'
'Loss' is a 'pobj' of 'in'

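In other words, in addition to the current output, I also want the paired token for each word.

For the conj relation I can get the pairs simply with token.conjuncts, but for the other dependency relations, such as pobj and prep, I haven't found anything in spaCy that I can use directly.

Does anyone have a hint on how to get this pobj relation? Thanks in advance!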

【Question comments】:

    Tags: python parsing nlp dependencies spacy


    【Solution 1】:

    You can use the head index. For example:

    tok_l = doc.to_json()['tokens']
    for t in tok_l:
      head = tok_l[t['head']]
      print(f"'{sentence[t['start']:t['end']]}' is {t['dep']} of '{sentence[head['start']:head['end']]}'")
    

    Result:

    'The' is det of 'Marlins'
    'Marlins' is nsubjpass of 'stymied'
    'were' is auxpass of 'stymied'
    'stymied' is ROOT of 'stymied'
    'by' is agent of 'stymied'
    'Austin' is compound of 'Gomber'
    'Gomber' is pobj of 'by'
    'and' is cc of 'Gomber'
    'the' is det of 'Rockies'
    'Rockies' is conj of 'Gomber'
    'in' is prep of 'stymied'
    'their' is poss of 'loss'
    '4' is nummod of 'loss'
    '-' is punct of '3'
    '3' is prep of '4'
    'loss' is pobj of 'in'
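
As an alternative to going through the head index, each spaCy token also exposes its syntactic parent directly as token.head, so the pairs can be read off the Doc itself. The following is a minimal sketch rather than the answerer's method: the helper name dependency_pairs and the demo function are made up for illustration, and the demo assumes the en_core_web_md model from the question is installed.

```python
def dependency_pairs(doc, dep):
    """Collect (child, label, head) text triples for one dependency label.

    `doc` can be a spaCy Doc or any iterable of objects that expose
    `.text`, `.dep_` and `.head` the way spaCy tokens do.
    """
    return [
        (token.text, token.dep_, token.head.text)
        for token in doc
        if token.dep_ == dep
    ]


def demo():
    # spaCy is imported lazily so the helper above has no hard dependency.
    import spacy

    # Assumes the model used in the question has been installed.
    nlp = spacy.load("en_core_web_md")
    doc = nlp("The Marlins were stymied by Austin Gomber and the Rockies in their 4-3 loss")
    for child, label, head in dependency_pairs(doc, "pobj"):
        print(f"'{child}' is a '{label}' of '{head}'")
```

Calling demo() should print one line per pobj token. With the parse shown in the result above, those would be the Gomber/by and loss/in pairs, although the exact parse can vary between model versions.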
    

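    【Comments】:

    • Thank you very much, this works great. One more question: could you explain how 'doc.to_json()' is used? I searched online and couldn't find much about it. Thanks again!
    • @Melina It just creates a dictionary with various fields; see the source.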
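
To make the doc.to_json() comment more concrete, here is a rough sketch of the part of that dictionary the answer relies on. The dictionary below is hand-written for illustration, not real model output: it covers only the first four tokens of the sentence and keeps only the fields the answer reads (the real dictionary carries more per-token fields, such as tag, pos and lemma). The function name describe is likewise just an illustrative choice.

```python
# Hand-written stand-in for doc.to_json(); abridged to the fields used here.
doc_json = {
    "text": "The Marlins were stymied",
    "tokens": [
        # "start"/"end" are character offsets into "text";
        # "head" is the index of the parent token in this same list.
        {"id": 0, "start": 0, "end": 3, "dep": "det", "head": 1},
        {"id": 1, "start": 4, "end": 11, "dep": "nsubjpass", "head": 3},
        {"id": 2, "start": 12, "end": 16, "dep": "auxpass", "head": 3},
        {"id": 3, "start": 17, "end": 24, "dep": "ROOT", "head": 3},
    ],
}


def describe(data):
    """Render each token's relation to its head as a readable line."""
    text = data["text"]
    tokens = data["tokens"]
    lines = []
    for tok in tokens:
        head = tokens[tok["head"]]  # look the parent up by its index
        child_text = text[tok["start"]:tok["end"]]
        head_text = text[head["start"]:head["end"]]
        lines.append(f"'{child_text}' is {tok['dep']} of '{head_text}'")
    return lines


for line in describe(doc_json):
    print(line)
```

Note that the root token points at itself through its head index, which is why the answer's output contains the line for 'stymied' being ROOT of 'stymied'.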