【Posted】:2020-05-03 01:36:35
【Question】:
I'm having some trouble understanding spaCy's Matcher module.
I have this sentence: I think this is great, but I would not do it again
I want to return the text but I would not do it again.
What I have so far is:
import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
pattern = [{"LOWER": "but"}]
# Register the pattern; "BUT" is an arbitrary rule name (list form needs spaCy >= 2.2.2)
matcher.add("BUT", [pattern])
doc = nlp("I think this is great, but I would not do it again")
matches = matcher(doc)
for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # Get string representation
    span = doc[start:end]  # The matched span
    print(span.text)
This code returns only but.
Also, is it possible to build the pattern from a list of strings, for example:
list_of_match_words = ['but', 'particularly']
pattern = [{'LOWER'}: list_of_match_words}]
Or something similar? I know the above won't run.
【Discussion】:
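A minimal sketch of one possible approach to both parts of the question, not a definitive answer. It assumes a spaCy version with the extended pattern syntax (the "IN" operator) and the list-of-patterns form of matcher.add. The rule name "TRIGGER", the results list, and the use of spacy.blank("en") instead of en_core_web_sm are choices made only for this illustration. The pattern matches just the trigger word, and the text after it is recovered by slicing the doc from the match start.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank English pipeline (tokenizer only) is enough here,
# because the pattern relies only on the LOWER token attribute.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# "IN" matches a token whose lowercase form is any member of the list.
list_of_match_words = ["but", "particularly"]
pattern = [{"LOWER": {"IN": list_of_match_words}}]
matcher.add("TRIGGER", [pattern])  # "TRIGGER" is an arbitrary rule name

doc = nlp("I think this is great, but I would not do it again")

results = []
for match_id, start, end in matcher(doc):
    # The match covers only the trigger token, so slice from its start
    # to the end of the doc to keep everything that follows it.
    results.append(doc[start:].text)

print(results)
```

Slicing to the end of the doc only makes sense for single-sentence input. With a pipeline that sets sentence boundaries, something like doc[start:doc[start:end].sent.end] should stop at the end of the sentence containing the match instead.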