【Question Title】: Stanford CoreNLP: How to get RelationTriple triples with only Named Entities (OpenIE)?
【Posted】: 2017-08-07 17:44:49
【Question Description】:

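I am currently looking for relation triples (subject, predicate, object) in CoreNLP Open Information Extraction (OpenIE) whose subject and object consist only of named entities. However, I don't know how to get the entity types of a RelationTriple object, that is, of its List<CoreMap>.

Below is the code from https://stanfordnlp.github.io/CoreNLP/openie.html: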

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.util.CoreMap;

import java.util.Collection;
import java.util.Properties;

/**
 * A demo illustrating how to call the OpenIE system programmatically.
 */
public class OpenIEDemo {

  public static void main(String[] args) throws Exception {
    // Create the Stanford CoreNLP pipeline
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse,natlog,openie");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // Annotate an example document.
    Annotation doc = new Annotation("Obama was born in Hawaii. He is our president.");
    pipeline.annotate(doc);

    // Loop over sentences in the document
    for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
      // Get the OpenIE triples for the sentence
      Collection<RelationTriple> triples = sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
      // Print the triples
      for (RelationTriple triple : triples) {
        // Here is where I get the entity type from a triple's subject or object
        System.out.println(triple.confidence + "\t" +
            triple.subjectLemmaGloss() + "\t" +
            triple.relationLemmaGloss() + "\t" +
            triple.objectLemmaGloss());
      }
    }
  }
}

If there is some way to get the entity type from the RelationTriple class, I would appreciate hearing about it.

【Comments】:

    Tags: java stanford-nlp information-extraction


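    【Answer 1】:

    The subject and object instance variables should be lists of CoreLabels, which carry named entity information accessible through the #ner() method. Something like the following should do what you want: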

    // Also needs java.util.List and java.util.stream.Collectors
    Collection<RelationTriple> triples =
        sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
    List<RelationTriple> withNE = triples.stream()
        // Make sure the subject is entirely named entities
        .filter(triple ->
            triple.subject.stream().noneMatch(token -> "O".equals(token.ner())))
        // Make sure the object is entirely named entities
        .filter(triple ->
            triple.object.stream().noneMatch(token -> "O".equals(token.ner())))
        // Convert the stream back to a list
        .collect(Collectors.toList());
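
One caveat, offered as a reading of the question's code rather than as part of the answer: the annotator list in the question does not include `ner`, so the tokens would carry no named entity tag, `ner()` would presumably return null, and the `"O".equals(...)` check would let every triple through. A minimal sketch of the property setup with `ner` added is shown below. The class name `NerPipelineProps` is hypothetical, and placing `ner` before `depparse` is an assumption about a workable order.

```java
import java.util.Properties;

// Hypothetical helper: builds the question's pipeline properties with "ner" added.
class NerPipelineProps {

  static Properties build() {
    Properties props = new Properties();
    // Same annotators as in the question, plus "ner" so tokens get entity tags
    props.setProperty("annotators",
        "tokenize,ssplit,pos,lemma,ner,depparse,natlog,openie");
    return props;
  }

  public static void main(String[] args) {
    // Prints the annotator list that would be passed to StanfordCoreNLP
    System.out.println(build().getProperty("annotators"));
  }
}
```

The resulting `Properties` object would be passed to `new StanfordCoreNLP(props)` exactly as in the question's demo.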
    

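    【Discussion】:

    • What I want is to get the labels of the entities. For example, "Temer [PERSON] is [NR] the referee of Brazil [LOCATION]". The RelationTriple class only has the gloss of the types, not the labels.
    • The .ner function on the triple's tokens should give you the type, e.g. PERSON.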
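
The exchange above can be sketched as a small helper that turns the per-token tags of a subject or object into a single label. To stay self-contained, this sketch uses a hypothetical `Token` class as a stand-in for CoreLabel; the names `EntityTypeSketch` and `entityType` are also assumptions, not part of CoreNLP. It treats a span as an entity only when every token shares the same non-"O" tag.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class EntityTypeSketch {

  // Hypothetical stand-in for CoreLabel: just a word and its NER tag.
  static final class Token {
    final String word;
    final String ner;
    Token(String word, String ner) { this.word = word; this.ner = ner; }
    String ner() { return ner; }
  }

  // Returns the shared tag if all tokens carry the same non-"O" tag, else empty.
  static Optional<String> entityType(List<Token> span) {
    if (span.isEmpty()) return Optional.empty();
    String first = span.get(0).ner();
    // A null tag (NER not run) or "O" means the span is not an entity
    if (first == null || "O".equals(first)) return Optional.empty();
    boolean uniform = span.stream().allMatch(t -> first.equals(t.ner()));
    return uniform ? Optional.of(first) : Optional.empty();
  }

  public static void main(String[] args) {
    List<Token> subject = Arrays.asList(new Token("Temer", "PERSON"));
    List<Token> object = Arrays.asList(new Token("Brazil", "LOCATION"));
    System.out.println(entityType(subject).orElse("O")); // PERSON
    System.out.println(entityType(object).orElse("O"));  // LOCATION
  }
}
```

With CoreNLP itself, the same check would read each tag through `ner()` on the tokens in `triple.subject` and `triple.object`, as the answer describes. Requiring a uniform tag is a design choice; a looser variant could accept any span without "O" tokens.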