BAG: Bi-directional Attention Entity Graph Convolutional Network for Multi-hop Reasoning QA

BAG: Bi-directional Attention Entity Graph Convolutional Network for Multi-hop Reasoning Question Answering (NAACL 2019): Reading Notes

Motivation:	Moving from single-hop to multi-hop question answering.
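Contributions:	1. Bi-directional attention between nodes and the query, used to learn a query-aware representation. 2. Multi-level features for building the node and query representations, with NER and POS features added. [If BERT were used instead, I am not sure how much these hand-crafted features would still matter.]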
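Model:	1. Entity Graph Construction
All candidates appearing in the document set form the vertices of the entity graph. Undirected edges are then defined from the positional properties of node pairs:
1) Cross-document edges: between occurrences of the same entity in different documents.
2) Within-document edges: between every pair of entities in the same document.

2. Multi-level Features
The authors represent both the nodes and the query with features at several levels.
Node representation: given a node, the GloVe embeddings of all its tokens are averaged to give a token-level feature, and the ELMo embeddings of all its tokens are averaged to give a context-level feature. The two averages are fed into a 1-layer linear network that encodes and fuses them. Two hand-crafted features, NER (named entity recognition) and POS (part-of-speech tagging), are then appended to reflect the semantic properties of the tokens. Together these give the final node representation.
Query representation: the query is encoded with a Bi-LSTM, and the final query representation is then formed analogously.

3. GCN Layer
[Honestly, I did not follow this part. The authors do not explain how one of the terms is obtained, and the second half of the section then introduces yet another computation. Why?]

4. Bi-directional Attention Between a Graph and a Query
(1) Compute the similarity matrix from the representations of all nodes at the last GCN layer and the encoded query feature representation. (I did not see such a description in the text, or perhaps I missed it.)
(2) Compute the node-to-query attention.
(3) Compute the query-to-node attention.
(4) Final output: "Our bi-directional attention layer is the concatenation of the original nodes feature, nodes-to-query attention, the element-wise multiplication of nodes feature and nodes-to-query attention, and multiplication of nodes feature and query-to-nodes attention."

5. Output Layer
After a fully connected layer with tanh activation, a softmax layer estimates, for each node in the graph, the probability that it is the answer. Because a candidate can appear in the graph several times (for example, when the same entity occurs in different documents), the probability of each candidate is the sum over all of its corresponding nodes.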
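To make steps 4 and 5 more concrete for myself, here is a minimal NumPy sketch. It is not the authors' implementation. The notes above do not reproduce the paper's formulas, so the similarity function is assumed to be a BiDAF-style linear scoring of [h; q; h * q] with a hypothetical weight vector `w`, and the query-to-node attention is assumed to take the maximum similarity over query positions. The names `bi_attention`, `candidate_probs` and `node_to_candidate` are made up for illustration. Only the four-way concatenation and the sum over nodes of the same candidate come directly from the description above.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bi_attention(h_n, f_q, w):
    """Bi-directional attention between graph nodes and a query (sketch).

    h_n: (N, d) node features from the last GCN layer
    f_q: (M, d) encoded query features
    w:   (3d,)  similarity weights (hypothetical parameter)
    Returns an (N, 4d) matrix.
    """
    N, d = h_n.shape
    M = f_q.shape[0]
    hn = np.broadcast_to(h_n[:, None, :], (N, M, d))
    fq = np.broadcast_to(f_q[None, :, :], (N, M, d))
    # Assumed similarity: S[i, j] = w . [h_i; q_j; h_i * q_j]
    S = np.concatenate([hn, fq, hn * fq], axis=-1) @ w  # (N, M)

    # Node-to-query: each node attends over the query positions.
    a_n2q = softmax(S, axis=1) @ f_q  # (N, d)

    # Query-to-node (assumed form): weight nodes by their best-matching
    # query position, pool them into one vector, and repeat it for every node.
    b = softmax(S.max(axis=1), axis=0)  # (N,)
    a_q2n = np.tile(b @ h_n, (N, 1))  # (N, d)

    # Concatenation described in step 4 (4).
    return np.concatenate([h_n, a_n2q, h_n * a_n2q, h_n * a_q2n], axis=1)


def candidate_probs(node_scores, node_to_candidate, num_candidates):
    """Softmax over nodes, then sum the probabilities of nodes that
    belong to the same candidate (step 5)."""
    p = softmax(node_scores, axis=0)
    out = np.zeros(num_candidates)
    np.add.at(out, node_to_candidate, p)  # unbuffered scatter-add
    return out


# Toy usage with random features: 5 nodes, 3 query tokens, d = 4.
rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))
q = rng.normal(size=(3, 4))
out = bi_attention(h, q, rng.normal(size=12))
print(out.shape)  # (5, 16)
```

In the real model a fully connected tanh layer would sit between `bi_attention` and `candidate_probs`. It is left out here to keep the sketch short.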
Summary:	Bi-attention can be computed between nodes and the query. Beyond that, the authors leave many points unexplained, and I could not follow them.
