【EMNLP2019】Event Representation Learning Enhanced with External Commonsense Knowledge

p1 in 2019/12/3

Paper title: Event Representation Learning Enhanced with External Commonsense Knowledge
… … … : Commonsense-Enhanced Event Representation Learning (常识信息增强的事件表示学习)
Authors: Xiao Ding, Kuo Liao, Ting Liu, Junwen Duan, Zhongyang Li
Venue: EMNLP 2019
Download link: https://arxiv.org/pdf/1909.05190.pdf
Source code: https://github.com/MagiaSN/CommonsenseERL_EMNLP_2019
Reference notes: https://www.jiqizhixin.com/articles/2019-09-16-9

Abstract

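Previous methods: capture the syntactic and semantic information of text, and have proven effective on downstream tasks such as script event prediction.
Limitation of previous methods: events extracted directly from raw text lack commonsense knowledge, such as the intents and emotions of the event participants. This knowledge helps distinguish event pairs whose surface forms differ only slightly.
Method of this paper: use external commonsense knowledge to capture the intent and sentiment of events.
Evaluation tasks in this paper: 1) event similarity; 2) script event prediction; 3) stock market prediction.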

Commonsense Knowledge Enhanced Event Representations

2.1 Low-Rank Tensor for Event Embedding

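The goal of event representation learning is to learn a low-dimensional dense vector for an event triple E = (A, P, O), where P is the action or predicate, A is the actor or subject, and O is the object of the action. An event representation model composes the representations of the predicate, subject, and object.
Following Ding et al. (2015), this paper uses a Neural Tensor Network (NTN) as the event representation model. The structure of the NTN is shown in Figure 3:
(Figure 3: structure of the NTN event representation model)
The model uses bilinear transformations to explicitly model the interactions between predicate and subject, between predicate and object, and among all three.
Besides the tensors of the bilinear terms, the remaining parameters form a standard feed-forward neural network: W₁ is the weight matrix of the feed-forward network, b is the bias, and f = tanh is the activation function.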
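The composition described above can be sketched in a few lines of plain Python. This is a minimal sketch, not the authors' code: it assumes each layer has the usual NTN form f(xᵀ T^[1:k] y + W [x; y] + b), applied first to (A, P), then to (P, O), then to the two intermediate vectors. The names `bilinear`, `ntn_layer`, `init_layer`, and `event_embedding`, the per-layer weights, and the sizes d = k = 4 are all illustrative assumptions.

```python
import math
import random


def bilinear(x, slices, y):
    """Return [x^T T_i y for each slice T_i], a vector of length k."""
    return [
        sum(x[a] * T[a][b] * y[b] for a in range(len(x)) for b in range(len(y)))
        for T in slices
    ]


def ntn_layer(x, y, slices, W, b):
    """One NTN layer: tanh(x^T T^{[1:k]} y + W [x; y] + b)."""
    xy = x + y  # list concatenation plays the role of [x; y]
    tensor_term = bilinear(x, slices, y)
    linear_term = [sum(w * v for w, v in zip(row, xy)) for row in W]
    return [math.tanh(t + l + bias) for t, l, bias in zip(tensor_term, linear_term, b)]


def init_layer(d, k, rng):
    """Random parameters for one layer: k slices of d x d, W of k x 2d, b of k."""
    slices = [[[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]
              for _ in range(k)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(2 * d)] for _ in range(k)]
    b = [0.0] * k
    return slices, W, b


def event_embedding(A, P, O, params):
    """Compose subject A, predicate P, and object O into one event vector."""
    s1 = ntn_layer(A, P, *params["T1"])      # subject-predicate interaction
    s2 = ntn_layer(P, O, *params["T2"])      # predicate-object interaction
    return ntn_layer(s1, s2, *params["T3"])  # interaction among all three


rng = random.Random(0)
d = k = 4  # hypothetical sizes; k = d lets the intermediate vectors feed T3 directly
params = {name: init_layer(d, k, rng) for name in ("T1", "T2", "T3")}
A, P, O = ([rng.uniform(-1, 1) for _ in range(d)] for _ in range(3))
v_e = event_embedding(A, P, O, params)
```

In practice A, P, and O would be word embeddings (averaged when an argument has several words), and the parameters would be trained rather than sampled.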
One problem with the NTN is the curse of dimensionality, so this paper uses low-rank tensor decomposition to approximate the high-order tensors and reduce the number of model parameters. The decomposition process is shown in Figure 4. Specifically, the tensor T₁ of the original NTN is approximated by [T_appr]₁, which is computed slice by slice.
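To make the parameter saving concrete, here is a small sketch of one common low-rank form. It assumes, and this is an assumption rather than something these notes preserve, that slice i is approximated as T_l1[i] × T_l2[i] + diag(t[i]), with T_l1[i] of shape d × n, T_l2[i] of shape n × d, and t[i] a vector of length d. The helper names and the sizes are hypothetical.

```python
def matmul(X, Y):
    """Multiply an (r x n) matrix by an (n x c) matrix, both given as nested lists."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def approx_slice(L1, L2, t):
    """One approximated slice: L1 @ L2 + diag(t), with L1 of d x n and L2 of n x d."""
    S = matmul(L1, L2)
    for i, t_i in enumerate(t):
        S[i][i] += t_i  # add the diagonal correction term
    return S


def param_counts(k, d, n):
    """Parameter counts of a full k x d x d tensor and of its low-rank form."""
    full = k * d * d
    low_rank = k * (2 * d * n + d)
    return full, low_rank


# Tiny worked slice with d = 2 and n = 1.
S = approx_slice([[1.0], [2.0]], [[3.0, 4.0]], [0.5, 0.5])
# S == [[3.5, 4.0], [6.0, 8.5]]

full, low_rank = param_counts(k=100, d=100, n=20)  # hypothetical sizes
# full == 1000000, low_rank == 410000
```

The saving grows as n shrinks relative to d, at the cost of a less expressive bilinear term.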

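For each event in the training set, one of its arguments is randomly replaced with another word. The paper assumes that the original event should receive a higher score than the corrupted event, and computes a hinge loss over the two events.
In this loss, v_e is the vector representation of the event, v_e^r is the vector representation after an event argument has been replaced, and g(v_e) is the score of the event.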
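The corruption step and the hinge loss can be illustrated as follows. This is a sketch under stated assumptions: the margin of 1 is a conventional choice, the scores are passed in as plain numbers because the scoring function g is not given in these notes, and `corrupt_event` and the toy vocabulary are hypothetical.

```python
import random


def hinge_loss(score_pos, score_neg, margin=1.0):
    """max(0, margin - score_pos + score_neg): zero once the original wins by the margin."""
    return max(0.0, margin - score_pos + score_neg)


def corrupt_event(event, vocab, rng):
    """Replace one randomly chosen argument of (A, P, O) with a different word."""
    slot = rng.randrange(3)
    replacement = rng.choice([w for w in vocab if w != event[slot]])
    corrupted = list(event)
    corrupted[slot] = replacement
    return tuple(corrupted)


rng = random.Random(0)
event = ("police", "arrest", "suspect")
vocab = ["police", "arrest", "suspect", "banana", "sing", "river"]  # toy vocabulary
negative = corrupt_event(event, vocab, rng)

loss_easy = hinge_loss(score_pos=2.5, score_neg=0.5)    # already separated: 0.0
loss_hard = hinge_loss(score_pos=0.25, score_neg=0.75)  # ranking violated: 1.5
```

Only violated pairs contribute a gradient, so training concentrates on corrupted events that the model still scores too highly.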

2.2 Intent Embedding

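Similarly, each event in the training set has a manually annotated correct intent. An incorrect intent is randomly sampled from all intents, and the correct intent is assumed to score higher than the incorrect one. Specifically, a bidirectional LSTM encodes the intent text into a vector, the cosine similarity between the intent vector and the event vector serves as the intent score, and a hinge loss is computed over the two scores.
In this loss, v_i is the vector representation of the correct intent, and v_i^r is the vector representation of the randomly sampled incorrect intent.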
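A minimal sketch of the intent loss, assuming a margin of 1 and taking the intent vectors as given (the bidirectional LSTM encoder is left out). The function names and toy vectors are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def intent_hinge_loss(v_e, v_i, v_i_neg, margin=1.0):
    """Hinge loss that pushes cos(v_e, v_i) above cos(v_e, v_i_neg) by the margin."""
    return max(0.0, margin - cosine(v_e, v_i) + cosine(v_e, v_i_neg))


v_e = [1.0, 0.0]       # event vector
v_good = [1.0, 0.0]    # correct intent, aligned with the event
v_bad = [0.0, 1.0]     # sampled incorrect intent, orthogonal to the event

loss_ok = intent_hinge_loss(v_e, v_good, v_bad)       # 0.0
loss_swapped = intent_hinge_loss(v_e, v_bad, v_good)  # 2.0
```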

2.3 Sentiment Embedding

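In addition, each event in the training set has an annotated sentiment polarity label (0 = negative, 1 = positive). The event representation is fed as features into a classifier, which is trained to predict the correct sentiment label so that the event representation carries sentiment polarity information. The training signal is the cross-entropy loss of sentiment classification.
In this loss, E is the set of events in the training set, p_l^g(x_e) is the gold sentiment polarity label of the event, and p_l(x_e) is the sentiment polarity label predicted by the model.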
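The sentiment loss can be sketched as below. The sketch assumes a single linear layer followed by a softmax as the classifier (the notes do not say which classifier is used) and sums the loss over the events. With one-hot gold labels, the double sum over events and labels reduces to the negative log-probability of the gold label.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(v_e, W, b):
    """Linear layer plus softmax over the two labels (0 = negative, 1 = positive)."""
    logits = [sum(w * x for w, x in zip(row, v_e)) + bias for row, bias in zip(W, b)]
    return softmax(logits)


def sentiment_loss(events, gold_labels, W, b):
    """Sum over events of -log p(gold label | event)."""
    return sum(-math.log(predict(v, W, b)[y]) for v, y in zip(events, gold_labels))


events = [[1.0, 0.0], [0.0, 1.0]]  # toy event vectors
gold = [1, 0]

W_zero = [[0.0, 0.0], [0.0, 0.0]]  # untrained: predicts 50/50
W_good = [[0.0, 2.0], [2.0, 0.0]]  # row 0 scores "negative", row 1 scores "positive"
b = [0.0, 0.0]

loss_untrained = sentiment_loss(events, gold, W_zero, b)  # 2 * ln 2
loss_trained = sentiment_loss(events, gold, W_good, b)    # smaller
```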

2.4 Joint Event, Intent and Sentiment Embedding

The final optimization objective is a weighted sum of the three losses.
The overall architecture of the model is shown in Figure 2:
(Architecture of the event representation model that incorporates intent and sentiment information)

Note: all symbols with the subscript neg denote negative samples.
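The weighted sum of the three losses amounts to a single line of code. In this sketch the weight names `alpha`, `beta`, and `gamma` and their values are placeholders, not settings taken from the paper.

```python
def joint_loss(loss_event, loss_intent, loss_sentiment, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of the event, intent, and sentiment losses."""
    return alpha * loss_event + beta * loss_intent + gamma * loss_sentiment


# Hypothetical per-batch losses and weights.
total = joint_loss(1.5, 2.0, 0.5, alpha=1.0, beta=0.5, gamma=0.5)  # 2.75
```

Because all three terms share the same event encoder, one backward pass through the total updates the event representation with all three signals.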

Experiments

3.2 Event Similarity Evaluation

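This paper compares the model with baseline methods on two event similarity tasks: Hard Similarity and Transitive Sentence Similarity.

The Hard Similarity task was proposed by Weber et al. (2018). It constructs two types of event pairs: in the first type, the two events are semantically close but share almost no words; in the second type, the two events overlap heavily in words but are semantically far apart. For each event representation method, the cosine similarity of each event pair is computed as its score, and the model's accuracy is the proportion of cases in which the similar pair scores higher than the dissimilar pair.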
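The accuracy measure just described can be computed as follows. This is a minimal sketch with toy 2-dimensional vectors; the layout of `items` (a similar pair and a dissimilar pair per test case) is an assumption about how the data would be organized.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hard_similarity_accuracy(items):
    """Fraction of items whose similar pair scores higher than its dissimilar pair.

    Each item is ((s1, s2), (d1, d2)): event vectors of the similar pair,
    then event vectors of the dissimilar pair.
    """
    correct = sum(1 for (s1, s2), (d1, d2) in items if cosine(s1, s2) > cosine(d1, d2))
    return correct / len(items)


items = [
    # Ranked correctly: the similar pair is nearly parallel, the dissimilar pair orthogonal.
    (([1.0, 0.0], [0.9, 0.1]), ([1.0, 0.0], [0.0, 1.0])),
    # Ranked wrongly: the "similar" pair is orthogonal, the "dissimilar" pair identical.
    (([1.0, 0.0], [0.0, 1.0]), ([1.0, 0.0], [1.0, 0.0])),
]
accuracy = hard_similarity_accuracy(items)  # 0.5
```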

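The Transitive Sentence Similarity dataset (Kartsaklis and Sadrzadeh, 2014) contains 108 event pairs, each with a human-annotated similarity score. Spearman's correlation coefficient is used to measure how well the similarity given by the model agrees with the human annotations.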
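For reference, Spearman's correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below implements it with the standard library only; the scores are made up for illustration.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group order[i..j]
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


model_scores = [0.9, 0.2, 0.5, 0.7]  # made-up cosine similarities
human_scores = [6.5, 1.0, 3.0, 5.0]  # made-up human ratings
rho = spearman(model_scores, human_scores)  # same ordering, so rho is close to 1
```

Because only the ordering matters, the model's cosine scores and the human ratings do not need to be on the same scale.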

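Table 1: Event similarity results
The results of the two tasks are shown in Table 1. The findings are:
(1) On the Transitive Sentence Similarity task, averaging word vectors achieves good results, but it performs poorly on the Hard Similarity task. The main reason is that the Hard Similarity dataset is designed specifically to separate the cases "high word overlap but semantically dissimilar" and "low word overlap but semantically similar". Averaging word vectors cannot capture the interactions between event arguments, so it cannot do well on this dataset.
(2) Tensor-based composition models (NTN, KGEB, Role Factor Tensor, Predicate Tensor) outperform the additive models (Comp. NN, EM Comp.), which shows that tensor-based composition models the semantic composition of event arguments better.
(3) The commonsense-enhanced event representation of this paper outperforms the baselines on both datasets (with improvements of 78% and 200% on the small and large Hard Similarity datasets, respectively), showing that commonsense knowledge plays an important role in distinguishing events.
Table 2 shows how event similarity on the Hard Similarity task changes before (oScore) and after (mScore) commonsense information is added.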

Table 2: Event similarity before and after adding commonsense information

3.3 Script Event Prediction

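The script event prediction task (Chambers and Jurafsky, 2008) is defined as follows: given a sequence of context events, choose the most likely next event from a set of candidate events.
The model is evaluated on the standard MCNC dataset (Granroth-Wilding and Clark, 2016). Following the SGNN model of Li et al. (2018), only the event representation component of SGNN is replaced with the event representation model of this paper. The results in Table 3 show an improvement of 1.5% for the single model and 1.4% for the multi-model ensemble, confirming the importance of better event representations for this task. The event representation that incorporates only intent already outperforms the other baselines, which suggests that capturing participants' intents helps infer their subsequent activities. The event representation that incorporates only sentiment also does better than the original SGNN, mainly because sentiment consistency between successive events also helps predict the next event.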
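The task setup (pick one candidate given the context events) can be illustrated with a toy scorer. Note that this is not SGNN, which reasons over an event graph. As a deliberately simple stand-in, the sketch ranks each candidate by its mean cosine similarity to the context event vectors. All vectors and names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def choose_next_event(context, candidates):
    """Index of the candidate with the highest mean cosine similarity to the context."""
    def score(candidate):
        return sum(cosine(candidate, e) for e in context) / len(context)
    return max(range(len(candidates)), key=lambda i: score(candidates[i]))


context = [[1.0, 0.0], [0.8, 0.2]]                   # toy context event vectors
candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]   # toy candidate event vectors
best = choose_next_event(context, candidates)        # 1
```

Swapping in a better event representation changes only the vectors fed to the scorer, which is the same plug-in point the experiment uses inside SGNN.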

Table 3: Script event prediction results

3.4 Stock Market Prediction

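Previous studies show that news events affect the rise and fall of stock prices (Luss and d'Aspremont, 2012). This paper compares the results of predicting stock market movement with different event representations as features, as shown in Figure 5. The results demonstrate the effectiveness of the sentiment information in events for stock market prediction (an improvement of 2.4%).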

Figure 5: Stock market prediction results

Conclusion

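For a computer to fully understand events, commonsense knowledge must be incorporated into event representations. High-quality event representations play an important role in many downstream tasks, such as script event prediction and stock market prediction. This paper proposes a simple yet effective event representation learning framework that incorporates intent and sentiment commonsense knowledge into the learning of event representations. Experimental results on three tasks (event similarity, script event prediction, and stock market prediction) show that the method effectively improves the quality of event representations and brings gains on downstream tasks.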

References

Erik Cambria, Soujanya Poria, Devamanyu Hazarika, and Kenneth Kwok. 2018. Senticnet 5: discovering conceptual primitives for sentiment analysis by means of context embeddings. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018.
Nathanael Chambers and Dan Jurafsky. 2008. Unsupervised learning of narrative event chains. In Proceedings of ACL-08: HLT, pages 789–797. Association for Computational Linguistics.
Xiao Ding, Yue Zhang, Ting Liu, and Junwen Duan. 2014. Using structured events to predict stock price movement: An empirical investigation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, pages 1415–1425, Doha, Qatar. Association for Computational Linguistics.
Hannah Rashkin, Maarten Sap, Emily Allaway, Noah A. Smith, and Yejin Choi. 2018. Event2mind: Commonsense inference on events, intents, and reactions. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 463–473. Association for Computational Linguistics.
Zhongyang Li, Xiao Ding, and Ting Liu. 2018b. Generating reasonable and diversified story ending using sequence to sequence model with adversarial training. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1033–1043, Santa Fe, New Mexico, USA. Association for Computational Linguistics.