ACL2017 Enhancing and Combining Sequential and Tree LSTM for Natural Language Inference

The paper emphasizes combining syntactic and semantic information for the NLI (natural language inference) task.
In many problems, syntax and semantics interact closely, as generally phrased in the slogan “the syntax and the semantics work together in tandem” (Barker and Jacobson, 2007), among others.

The model architecture is shown in the figure below:
[Figure: model architecture]

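We are given two sentences $a = (a_{1}, \ldots, a_{l_a})$ and $b = (b_{1}, \ldots, b_{l_b})$, where $a$ is the premise and $b$ is the hypothesis. Each $a_{i}$ and $b_{j}$ is an $l$-dimensional word embedding, which can be obtained by pre-training. The goal is to predict a label $y$ that indicates the logical relationship between $a$ and $b$.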

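A bidirectional LSTM (BiLSTM) processes each sequence in order. The hidden state $\bar{a}_{i}$ at time step $i$ of the premise and the hidden state $\bar{b}_{j}$ at time step $j$ of the hypothesis are computed as follows: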
$$\bar{a}_{i} = \mathrm{BiLSTM}(a, i), \quad \forall i \in [1, \ldots, l_a] \tag{1}$$

$$\bar{b}_{j} = \mathrm{BiLSTM}(b, j), \quad \forall j \in [1, \ldots, l_b] \tag{2}$$
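The encoding step can be illustrated with a minimal sketch in plain Python. It assumes a standard LSTM cell with input, forget, and output gates plus a candidate update, and it assumes that the forward and backward hidden states are concatenated at each position. The helper names (`init_params`, `lstm_step`, `run_lstm`, `bilstm`), the random initialization, and the toy dimensions are hypothetical and are not taken from the paper's code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_dim, hidden_dim, rng):
    """Random weights for the four gates (hypothetical initialization)."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    # One (weight matrix, bias vector) pair per gate: i, f, o, g.
    return {gate: (matrix(), [0.0] * hidden_dim) for gate in "ifog"}


def lstm_step(x, h, c, params):
    """One LSTM time step on input x with previous state (h, c)."""
    z = x + h  # list concatenation: [x; h]

    def affine(gate):
        weights, bias = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(weights, bias)]

    i = [sigmoid(v) for v in affine("i")]    # input gate
    f = [sigmoid(v) for v in affine("f")]    # forget gate
    o = [sigmoid(v) for v in affine("o")]    # output gate
    g = [math.tanh(v) for v in affine("g")]  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(seq, params, hidden_dim):
    """Run a unidirectional LSTM and return the hidden state at every step."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    states = []
    for x in seq:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states


def bilstm(seq, params_fwd, params_bwd, hidden_dim):
    """Concatenate forward and backward hidden states at each position."""
    fwd = run_lstm(seq, params_fwd, hidden_dim)
    bwd = run_lstm(seq[::-1], params_bwd, hidden_dim)[::-1]
    return [hf + hb for hf, hb in zip(fwd, bwd)]


# Toy premise: 3 tokens with 3-dimensional embeddings, hidden size 4.
rng = random.Random(0)
params_fwd = init_params(3, 4, rng)
params_bwd = init_params(3, 4, rng)
premise = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
a_bar = bilstm(premise, params_fwd, params_bwd, 4)
print(len(a_bar), len(a_bar[0]))
```

Each element of `a_bar` plays the role of $\bar{a}_{i}$: its first half summarizes the tokens up to position $i$, and its second half summarizes the tokens from position $i$ to the end.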

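To model the relevance between the premise and the hypothesis, an attention weight matrix is built whose elements are $e_{ij}$. For example, for the hidden state $\bar{a}_{i}$ of a word in the premise (which already encodes the word itself and its context), the semantically relevant content in the hypothesis is defined as follows: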

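$$e_{ij} = \bar{a}_{i}^{\top} \bar{b}_{j} \tag{11}$$

$$\tilde{a}_{i} = \sum_{j=1}^{l_b} \frac{\exp(e_{ij})}{\sum_{k=1}^{l_b} \exp(e_{ik})} \bar{b}_{j}, \quad \forall i \in [1, \ldots, l_a] \tag{12}$$

$$\tilde{b}_{j} = \sum_{i=1}^{l_a} \frac{\exp(e_{ij})}{\sum_{k=1}^{l_a} \exp(e_{kj})} \bar{a}_{i}, \quad \forall j \in [1, \ldots, l_b] \tag{13}$$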

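$\tilde{a}_{i}$ is a weighted sum of $\{\bar{b}_{j}\}_{j=1}^{l_b}$. In other words, the content in $\{\bar{b}_{j}\}_{j=1}^{l_b}$ that is relevant to $\bar{a}_{i}$ is selected and represented as $\tilde{a}_{i}$. Equation 13 can be understood in the same way.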
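The soft alignment can be sketched in a few lines of plain Python. The sketch assumes the dot-product score $e_{ij} = \bar{a}_{i}^{\top} \bar{b}_{j}$ and a softmax over each row (for $\tilde{a}_{i}$) or each column (for $\tilde{b}_{j}$) of the score matrix. The function name `soft_align` and the toy vectors are hypothetical, chosen only for illustration.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    """Sum of vectors scaled by their weights, computed per dimension."""
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)]


def soft_align(a_bar, b_bar):
    """Return the score matrix e and the aligned vectors a_tilde, b_tilde."""
    # e[i][j] is the unnormalized attention score between a_bar[i] and b_bar[j].
    e = [[dot(ai, bj) for bj in b_bar] for ai in a_bar]
    # a_tilde[i]: normalize row i over the hypothesis, then mix b_bar.
    a_tilde = [weighted_sum(softmax(row), b_bar) for row in e]
    # b_tilde[j]: normalize column j over the premise, then mix a_bar.
    b_tilde = [weighted_sum(softmax([e[i][j] for i in range(len(a_bar))]), a_bar)
               for j in range(len(b_bar))]
    return e, a_tilde, b_tilde


# Toy hidden states: 2 premise positions, 3 hypothesis positions.
a_bar = [[1.0, 0.0], [0.0, 1.0]]
b_bar = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
e, a_tilde, b_tilde = soft_align(a_bar, b_bar)
print(e)
print(a_tilde)
print(b_tilde)
```

In this toy case the third hypothesis vector scores equally against both premise vectors, so its aligned vector is their plain average.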