Sentence-State LSTM for Text Representation

(pages 317–327, July 2018. Association for Computational Linguistics)

1. Baseline BiLSTM

The baseline BiLSTM consists of two LSTMs, one reading the sentence from left to right and the other from right to left.

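For the left-to-right LSTM: the input is a sequence of words together with an initial hidden state. The steps below are applied repeatedly, each consuming one word and turning the previous hidden state $\overrightarrow{h}^{t-1}$ into the next one, $\overrightarrow{h}^{t}$.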

[Formula image not preserved: the LSTM state-transition equations]

$x_t$: the representation of word $w_t$;

$i^t$, $o^t$, $f^t$, $u^t$: the input gate, output gate, forget gate, and actual input;

$W_x$, $U_x$, $b_x$ ($x \in \{i, o, f, u\}$): model parameters;

$\sigma$: the sigmoid function.
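As a sketch of one left-to-right step, the update can be written with the symbols used in this section: $x_t$ for the word input, $i^t, o^t, f^t, u^t$ for the gates and the actual input, $W_x, U_x, b_x$ for the parameters, and $\sigma$ for the sigmoid. The cell state $c^t$, the hatted pre-normalization gates $\hat{i}^t, \hat{f}^t$, and the softmax normalization of the input and forget gates are assumptions made here, not details recovered from these notes. Without that normalization the step reduces to the standard LSTM update.

```latex
\begin{aligned}
\hat{i}^t &= \sigma\left(W_i x_t + U_i \overrightarrow{h}^{t-1} + b_i\right) \\
\hat{f}^t &= \sigma\left(W_f x_t + U_f \overrightarrow{h}^{t-1} + b_f\right) \\
o^t       &= \sigma\left(W_o x_t + U_o \overrightarrow{h}^{t-1} + b_o\right) \\
u^t       &= \tanh\left(W_u x_t + U_u \overrightarrow{h}^{t-1} + b_u\right) \\
i^t,\, f^t &= \mathrm{softmax}\left(\hat{i}^t,\, \hat{f}^t\right) \\
c^t       &= c^{t-1} \odot f^t + u^t \odot i^t \\
\overrightarrow{h}^t &= o^t \odot \tanh\left(c^t\right)
\end{aligned}
```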

The right-to-left LSTM is analogous, except that it starts from its own initial state, takes the words in reverse order as input, and outputs the backward hidden states $\overleftarrow{h}^{t}$.

Finally, the BiLSTM uses the concatenation $[\overrightarrow{h}^{t}; \overleftarrow{h}^{t}]$ as the hidden vector of word $w_t$, and uses the concatenation of the two LSTMs' final states as the representation of the whole sentence.
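The BiLSTM described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names (`init_params`, `lstm_step`, `run_lstm`, `bilstm`), the random initialisation, the zero initial states, and the softmax normalization of the input and forget gates are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(d_in, d_h, rng):
    """Hypothetical random (W_x, U_x, b_x) for x in {i, f, o, u}."""
    return {
        k: (rng.normal(0, 0.1, (d_h, d_in)),
            rng.normal(0, 0.1, (d_h, d_h)),
            np.zeros(d_h))
        for k in "ifou"
    }

def lstm_step(p, x, h_prev, c_prev):
    """One LSTM step: consume one word vector x, return the new (h, c)."""
    pre = {k: W @ x + U @ h_prev + b for k, (W, U, b) in p.items()}
    i_hat, f_hat = sigmoid(pre["i"]), sigmoid(pre["f"])
    o = sigmoid(pre["o"])
    u = np.tanh(pre["u"])
    # Assumption: input and forget gates are normalized with an
    # elementwise softmax so that i + f = 1 in every dimension.
    e_i, e_f = np.exp(i_hat), np.exp(f_hat)
    i, f = e_i / (e_i + e_f), e_f / (e_i + e_f)
    c = f * c_prev + i * u
    h = o * np.tanh(c)
    return h, c

def run_lstm(p, xs):
    """Run one direction over xs (T, d_in); returns hidden states (T, d_h)."""
    d_h = p["i"][2].shape[0]
    h, c = np.zeros(d_h), np.zeros(d_h)  # assumption: zero initial state
    hs = []
    for x in xs:
        h, c = lstm_step(p, x, h, c)
        hs.append(h)
    return np.stack(hs)

def bilstm(p_fwd, p_bwd, xs):
    fwd = run_lstm(p_fwd, xs)
    # Read the words in reverse, then flip back so row t belongs to word t.
    bwd = run_lstm(p_bwd, xs[::-1])[::-1]
    word_states = np.concatenate([fwd, bwd], axis=1)
    # Final state of each direction: last row of fwd, first row of bwd.
    sentence = np.concatenate([fwd[-1], bwd[0]])
    return word_states, sentence
```

Calling `bilstm` on a `(T, d_in)` array of word vectors returns a `(T, 2 * d_h)` matrix of per-word hidden vectors and a `2 * d_h` sentence vector.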

2. Sentence-State LSTM

[Figure not preserved: the S-LSTM model]

Sentence-State LSTM is abbreviated S-LSTM.

At time step $t$, the state can be written as $H^t = \langle h_0^t, \dots, h_{n+1}^t, g^t \rangle$.

$h_i^t$: the sub-state of word $w_i$; $g^t$: the sentence-level sub-state.

For the initial state $H^0$, set $h_i^0 = g^0 = h^0$, where $h^0$ is a parameter.

As shown in the figure above, each step derives the new state from the previous one. The word sub-states $h_i^t$ are computed from the values at time step $t-1$ as follows:

[Formula image not preserved: update equations for the word sub-states $h_i^t$]

$\xi_i^t$: the hidden vectors of a context window, concatenated;

$i_i^t$, $l_i^t$, $r_i^t$, $f_i^t$, $s_i^t$, $o_i^t$: gates;

$W_x$, $U_x$, $V_x$, $b_x$ ($x \in \{i, o, l, r, f, s, u\}$): model parameters;

$\sigma$: the sigmoid function.
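The word-level update can be sketched in NumPy as below. It is an illustration under stated assumptions rather than the paper's code: each word state is assumed to combine its left neighbor, itself, its right neighbor, the sentence state, and its own input through five softmax-normalized gates plus an output gate. The names `init_word_params`, `shift`, and `update_words`, the random initialisation, and the zero padding at the sentence boundaries are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_word_params(d, d_x, rng):
    """Hypothetical random (W_k, U_k, V_k, b_k) for k in {i, l, r, f, s, o, u}."""
    return {
        k: (rng.normal(0, 0.1, (d, 3 * d)),   # W_k acts on the context window
            rng.normal(0, 0.1, (d, d_x)),     # U_k acts on the word input x_i
            rng.normal(0, 0.1, (d, d)),       # V_k acts on the sentence state g
            np.zeros(d))
        for k in "ilrfsou"
    }

def shift(a, k):
    """Row j of the result is a[j + k]; rows that fall outside are zero."""
    out = np.zeros_like(a)
    if k > 0:
        out[:-k] = a[k:]
    elif k < 0:
        out[-k:] = a[:k]
    else:
        out[:] = a
    return out

def update_words(p, x, h, c, g, c_g):
    """Compute all word states for step t from the step t-1 values.

    x: (n, d_x) word inputs; h, c: (n, d) word hidden/cell states;
    g, c_g: (d,) sentence hidden/cell state.
    """
    left_h, right_h = shift(h, -1), shift(h, 1)
    left_c, right_c = shift(c, -1), shift(c, 1)
    xi = np.concatenate([left_h, h, right_h], axis=1)        # (n, 3d)
    pre = {k: xi @ W.T + x @ U.T + g @ V.T + b
           for k, (W, U, V, b) in p.items()}
    o = sigmoid(pre["o"])
    u = np.tanh(pre["u"])
    # Assumption: the five gates i, l, r, f, s are normalized with a softmax
    # so that they sum to 1 for every word and every dimension.
    e = np.exp(np.stack([sigmoid(pre[k]) for k in "ilrfs"]))  # (5, n, d)
    i, l, r, f, s = e / e.sum(axis=0)
    c_new = l * left_c + f * c + r * right_c + s * c_g + i * u
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

Every word is updated in one vectorized call, which reflects that the word states within a step do not depend on each other's new values, only on the previous step.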

$g^t$ is computed from the previous-step word sub-states $h_i^{t-1}$ (for all $i$) as follows:

[Formula image not preserved: update equations for the sentence-level sub-state $g^t$]

$f_0^t, \dots, f_{n+1}^t$ and $f_g^t$: control gates, one per word plus one for the sentence-level node;

$o^t$: the output gate;

$W_x$, $U_x$, $b_x$ ($x \in \{g, f, o\}$): model parameters.
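The sentence-level update can be sketched in the same way. The sketch assumes that the control gates (one per word and one for the sentence node) are softmax-normalized across all nodes and that the gate inputs are the previous sentence state plus either a single word state or the average of all word states. These details, along with the names `init_sentence_params` and `update_sentence`, are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_sentence_params(d, rng):
    """Hypothetical random (W_k, U_k, b_k) for k in {g, f, o}."""
    return {
        k: (rng.normal(0, 0.1, (d, d)),
            rng.normal(0, 0.1, (d, d)),
            np.zeros(d))
        for k in "gfo"
    }

def update_sentence(p, h, c, g, c_g):
    """Compute the sentence state for step t from the step t-1 values.

    h, c: (n, d) word hidden/cell states; g, c_g: (d,) sentence state.
    """
    (Wg, Ug, bg), (Wf, Uf, bf), (Wo, Uo, bo) = p["g"], p["f"], p["o"]
    h_bar = h.mean(axis=0)                               # average word state
    f_g_hat = sigmoid(Wg @ g + Ug @ h_bar + bg)          # (d,)  sentence gate
    f_hat = sigmoid(g @ Wf.T + h @ Uf.T + bf)            # (n, d) word gates
    o = sigmoid(Wo @ g + Uo @ h_bar + bo)                # (d,)  output gate
    # Assumption: normalize the n word gates and the sentence gate together,
    # so that they sum to 1 in every dimension.
    e = np.exp(np.vstack([f_hat, f_g_hat[None, :]]))     # (n + 1, d)
    gates = e / e.sum(axis=0)
    f, f_g = gates[:-1], gates[-1]
    c_g_new = f_g * c_g + (f * c).sum(axis=0)
    g_new = o * np.tanh(c_g_new)
    return g_new, c_g_new
```

Because the gates sum to 1, the new sentence cell is a weighted average of the previous sentence cell and the word cells.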

Comparison with BiLSTM

BiLSTM uses a single state to represent the subsequence from the start of the sentence up to the current word.

S-LSTM uses a structured state to represent the whole sentence. Because of the sentence-level node $g$, the state carries more information.

Context window size

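The context window size controls how much information is exchanged with neighboring words. With a window size of 2, $\xi_i^t = [h_{i-2}^{t-1}, h_{i-1}^{t-1}, h_i^{t-1}, h_{i+1}^{t-1}, h_{i+2}^{t-1}]$.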
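A context window of arbitrary size can be built as in the sketch below. The helper name `context_window` is hypothetical, and padding with zero vectors beyond the sentence boundaries is an assumption made here.

```python
import numpy as np

def context_window(h, w):
    """Concatenate each word state with its w left and w right neighbors.

    h: (n, d) word states from the previous step.
    Returns an (n, (2 * w + 1) * d) matrix whose row i is
    [h[i - w], ..., h[i], ..., h[i + w]], with zeros outside the sentence.
    """
    n, d = h.shape
    padded = np.vstack([np.zeros((w, d)), h, np.zeros((w, d))])
    # Slice k holds the neighbor at offset k - w for every position.
    return np.concatenate([padded[k:k + n] for k in range(2 * w + 1)], axis=1)
```

A larger `w` lets each word see more neighbors per step, at the cost of a wider input to every gate.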

Additional sentence-level nodes

More than one $g$ node could be added.

3. Task settings

1 Classification: $y = \mathrm{softmax}(W_c\, g + b_c)$, where $y$ is the probability distribution over the output labels.
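The classification layer is a softmax over a linear map of the sentence state. A minimal sketch follows, with hypothetical names `softmax` and `classify` and with `W_c` and `b_c` standing for the classifier weights and bias:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def classify(W_c, b_c, g):
    """Map the sentence state g (d,) to a distribution over the labels.

    W_c: (num_labels, d) weight matrix; b_c: (num_labels,) bias.
    """
    return softmax(W_c @ g + b_c)
```

The predicted label is then `np.argmax(classify(W_c, b_c, g))`.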

2 Sequence labeling: each $h_i$ can serve as the feature representation of the corresponding word.

Attention can be added: [formula image not preserved]

A CRF can be added: [formula image not preserved]

4. Experiments

Padding each sentence with <s> and </s> gives somewhat better results.

[Result tables and figures not preserved]
