Speech emotion recognition paper: Emotion Recognition From Speech With Recurrent Neural Networks

Question: Should there be one emotion label per whole recording or one per utterance? If one chooses the utterance-based approach, how should the recording be split? Can a single utterance contain multiple emotions?

Method proposed in the paper: the CTC (Connectionist Temporal Classification) loss function

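What is the CTC loss function? See https://blog.csdn.net/luodongri/article/details/77005948. My understanding: the predicted sequence (one output per frame) and the label sequence generally differ in length, so the two cannot be compared position by position. The CTC loss instead uses the frame-level predictions to compute the probability of the true label sequence, summed over every frame-to-label alignment that collapses to it, and takes the negative log of that probability as the loss.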
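To make the "probability of the label sequence given the predictions" concrete, here is a minimal pure-Python sketch of the standard CTC forward algorithm. It is an illustration of the general technique, not the paper's implementation. The function names, the choice of index 0 as the blank symbol, and the toy probabilities are all assumptions made for this example. A brute-force version that enumerates every alignment is included only as a sanity check for tiny inputs.

```python
import itertools
import math

BLANK = 0  # assumption for this sketch: class index 0 is the CTC blank


def ctc_label_probability(probs, labels, blank=BLANK):
    """Return P(labels | probs), summed over all valid CTC alignments.

    probs  : list of T frames, each a list of per-class probabilities
    labels : target label sequence without blanks (must fit into T frames)
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    size = len(ext)

    # alpha[s]: total probability of partial alignments that are at ext[s]
    # after the current frame
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one symbol
            # Skipping a blank is allowed only between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid alignment ends on the last label or on the trailing blank
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def ctc_loss(probs, labels, blank=BLANK):
    """Negative log-probability of the label sequence."""
    return -math.log(ctc_label_probability(probs, labels, blank))


def collapse(path, blank=BLANK):
    """Map a frame-level path to labels: merge repeats, then drop blanks."""
    out, prev = [], None
    for symbol in path:
        if symbol != prev and symbol != blank:
            out.append(symbol)
        prev = symbol
    return out


def brute_force_probability(probs, labels, blank=BLANK):
    """Sanity check: enumerate every path (exponential, tiny inputs only)."""
    total = 0.0
    for path in itertools.product(range(len(probs[0])), repeat=len(probs)):
        if collapse(path, blank) == list(labels):
            p = 1.0
            for t, symbol in enumerate(path):
                p *= probs[t][symbol]
            total += p
    return total
```

For example, with two frames that are each uniform over {blank, 1} and the target `[1]`, three alignments collapse to the target: (1, 1), (blank, 1), and (1, blank). Each has probability 0.25, so `ctc_label_probability([[0.5, 0.5], [0.5, 0.5]], [1])` returns 0.75. In a speech emotion setting the frames would presumably be acoustic frames and the labels emotion classes, but that mapping is my reading rather than something spelled out here.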

Model used in the paper:

[Figure: model architecture from the paper]

Dataset: IEMOCAP

Experimental results:

[Figure: experimental results from the paper]