11-777 Lecture 1.2: Datasets and Tasks

Table of contents

background
O
KR
- 1. Identify tasks/applications of multimodal machine learning

background

Recently I found a good course on multimodal machine learning. In this post I work through it and note down my understanding.
Here is the original URL: ppt

O

Master multimodal datasets and tasks.

KR

▪ Identify tasks/applications of multimodal machine learning
▪ Knowledge of available datasets to tackle the challenges
▪ Appreciation of the current state of the art

1. Identify tasks/applications of multimodal machine learning

1. Affect recognition

metric:

emotion label
arousal, valence
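
The two annotation styles above (a categorical emotion label, or continuous arousal/valence ratings) could be held in one record, roughly as sketched below. This is only an illustration: the label set `EMOTIONS`, the class name `AffectAnnotation`, and the [-1, 1] rating range are assumptions and are not taken from the lecture or from any specific dataset.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical label set, chosen only for illustration.
EMOTIONS = ("angry", "happy", "sad", "neutral")


@dataclass
class AffectAnnotation:
    """One annotated clip: a categorical label, dimensional ratings, or both."""

    emotion: Optional[str] = None    # categorical emotion label
    arousal: Optional[float] = None  # assumed range [-1, 1]
    valence: Optional[float] = None  # assumed range [-1, 1]

    def __post_init__(self) -> None:
        # Reject labels outside the (assumed) label set.
        if self.emotion is not None and self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion label: {self.emotion!r}")
        # Reject ratings outside the (assumed) continuous range.
        for name in ("arousal", "valence"):
            value = getattr(self, name)
            if value is not None and not -1.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [-1, 1], got {value}")


# A clip annotated in both styles.
clip = AffectAnnotation(emotion="happy", arousal=0.6, valence=0.8)
```

Real datasets differ in their label sets and rating scales, so the validation bounds would need to follow each dataset's documentation.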
dataset:
AFEW – Acted Facial Expressions in the Wild (part of EmotiW Challenge)
Three AVEC challenge datasets: 2011/2012, 2013/2014, 2015, 2016, 2017, 2018
The Interactive Emotional Dyadic Motion Capture (IEMOCAP)
Persuasive Opinion Multimedia (POM)
Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos (MOSI)
Multimodal sentiment and emotion recognition (CMU-MOSEI)
Tumblr Dataset – Tumblr posts with images and emotion word tags.
Multimodal humor sensing (video from an RGB-D camera, but no audio/language)

State of the art:
2015 challenge winner: used a multimodal BiLSTM for fusion.
EmotiW 2016 winner: CNN-RNN and C3D hybrid networks (late fusion)

EmotiW 2017 winner: Learning Supervised Scoring Ensemble
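
To make the fusion idea concrete, here is a minimal sketch of late fusion in general: each modality is classified on its own, and only the resulting class probabilities are combined. This is not the method of any of the challenge winners listed above. The function names `late_fusion` and `predict`, the modality names, and the weighted-average rule are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional


def late_fusion(
    per_modality_probs: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of class-probability vectors, one vector per modality."""
    names = list(per_modality_probs)
    if not names:
        raise ValueError("need at least one modality")
    if weights is None:
        # Default assumption: every modality is trusted equally.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    num_classes = len(per_modality_probs[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = per_modality_probs[name]
        if len(probs) != num_classes:
            raise ValueError("all modalities must score the same classes")
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total
    return fused


def predict(fused: List[float]) -> int:
    """Index of the class with the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of two unimodal classifiers over three classes.
scores = {"video": [0.7, 0.2, 0.1], "audio": [0.3, 0.5, 0.2]}
fused = late_fusion(scores)
label = predict(fused)
```

With equal weights the example selects class 0, because the video model's confidence outweighs the audio model's preference for class 1. Giving the audio model a larger weight can flip the decision, which is why the weights are usually tuned on validation data.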

2. Personality/trait recognition

dataset:

VGD – Video Game Dataset, game rating based on text and trailer screenshots.
Multimodal Dyadic Behaviour Database

3. Media description

Task one, media description: given a piece of media (image, video, audiovisual clip), provide a free-form text description.

Task two, VQA: given an image and a question, answer the question.
Task three, referring expressions: generation (bounding box to text) and comprehension (text to bounding box).

Task four: visual dialog.
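
For the comprehension direction of task three (text to bounding box), a predicted box has to be compared with the annotated one. The notes do not say how this is scored. A common choice for comparing boxes is intersection over union (IoU), sketched below under that assumption. The `Box` type and the corner-coordinate convention are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box given by its corners (assumed convention)."""

    x1: float
    y1: float
    x2: float
    y2: float


def area(box: Box) -> float:
    """Area of a box, or 0 if the box is empty or inverted."""
    return max(0.0, box.x2 - box.x1) * max(0.0, box.y2 - box.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in [0, 1]."""
    overlap = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                  min(a.x2, b.x2), min(a.y2, b.y2))
    inter = area(overlap)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and annotated boxes for one referring expression.
predicted = Box(0, 0, 2, 2)
annotated = Box(1, 1, 3, 3)
score = iou(predicted, annotated)
```

A prediction would then count as correct when its IoU with the annotated box exceeds some chosen threshold. The threshold is a benchmark-specific choice and is not given in these notes.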