I will be talking about my recent research at ILLC at the NLPitch event!

Title: Multimodal Emotion Recognition through Deep Learning

Abstract: A great amount of affective information is conveyed through facial expressions, gestures, speech, and other means, making multimodality a compelling approach to emotion recognition. In this talk, I will discuss emotion recognition using facial expressions and speech signals, as well as bodily cues.

In the domain of emotion recognition based on facial expressions and speech signals, we proposed novel computational methods to capture the complementary information provided by audio-visual cues. Our research shows how emotion recognition depends on emotion annotation, the perceived modalities, robust data representations of those modalities, and the computational modeling of expressive cues. I will discuss meta-analysis studies and evaluations that show the impact of multimodal and temporal information on emotion recognition. A minimal sketch of the general idea of fusing complementary audio-visual cues is given below.
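To make the fusion idea concrete, here is a minimal late-fusion sketch in PyTorch. It is not the specific method presented in the talk; the feature dimensions, encoder sizes, and emotion label set are illustrative assumptions.

```python
# A minimal late-fusion sketch (illustrative only): separate audio and visual
# encoders whose embeddings are concatenated before emotion classification.
import torch
import torch.nn as nn

class LateFusionEmotionNet(nn.Module):
    def __init__(self, audio_dim=40, visual_dim=128, hidden=64, num_emotions=4):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(2 * hidden, num_emotions)

    def forward(self, audio_feats, visual_feats):
        # Each modality is encoded independently; the concatenated vector lets
        # the classifier exploit complementary audio-visual information.
        fused = torch.cat(
            [self.audio_enc(audio_feats), self.visual_enc(visual_feats)], dim=-1
        )
        return self.classifier(fused)

model = LateFusionEmotionNet()
audio = torch.randn(8, 40)    # e.g. utterance-level acoustic features (assumed)
visual = torch.randn(8, 128)  # e.g. facial-expression embeddings (assumed)
print(model(audio, visual).shape)  # torch.Size([8, 4])
```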

Turning to interpretable bodily expressed emotion recognition, an understudied research problem, I will introduce our approach, which utilizes the body joints of the human skeleton, representing them as a graph that serves as the basis for Graph Convolutional Networks. Evaluations show that the proposed methodology offers accurate performance and interpretable decisions, demonstrating which body part contributes the most to its inference. I will discuss experimental results, including the significance of arm movements in emotion recognition, which is consistent with findings from perceptual studies. A sketch of the skeleton-as-graph idea follows.
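To illustrate how skeleton joints can be treated as graph nodes for a GCN-based emotion classifier, here is a minimal sketch. It assumes a simplified five-joint skeleton, an illustrative edge list, a generic two-layer GCN with symmetrically normalized adjacency, and a crude per-joint score as a stand-in for the interpretability analysis; it is not the actual model from the talk.

```python
# A minimal skeleton-graph GCN sketch (illustrative assumptions throughout).
import torch
import torch.nn as nn

NUM_JOINTS = 5                             # assumed reduced skeleton: head, torso, arms, legs
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]   # head-torso, torso-arms, torso-legs (assumed)
NUM_EMOTIONS = 4                           # assumed label set

def normalized_adjacency(num_nodes, edges):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    A = torch.eye(num_nodes)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = A.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A * d_inv_sqrt.unsqueeze(0)

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.register_buffer("adj", adj)

    def forward(self, x):                  # x: (batch, joints, in_dim)
        return torch.relu(self.linear(self.adj @ x))

class SkeletonEmotionGCN(nn.Module):
    def __init__(self, in_dim=3, hidden=32):
        super().__init__()
        adj = normalized_adjacency(NUM_JOINTS, EDGES)
        self.gcn1 = GCNLayer(in_dim, hidden, adj)
        self.gcn2 = GCNLayer(hidden, hidden, adj)
        self.classifier = nn.Linear(hidden, NUM_EMOTIONS)

    def forward(self, x):                  # x: (batch, joints, 3) joint coordinates
        h = self.gcn2(self.gcn1(x))        # per-joint embeddings
        joint_scores = h.norm(dim=-1)      # crude per-joint contribution proxy
        pooled = h.mean(dim=1)             # graph-level representation
        return self.classifier(pooled), joint_scores

model = SkeletonEmotionGCN()
coords = torch.randn(2, NUM_JOINTS, 3)     # dummy batch of 2 skeleton poses
logits, joint_scores = model(coords)
print(logits.shape, joint_scores.shape)    # torch.Size([2, 4]) torch.Size([2, 5])
```

In this sketch, inspecting the per-joint scores is only a rough proxy for attributing a prediction to body parts such as the arms; the interpretability method used in the actual work may differ.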