IIIT–Hyderabad researchers introduce new ML model that teaches machine to interpret emotions

By Telangana Today | Published: 07:30 PM, Thu, 8 June 2023

Hyderabad: You may have come across machines that greet diners at restaurants, guide them to their tables, prepare and serve food, or even sweep the floor. Now, a new machine learning model can interpret human emotions, an ability that its developers say could assist people in the field of personal healthcare.

Researchers at the International Institute of Information Technology (IIIT) – Hyderabad have come up with a new machine learning model that teaches a machine to interpret emotions using movies.

In their study titled “How you feelin’? Learning emotions and mental states in movie scenes”, the researchers introduced a machine learning model that understands and labels emotions not just for each movie character but also for the overall scene.

Because cinema holds a vast amount of emotional data, the research group took movies as their starting point. “A character can go through a range of emotions in a single scene – from surprise and happiness to anger and even sadness. The emotions in a scene cannot be summarised with a single label and estimating multiple emotions and mental states is important,” the study’s primary author, Dhruv Srivastava, said.
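The difference between assigning one label and estimating several can be sketched in a few lines of Python. This is only an illustration, not the study's method: the label list, the scores and the 0.5 threshold are all made-up assumptions. A single-label classifier keeps only the top-scoring emotion, while a multi-label one scores every emotion independently and keeps each one that clears the threshold.

```python
import math

# Hypothetical label set for illustration; not the labels used in the study.
LABELS = ["surprise", "happiness", "anger", "sadness"]


def sigmoid(x: float) -> float:
    """Map a raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(scores):
    """Keep only the highest-scoring label."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best]


def multi_label(scores, threshold=0.5):
    """Keep every label whose independent probability clears the threshold."""
    return [label for label, s in zip(LABELS, scores) if sigmoid(s) >= threshold]


# Made-up raw scores for one scene (assumption, not real model output).
scene_scores = [1.2, 0.8, -2.0, 0.3]

print(single_label(scene_scores))  # one emotion only
print(multi_label(scene_scores))   # every emotion above the threshold
```

With these invented scores, the single-label view reports only "surprise", while the multi-label view also keeps "happiness" and "sadness", which is the kind of mixed reading the quote describes.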

The researchers distinguish between an emotion and a mental state: the former can be explicit and visible, for instance “happy” or “angry”, while the latter refers to thoughts or feelings that may be difficult to discern externally, for example “honest” or “helpful”.

Decoding emotions and mental states based only on the language used is fraught with difficulty, co-author Prof. Makarand Tapaswi said. “Take the statement, ‘I hate you’. Interpreted in isolation, bereft of visual cues, a machine will likely label the underlying emotion as ‘anger’. However, the same statement could be uttered in a playful manner where the character is smiling at another while saying it, thereby confusing machines,” he said.

The researchers used MovieGraphs, an existing dataset of movie clips that Prof. Tapaswi collected in earlier work. Their model, EmoTx, was trained to accurately label the emotions and mental states of characters in each scene. For this, the researchers used a three-pronged process: analysing the full video and the actions involved, interpreting the facial features of individual characters, and extracting the subtitles that accompanied the dialogue in each scene.
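To see why combining the three inputs helps, consider a toy Python sketch. It is not how EmoTx works internally, which the article does not describe; plain averaging stands in for the learned model here, and every name, score and threshold below is a hypothetical assumption. Each source (full video, a character's face, the subtitles) contributes a score per label, and the combined score decides which labels are kept.

```python
from typing import Dict, List

Scores = Dict[str, float]


def fuse(sources: List[Scores]) -> Scores:
    """Average per-label scores across sources (a stand-in for a learned model)."""
    labels = sorted(set().union(*sources))
    return {
        label: sum(src.get(label, 0.0) for src in sources) / len(sources)
        for label in labels
    }


def labels_above(scores: Scores, threshold: float = 0.5) -> List[str]:
    """Return the labels whose score clears the threshold, in sorted order."""
    return sorted(label for label, s in scores.items() if s >= threshold)


# Made-up scores for a playful "I hate you" scene (assumptions, not real data).
video_scores = {"happy": 0.7, "angry": 0.2}      # full video and actions
face_scores = {"happy": 0.9, "angry": 0.1}       # the speaker is smiling
subtitle_scores = {"happy": 0.1, "angry": 0.9}   # the words alone sound angry

text_only = labels_above(subtitle_scores)
combined = labels_above(fuse([video_scores, face_scores, subtitle_scores]))

print(text_only)  # what the subtitles alone suggest
print(combined)   # what all three sources together suggest
```

With these invented numbers, the subtitles alone point to "angry", whereas the combined score favours "happy", mirroring the playful "I hate you" example.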

“Based on the three criteria, we were able to predict the corresponding mental states of the characters which are not explicit in the scenes,” co-author Aditya Kumar Singh said.

According to the researchers, the model can also be leveraged to assist in the field of personal healthcare. The study has been accepted for presentation at the Conference on Computer Vision and Pattern Recognition (CVPR) 2023 in Vancouver, Canada, from June 18 to 23.