
TECPEC: Textual Emotion-Cause Pair Extraction in Conversations

Overview

This project presents our submission to SemEval-2024 Task 3, Subtask 1, "Textual Emotion-Cause Pair Extraction in Conversations". We introduce a two-step pipeline that identifies emotion-cause pairs in textual conversations. Our team ranked 5th out of 31 participating teams. 🎉🎉🎉

Check out the SemEval-2024 Task 3 Leaderboard for detailed rankings and visit the SemEval-2024 Task 3 Official Page for more information about the competition.

Abstract

In this project, we develop model architectures and techniques to improve the detection of emotions and their causes in conversational text. The primary goal is to enhance human-computer interaction through better understanding and extraction of emotions and their causes in textual data.

Project Structure

  • data/: Contains the dataset used for training and evaluation.
  • models/: Contains the model architectures and trained models.
  • scripts/: Scripts for training, evaluation, and data preprocessing.
  • results/: Contains the results and logs of model performance.
  • Report.pdf: Detailed project report.
  • README.md: This file.

Methodology

Task Definition

The task involves identifying emotion-cause pairs within conversations. Each conversation consists of multiple utterances from different speakers. An emotion-cause pair is defined as an emotion utterance along with its emotion category and the textual cause span in a specific cause utterance.
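For concreteness, a hypothetical annotated example might look like the following; the field names are illustrative, not the exact competition schema:

```python
# A hypothetical emotion-cause pair annotation. Field names are
# illustrative; the actual SemEval-2024 Task 3 schema may differ.
example = {
    "conversation": [
        {"id": 1, "speaker": "A", "text": "I got the job!"},
        {"id": 2, "speaker": "B", "text": "That's amazing, congratulations!"},
    ],
    "emotion_cause_pairs": [
        {
            "emotion_utterance_id": 2,      # utterance expressing the emotion
            "emotion": "joy",               # one of Ekman's six emotions
            "cause_utterance_id": 1,        # utterance containing the cause
            "cause_span": "I got the job",  # textual cause span
        }
    ],
}
```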


Pipeline Overview

The task pipeline comprises two stages:

  1. Emotion Recognition in Conversation (ERC): classification of each utterance into one of Ekman’s six emotions (anger, joy, surprise, disgust, fear, sadness) or a "neutral" label for non-emotional utterances.
  2. Cause-Emotion Extraction (CEE): extraction of the cause corresponding to a target utterance, conditioned on its associated emotion, using a question-answering model. A minimal sketch of the composed pipeline follows.
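The sketch below shows how the two stages compose; the function names and signatures are our own illustration, with the stage models passed in as plain callables:

```python
from typing import Callable, Dict, List

def extract_pairs(
    conversation: List[str],
    classify_emotion: Callable[[List[str], int], str],    # stage 1: ERC
    extract_cause: Callable[[List[str], int, str], str],  # stage 2: CEE
) -> List[Dict]:
    """Run ERC on every utterance, then CEE on the non-neutral ones."""
    pairs = []
    for i in range(len(conversation)):
        emotion = classify_emotion(conversation, i)
        if emotion == "neutral":
            continue  # only emotional utterances are paired with a cause
        cause_span = extract_cause(conversation, i, emotion)
        pairs.append(
            {"utterance_index": i, "emotion": emotion, "cause_span": cause_span}
        )
    return pairs
```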

Models

Emotion Recognition in Conversation (ERC)

  • Utterance Level: We use transformer-based encoder architectures such as BERT and RoBERTa with a classification head for emotion classification (see the sketch below).
  • Conversational Level: A GPT2-based architecture in which the entire conversation serves as a single data point and all of its utterances are labelled jointly.
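As an illustration of the utterance-level variant, here is a minimal sketch built on Hugging Face transformers. The roberta-base checkpoint and the plain single-utterance input are simplifying assumptions (our actual setup also concatenates a sentence embedding, as described under Experimental Setup), and the classification head is untrained until fine-tuned:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Six Ekman emotions plus "neutral" for non-emotional utterances.
LABELS = ["anger", "joy", "surprise", "disgust", "fear", "sadness", "neutral"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS)
)

def predict_emotion(utterance: str) -> str:
    # Encode a single utterance and take the argmax over the 7 classes.
    inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[logits.argmax(dim=-1).item()]

print(predict_emotion("I can't believe you did that!"))
```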

Cause-Emotion Extraction (CEE)

We utilize a Question-Answering model based on the method proposed by Poria et al. to identify the causal span for a target non-neutral utterance.
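A minimal sketch of this QA formulation, using the pre-trained checkpoint named in the experimental setup below; the question template is our own assumption, loosely following the natural-language queries of Poria et al.:

```python
from transformers import pipeline

# Span-extraction QA over the conversational context.
qa = pipeline("question-answering", model="mrm8488/spanbert-finetuned-squadv2")

def extract_cause(context: str, target_utterance: str, emotion: str) -> str:
    # The question names the target utterance and its predicted emotion;
    # the model extracts the causal span from the context.
    question = f'What caused the {emotion} expressed in "{target_utterance}"?'
    return qa(question=question, context=context)["answer"]
```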

Dataset

The dataset contains conversations from the American sitcom F.R.I.E.N.D.S., annotated with emotion-cause pairs and emotion labels. The data distribution is as follows:

  • Training Set: 1236 conversations with 12144 utterances.
  • Validation Set: 138 conversations with 1475 utterances.
  • Testing Set: 665 conversations with 6301 utterances.

Experimental Setup

  • Sentence Embedding: We use the all-mpnet-base-v2 model from the sentence-transformers library to convert each utterance into a 768-dimensional vector (see the sketch after this list).
  • BERT/RoBERTa: We extract the output embedding corresponding to the last token of the target utterance and concatenate it with the original target utterance embedding for classification.
  • GPT2: The whole conversation is fed into the GPT2 model, and the output embeddings are used for classification.
  • Question Answering Model: We use the pre-trained mrm8488/spanbert-finetuned-squadv2 model and fine-tune it for our task.
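A minimal sketch of the embedding step, with toy utterances:

```python
from sentence_transformers import SentenceTransformer

# all-mpnet-base-v2 maps each utterance to a 768-dimensional vector.
encoder = SentenceTransformer("all-mpnet-base-v2")
utterances = ["I got the job!", "That's amazing, congratulations!"]
embeddings = encoder.encode(utterances)
print(embeddings.shape)  # (2, 768)
```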

Evaluation Metrics

Emotion Recognition in Conversation (ERC)

  • Accuracy: the number of correct predictions divided by the total number of predictions.
  • Weighted F1: the average of class-wise F1 scores, weighted by each class's proportion in the dataset.
  • Macro F1: the unweighted average of class-wise F1 scores.
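All three metrics can be computed directly with scikit-learn; the labels below are toy values for illustration only:

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = ["joy", "anger", "neutral", "joy"]
y_pred = ["joy", "neutral", "neutral", "sadness"]

accuracy = accuracy_score(y_true, y_pred)
# Weighted F1 averages class-wise F1 by class frequency;
# macro F1 averages the same scores without weighting.
weighted_f1 = f1_score(y_true, y_pred, average="weighted")
macro_f1 = f1_score(y_true, y_pred, average="macro")
```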

Cause-Emotion Extraction (CEE)

  • Strict Match: the predicted span must be identical to the annotated span.
  • Proportional Match: credits the proportion of overlap between the predicted span and the annotated one. A sketch of both criteria follows.
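A sketch of both criteria over token-index spans; the official scorer aggregates such overlaps into precision, recall, and F1, so the details here are a simplification:

```python
def strict_match(pred: range, gold: range) -> float:
    # Credit only an exact span match.
    return float(pred == gold)

def proportional_match(pred: range, gold: range) -> float:
    # Credit the fraction of the annotated span covered by the prediction.
    overlap = len(set(pred) & set(gold))
    return overlap / len(gold) if len(gold) else 0.0

print(strict_match(range(0, 5), range(0, 5)))        # 1.0
print(proportional_match(range(0, 3), range(0, 5)))  # 0.6
```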

ERC Performance on Validation Set

| Model | Accuracy | Macro F1 | Weighted F1 |
| --- | --- | --- | --- |
| BERT | 0.32 | 0.27 | 0.32 |
| RoBERTa | 0.31 | 0.27 | 0.30 |
| GPT2 | 0.36 | 0.30 | 0.37 |
| Zero-Shot GPT-4 | 0.38 | 0.12 | 0.28 |
| In-Context Learning GPT-4 | 0.58 | 0.37 | 0.53 |

Scores on Leaderboard

| ERC Model | Cause Model | weighted_strict_f1 | weighted_proportional_f1 | strict_f1 | proportional_f1 |
| --- | --- | --- | --- | --- | --- |
| GPT2 | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.1345 | 0.1767 | 0.1283 | 0.1626 |
| BERT | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.1318 | 0.1704 | 0.1267 | 0.1581 |
| RoBERTa | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.1314 | 0.1697 | 0.1301 | 0.1629 |

Ablation Study on Validation Set

| ERC Model | Cause Model | weighted_strict_f1 | weighted_proportional_f1 | strict_f1 | proportional_f1 |
| --- | --- | --- | --- | --- | --- |
| Ground truth | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.3430 | 0.4612 | 0.3441 | 0.4594 |
| GPT2 | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.1153 | 0.1543 | 0.1135 | 0.1443 |
| BERT | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.1181 | 0.1673 | 0.1148 | 0.1568 |
| RoBERTa | Simple Transformer QA (mrm8488/spanbert-finetuned-squadv2) | 0.1132 | 0.1623 | 0.1123 | 0.1557 |

Conclusion

Our exploration of Textual Emotion-Cause Pair Extraction in Conversations produced a two-step pipeline that first assigns an emotion to each utterance in a conversation and then finds the cause of that emotion using a QA model. Despite our system's 5th-place finish, the ablation study shows that emotion recognition is the main bottleneck: with ground-truth emotion labels, the same cause model scores roughly three times higher, so further work on ERC is needed to bridge the gap.

Future Work

  • Using state-of-the-art models for the ERC task.
  • Developing more robust architectures for emotion recognition and cause extraction.
  • Developing an end-to-end architecture that directly identifies emotion-cause pairs.
  • Extending to Subtask 2, which adds multimodal input (speech, video, text).

Authors

  • Shreyas Kabra, CSAI, IIIT Delhi
  • Parthiv A Dholaria, CSE, IIIT Delhi
  • Kartik Singhal, CSAM, IIIT Delhi
  • Lakshya, CSAM, IIIT Delhi

References

  • S. Kumar, S. Dudeja, M. S. Akhtar, and T. Chakraborty, “Emotion flip reasoning in multiparty conversations,” IEEE Transactions on Artificial Intelligence, 2023.
  • S. Poria, N. Majumder, D. Hazarika, D. Ghosal, R. Bhardwaj, S. Y. B. Jian, P. Hong, R. Ghosh, A. Roy, N. Chhaya et al., “Recognizing emotion cause in conversations,” Cognitive Computation, vol. 13, pp. 1317–1332, 2021.
  • Z. Ding, R. Xia, and J. Yu, “End-to-end emotion-cause pair extraction based on sliding window multi-label learning,” in Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), 2020, pp. 3574–3583.
  • R. Xia and Z. Ding, “Emotion-cause pair extraction: A new task to emotion analysis in texts,” arXiv preprint arXiv:1906.01267, 2019.
  • P. Ekman, “Expression and the nature of emotion,” Approaches to emotion, vol. 3, no. 19, p. 344, 1984.