
# 😼 InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection

[arXiv](https://arxiv.org/abs/2406.16464)

## 📄 Abstract

The prevalence of sarcasm in social media, conveyed through text-image combinations, presents significant challenges for sentiment analysis and intention mining. Current multi-modal sarcasm detection methods have been shown to struggle with biases from spurious cues, leading to a superficial understanding of the complex interactions between text and image. To address these issues, we propose InterCLIP-MEP, a robust framework for multi-modal sarcasm detection. InterCLIP-MEP introduces a refined variant of CLIP, Interactive CLIP (InterCLIP), as its backbone, enhancing sample representations by embedding cross-modality information in each encoder. Furthermore, a novel training strategy is designed to adapt InterCLIP for a Memory-Enhanced Predictor (MEP). MEP uses a dynamic dual-channel memory to store valuable historical knowledge of test samples and then leverages this memory as a non-parametric classifier to derive the final prediction. By using InterCLIP to encode text-image interactions more effectively and incorporating MEP, InterCLIP-MEP enables more robust recognition of multi-modal sarcasm. Experiments demonstrate that InterCLIP-MEP achieves state-of-the-art performance on the MMSD2.0 benchmark.
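
To make "embedding cross-modality information in each encoder" concrete, here is a minimal sketch of one plausible interaction mechanism: an encoder layer that cross-attends to the other modality's hidden states before its usual self-attention block runs. The class name, the placement of the cross-attention, and the residual/norm layout are illustrative assumptions, not InterCLIP's actual design; see the paper for the real architecture.

```python
# Illustrative sketch only: one plausible way to condition an encoder layer
# on the other modality. InterCLIP's actual interaction mechanism is defined
# in the paper; all names and the layout here are assumptions.
import torch
from torch import nn

class CrossModalLayer(nn.Module):
    """Wraps a standard encoder layer and injects the other modality's
    hidden states via cross-attention before the wrapped layer runs."""

    def __init__(self, base_layer: nn.Module, dim: int, num_heads: int = 8):
        super().__init__()
        self.base_layer = base_layer
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
        # x:     (batch, seq_len, dim) hidden states of this encoder
        # other: (batch, other_len, dim) hidden states of the other modality
        attended, _ = self.cross_attn(query=x, key=other, value=other)
        x = self.norm(x + attended)  # residual + norm around the injection
        return self.base_layer(x)
```

Wrapping, say, the text encoder's layers this way lets each text layer see vision representations (and vice versa), which is the kind of per-encoder interaction the abstract describes.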

*(Figure: framework overview of InterCLIP-MEP.)*
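
The MEP side can be sketched similarly. Below is a minimal, hedged illustration of a dynamic dual-channel memory used as a non-parametric classifier: high-confidence test-sample embeddings are written into a per-class bank, and predictions are read out by cosine similarity against the two banks. The capacity, the confidence-based eviction rule, and the mean-similarity readout are assumptions for illustration; the paper defines MEP's actual update and inference rules.

```python
# Illustrative sketch only: a dynamic dual-channel memory used as a
# non-parametric classifier. Capacity, eviction, and readout rules are
# assumptions; see the paper for the actual MEP.
import torch
import torch.nn.functional as F

class DualChannelMemory:
    """Two fixed-size banks of L2-normalized embeddings, one per class
    (channel 0: non-sarcastic, channel 1: sarcastic)."""

    def __init__(self, dim: int, capacity: int = 256):
        self.banks = [torch.empty(0, dim), torch.empty(0, dim)]
        self.scores = [torch.empty(0), torch.empty(0)]  # write-time confidences
        self.capacity = capacity

    def write(self, z: torch.Tensor, label: int, confidence: float) -> None:
        """Store an embedding, keeping only the most confident entries."""
        z = F.normalize(z, dim=-1).unsqueeze(0)  # (1, dim)
        self.banks[label] = torch.cat([self.banks[label], z])
        self.scores[label] = torch.cat(
            [self.scores[label], torch.tensor([confidence])]
        )
        if self.banks[label].size(0) > self.capacity:
            # Evict the least confident entries once over capacity.
            keep = self.scores[label].topk(self.capacity).indices
            self.banks[label] = self.banks[label][keep]
            self.scores[label] = self.scores[label][keep]

    def read(self, z: torch.Tensor) -> int:
        """Predict by mean cosine similarity to each channel's bank."""
        z = F.normalize(z, dim=-1)  # (dim,)
        channel_scores = [
            (bank @ z).mean() if bank.size(0) > 0 else torch.tensor(-1.0)
            for bank in self.banks
        ]
        return int(torch.argmax(torch.stack(channel_scores)))
```

In the full framework, the embedding `z` and the confidence that gates writes would come from the trained InterCLIP backbone and its projection head; this sketch only shows the memory's read/write pattern.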

## ℹ️ Installation

### Virtual Environment

We use [pyenv](https://github.com/pyenv/pyenv) to manage the Python environment.

If you haven't installed Python 3.9.19, run:

```bash
pyenv install 3.9.19
```

> **Note:** pyenv will try its best to download and compile the requested Python version, but compilation sometimes fails because of unmet system dependencies, or succeeds but the new Python version exhibits strange failures at runtime (see the [suggested build environment](https://github.com/pyenv/pyenv/wiki#suggested-build-environment)).

Then, create a virtual environment:

```bash
pyenv virtualenv 3.9.19 mmsd-3.9.19
```

Finally, activate it:

```bash
pyenv activate mmsd-3.9.19
```

You can also create the virtual environment in any way you prefer.

### Dependencies

We use [Poetry](https://python-poetry.org/) to manage dependencies. Please install it first.

Then, install the dependencies:

```bash
poetry install
```

## ⚗️ Reproduce Results

The dataset is available on Hugging Face.

```bash
# Main results
./scripts/run_main_results.sh

# Ablation study
./scripts/run_ablation_study.sh

# LoRA analysis
./scripts/run_lora_analysis.sh

# Hyperparameter study for InterCLIP-MEP w/ T2V
./scripts/run_hyperparam_study.sh
```

## 🤗 Acknowledgement

## 📃 Reference

If you find this project useful for your research, please consider citing the following paper:

```bibtex
@article{chen2024interclipmep,
  title   = {InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection},
  author  = {Junjie Chen and Subin Huang},
  year    = {2024},
  journal = {arXiv preprint arXiv:2406.16464}
}
```

## 📝 License

See the LICENSE file for license rights and limitations (MIT).

## 📧 Contact

If you have any questions about our work, please do not hesitate to contact Junjie Chen.
