ShashwatVv / Multimodal-Sarcasm-Explanation-MuSE


This is the repository for "Nice perfume. How long did you marinate in it? Multimodal Sarcasm Explanation", accepted at AAAI-22. In this paper, we propose a novel problem, Multimodal Sarcasm Explanation (MuSE): given a multimodal sarcastic post containing an image and a caption, we aim to generate a natural language explanation that reveals the intended sarcasm. To this end, we develop MORE, a new dataset with explanations for 3510 sarcastic multimodal posts. Each explanation is a natural language (English) sentence describing the hidden irony. We benchmark MORE with ExMore, a multimodal Transformer-based architecture. Its encoder incorporates cross-modal attention, which attends to the distinguishing features between the two modalities. A BART-based auto-regressive decoder is then used as the generator.

MuSE Example


Dataset images can be found at this link.

The train, validation and test sets are TSV files with the following columns:

  • Column 1: PID, the identifier of a post
  • Column 2: Caption, the text associated with the image in a post
  • Column 3: Annotated explanation, the ground truth explanation for the sarcasm in a post

The image for a datapoint is named after its PID: for example, the image for PID=123 is 123.jpg at the link above.
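
The column layout and image naming described above can be read with a few lines of Python. This is a minimal sketch, not code from this repository: the function name `load_muse_tsv`, the `MuseExample` record, and the file and directory names in the usage note are hypothetical. It also assumes the TSV files have no header row and that the images have been downloaded into a local directory.

```python
import csv
from pathlib import Path
from typing import NamedTuple


class MuseExample(NamedTuple):
    """One row of a split file plus the derived image path (hypothetical record type)."""
    pid: str
    caption: str
    explanation: str
    image_path: Path


def load_muse_tsv(tsv_path, image_dir):
    """Read a three-column TSV (PID, caption, explanation) into a list of examples.

    Assumption: the file has no header row. If it has one, skip the first row.
    """
    examples = []
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quotation marks inside captions as literal characters.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 3:
                continue  # skip blank or malformed lines
            pid, caption, explanation = row[0], row[1], row[2]
            # The image for a post is named <PID>.jpg.
            image_path = Path(image_dir) / f"{pid}.jpg"
            examples.append(MuseExample(pid, caption, explanation, image_path))
    return examples
```

For example, `load_muse_tsv("train.tsv", "images")` would return one `MuseExample` per post, where `train.tsv` and `images` stand in for wherever the split file and the downloaded images are stored locally.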


If you find this repository useful, please cite our paper:

      title={Nice perfume. How long did you marinate in it? Multimodal Sarcasm Explanation}, 
      author={Poorav Desai and Tanmoy Chakraborty and Md Shad Akhtar},


License: MIT License

