Awesome Multimodal Named Entity Recognition 🎶📜

A collection of resources on multimodal named entity recognition.

Content

- 1. Description
- 2. Topic Order
  - 👑 Dataset
- 3. Chronological Order
  - Survey
  - 2020
  - 2021
  - 2022
- 4. Courses
- Contact Me

1. Description

🐌 Markdown Format:

(Conference/Journal Year) Title, First Author et al. [Paper] [Code] [Project]

(Conference/Journal Year) [💬Topic] Title, First Author et al. [Paper] [Code] [Project]

(Optional) 🌱 or 📌

(Optional) 🚀 or 👑 or 📚

🌱: Novel idea
📌: The first...
🚀: State-of-the-Art
👑: Novel dataset/model
📚: Downstream tasks
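
The entry format above is regular enough to be parsed mechanically, for example to sort or filter the list by venue or year. The following is a minimal sketch, not part of this repository: the names `ENTRY_RE`, `Entry`, and `parse_entry` are hypothetical, and it assumes each entry starts with `(Venue Year)` using ASCII parentheses and ends with bracketed link labels. It does not split off a trailing "First Author et al." part, which stays inside the title when present.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical pattern for "(Venue Year) [💬Topic] Title [paper] [code] [project]".
ENTRY_RE = re.compile(
    r"^\s*-?\s*"                                  # optional list bullet
    r"\((?P<venue>.+?)\s+(?P<year>\d{4})\)\s*"    # e.g. "(ACL 2020)"
    r"(?:\[💬(?P<topic>[^\]]+)\]\s*)?"            # optional "[💬Topic]"
    r"(?P<title>.*?)"                             # title, matched lazily
    r"(?P<links>(?:\s*\[[A-Za-z]+\])*)\s*$"       # trailing "[paper] [code]" labels
)


@dataclass
class Entry:
    venue: str
    year: int
    title: str
    topic: Optional[str] = None
    links: List[str] = field(default_factory=list)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one list entry; return None if it does not follow the format."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    # Collect the link labels in order, normalised to lower case.
    links = [s.lower() for s in re.findall(r"\[([A-Za-z]+)\]", m.group("links"))]
    return Entry(
        venue=m.group("venue"),
        year=int(m.group("year")),
        title=m.group("title").strip(),
        topic=m.group("topic"),
        links=links,
    )


if __name__ == "__main__":
    line = ("- (AAAI 2021) RpBERT: A Text-image Relation Propagation-based "
            "BERT Model for Multimodal NER [paper] [code]")
    print(parse_entry(line))
```

Entries that omit the year, such as a bare `(arxiv)` prefix, return `None` under this sketch, which makes it easy to spot lines that deviate from the format.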

2. Topic Order

👑 Dataset
- (AAAI 2018) Adaptive Co-attention Network for Named Entity Recognition in Tweets [paper]
- (ACL 2018) Visual Attention Model for Name Tagging in Multimodal Social Media [paper]

3. Chronological Order

2020
- (ACL 2020) Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer [paper]
- (COLING 2020) RIVA: A Pre-trained Tweet Multimodal Model Based on Text-image Relation for Multimodal NER [paper]
2021
- (AAAI 2021) Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance [paper]
- (AAAI 2021) RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER [paper] [code]
- (EMNLP 2021) Can images help recognize entities? A study of the role of images for Multimodal NER [paper] [code]
2022
- (COLING 2022) Flat Multi-modal Interaction Transformer for Named Entity Recognition [paper]
  - 📌 First to integrate FLAT with MNER
  - 🚀 SOTA on Twitter15 with Bert_base_uncased, but the code is unavailable
- (NAACL Findings 2022) Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction [paper] [code]
  - 📌 Code uses a refined Twitter15 dataset
- (WSDM 2022) MAF: A General Matching and Alignment Framework for Multimodal Named Entity Recognition [paper] [code]
- (SIGIR 2022) Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion [paper]
  - 📌 First fully Transformer-based structure
  - 🚀 SOTA on Twitter17 using Bert_base_uncased, but only implemented on Twitter17
- (NAACL 2022) ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition [paper] [code]
  - 📌 Using Roberta_large as the backbone yields strong improvements
  - 🌱 Uses OCR, etc., without directly using images
- (MM 2022) Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition [paper]
  - 🌱 First MRC-based framework for MNER
- (SIGIR 2022) Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER [paper]
  - 📌 Reported performance is trustworthy (confirmed by reimplementation)
- (ICME 2022) CAT-MNER: Multimodal Named Entity Recognition with Knowledge-Refined Cross-Modal Attention [paper]
  - 🚀 SOTA on Twitter15 and Twitter17 with Roberta_large
  - 📌 Requires 8 V100 GPUs
- (DSAA 2022) PromptMNER: Prompt-Based Entity-Related Visual Clue Extraction and Integration for Multimodal Named Entity Recognition [paper]
  - 🚀 SOTA on Twitter15 and Twitter17 with Roberta_large
  - 📌 Requires 8 V100 GPUs
  - 🌱 Prompt-based
- (arXiv 2022) Multi-Granularity Cross-Modality Representation Learning for Named Entity Recognition on Social Media [paper] [code]
- (arXiv) Multi-Granularity Contrastive Knowledge Distillation for Multimodal Named Entity Recognition
