LMM hallucination refers to cases where an LMM generates content that appears plausible but deviates from or conflicts with the provided image. LMMs tend to rely on their parametric knowledge rather than on the provided visual features, leading them to guess and produce multimodal hallucinations.
In the MLLM community, methods have been developed for detecting, evaluating, and mitigating hallucinations:
- FDPO: Detecting and Preventing Hallucinations in Large Vision Language Models, (Gunjal et al. 2023)
- HaELM: Evaluation and Analysis of Hallucination in Large Vision-Language Models, (Wang et al. 2023a)
- HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision-Language Models for Detailed Caption, (Zhai et al. 2023)
- POPE: Evaluating Object Hallucination in Large Vision-Language Models, (Li et al. EMNLP 2023)
- HallusionBench: An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models, (Liu et al. 2023)
- NOPE: Negative Object Presence Evaluation to Measure Object Hallucination in Vision-Language Models, (Lovenia et al.)
- Bingo: Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges, (Cui et al.)
- FaithScore: Evaluating Hallucinations in Large Vision-Language Models, (Jing et al.)
- AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation, (Wang et al.)
- LRV-Instruction: Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning, (Liu et al.)
- LURE: Analyzing and Mitigating Object Hallucination in Large Vision-Language Models, (Zhou et al. 2023b)
- Woodpecker: Hallucination Correction for Multimodal Large Language Models, (Yin et al.)
- LLaVA-RLHF: Aligning Large Multimodal Models with Factually Augmented RLHF, (Sun et al.)
- Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision, (Lee et al.)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data, (Yu et al.)
- VCD: Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
- HA-DPO: Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
- Mitigating Hallucination in Visual Language Models with Visual Supervision
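Among the benchmarks above, POPE (Li et al. EMNLP 2023) is a representative example of how object hallucination is scored: the model is polled with binary questions such as "Is there a \<object\> in the image?", where ground truth is "yes" for objects present and "no" for sampled absent objects, and standard classification metrics plus a yes-ratio are computed over the answers. Below is a minimal sketch of that scoring step under these assumptions; the helper name and the toy answers are illustrative, not from the paper.

```python
def pope_metrics(answers, labels):
    """Score POPE-style binary polling: parallel lists of model
    answers and gold labels, each entry 'yes' or 'no'."""
    pairs = list(zip(answers, labels))
    tp = sum(a == "yes" and g == "yes" for a, g in pairs)  # object present, model says yes
    fp = sum(a == "yes" and g == "no" for a, g in pairs)   # hallucinated object
    tn = sum(a == "no" and g == "no" for a, g in pairs)
    fn = sum(a == "no" and g == "yes" for a, g in pairs)   # missed object
    n = len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # A yes-ratio far above the gold rate of "yes" signals a "yes" bias,
        # a common failure mode POPE is designed to expose.
        "yes_ratio": (tp + fp) / n,
    }

# Toy example: the model hallucinates one absent object
# (gold "no", answer "yes") out of four polled questions.
answers = ["yes", "yes", "no", "yes"]
labels  = ["yes", "yes", "no", "no"]
print(pope_metrics(answers, labels))
```

The design mirrors POPE's motivation: free-form caption checking is noisy, whereas yes/no polling reduces hallucination evaluation to counting a confusion matrix.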