lvlm

There are 0 repository under lvlm topic.

NVlabs / EAGLE
Eagle: Frontier Vision-Language Models with Data-Centric Strategies
demo eagle gpt4 huggingface llama llama3 llava lmm lvlm mllm llm large-language-models nvdia
Language:Python 865
YingqingHe / Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
aigc large-language-models large-vision-language-models multimodal-generation multimodal-large-language-models multimodal-models multimodality text-to-3d text-to-audio text-to-image text-to-music text-to-sound text-to-speech text-to-video llm lvlm mllm
Language:HTML 452
ModelTC / llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
awq benchmark deployment evaluation internlm2 large-language-models lightllm llama3 llm lvlm mixtral omniquant post-training-quantization pruning quantization quarot smoothquant spinquant tool vllm
Language:Python 450
MMStar-Benchmark / MMStar
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
evaluation large-language-models large-multimodal-models large-vision-language-model large-vision-language-models llm llms lvlm lvlms multimodal multimodal-learning multimodality visual-question-answering
Language:Python 194
NishilBalar / Awesome-LVLM-Hallucination
up-to-date curated list of state-of-the-art Large vision language models hallucinations research work, papers & resources
hallucination hallucination-detection large-vision-language-models multimodal-large-language-models hallucination-mitigation hallucination-survey hallucination-evaluation large-language-models vision-language-models hallucination-benchmark hallucination-research llm lvlm mllm mlm multimodal-language-model
167
thu-nics / FrameFusion
The official code implementation of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"
efficient-deep-learning llm lvlm video
Language:Python 37
The-Martyr / Awesome-Multimodal-Reasoning
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
chain-of-thought cot large-language-models llm lvlm mllm r1 video-reasoning video-understanding multimodal-learning multimodal-reasoning reinforcement-learning rl o1 image-generation image-reasoning image-understanding video-generation
36
wang2226 / Awesome-LLM-Decoding
📜 Paper list on decoding methods for LLMs and LVLMs
llm llm-inference lvlm natural-language-processing nlp awesome awesome-list
34
OpenSparseLLMs / CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
clip lvlm mixture-of-experts moe openai-clip
Language:Python 29
hasanar1f / HiRED
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Vision-Language Models (e.g., LLaVA-Next) under a fixed token budget.
llava llava-next lvlm ml vlm
Language:Python 28
tsinghua-fib-lab / SmartAgent
The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".
chain-of-thought embodied-ai human-centric-ai human-computer-interaction large-language-model llm-agent llm-reasoning lvlm multi-modal openai-o1 personalization
25
fan19-hub / LEMMA
LEMMA: An effective and explainable way to detect multimodal misinformation with LVLM and external knowledge augmentation, incorporating the intuition and reasoning capbility inside LVLM.
lvlm misinformation rag
Language:Jupyter Notebook 19
top-yun / SPARK
A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.
benchmark lvlm lvm multisensor
Language:Python 18
CharlieDDDD / AISurveyPapers
Large Visual Language Model(LVLM), Large Language Model(LLM), Multimodal Large Language Model(MLLM), Alignment, Agent, AI System, Survey
agent agi ai-system llm mllm survey lvlm alignment
17
Sreyan88 / VDGD
Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
decoding hallucination lvlm reasoning
Language:Python 9
Ruiyang-061X / Awesome-MLLM-Reasoning
📖Curated list about reasoning abilitiy of MLLM, including OpenAI o1, OpenAI o3-mini, and Slow-Thinking.
chain-of-thought chain-of-thought-reasoning deepseek-r1 lvlm mllm multi-modal multi-modal-large-language-model o1 openai reasoning reasoning-language-models slow-thinking awesome o3-mini
4
UBSec / UGCG-Guard
Code for USENIX Security 2024 paper: Moderating Illicit Online Image Promotion for Unsafe User Generated Content Games Using Large Vision-Language Models.
children-games lvlm nsfw-detection
Language:Python 4
camilochs / visgraphvar
VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models
computer-vision graph-theory large-vision-language-models lvlm
Language:Python 2
patrickamadeus / vqa-nle-llava
Novel approach that leverages LVLMs to efficiently generate high-quality synthetic VQA-NLE datasets.
generative-ai lmm lvlm question-answering synthetic-data synthetic-data-generation vqa
Language:Python 2
codewithdark-git / TalkTube
A powerful Streamlit application that allows users to analyze and interact with YouTube video content through natural language questions.
agents genai genai-domain groq groq-api langchain langchain-python llm lvlm lvlms pyhton3 python rag streamlit webapp whisper youtube youtube-bot
Language:Python

lvlm

NVlabs / EAGLE

YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

ModelTC / llmc

MMStar-Benchmark / MMStar

NishilBalar / Awesome-LVLM-Hallucination

thu-nics / FrameFusion

The-Martyr / Awesome-Multimodal-Reasoning

wang2226 / Awesome-LLM-Decoding

OpenSparseLLMs / CLIP-MoE

hasanar1f / HiRED

tsinghua-fib-lab / SmartAgent

fan19-hub / LEMMA

top-yun / SPARK

CharlieDDDD / AISurveyPapers

Sreyan88 / VDGD

Ruiyang-061X / Awesome-MLLM-Reasoning

UBSec / UGCG-Guard

camilochs / visgraphvar

patrickamadeus / vqa-nle-llava

codewithdark-git / TalkTube