Ocean-627

Xinhao Xu's starred repositories

llama.cpp

LLM inference in C/C++

Language: C++ · MIT · 67341 551 3958

yolov10

YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]

Language: Python · AGPL-3.0 · 9877 51 411

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python · MIT · 7428 46 1046

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python · MIT · 6643 65 82

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python · GPL-3.0 · 4617 39 450

mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Language: Python · MIT · 2309 30 229

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1498 46 4

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

MIT · 970 42 6

LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Language: Python · Apache-2.0 · 728 14 109

arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Language: Jupyter Notebook · Apache-2.0 · 626 7 27

LongLM

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Language: Python · MIT · 610 10 37

Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

436 6 7

LEval

[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark

Language: Python · GPL-3.0 · 355 4 17

ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python · Apache-2.0 · 351 7 21

TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Language: Python · BSD-3-Clause · 283 5 45

FastV

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Language: Python · 267 3 29

long-llms-learning

A repository sharing the literatures about long-context large language models, including the methodologies and the evaluation benchmarks

Language: Jupyter Notebook · 253 8 2

LongAlign

[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs

Language: Python · Apache-2.0 · 211 8 11

Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

193 10 1

CEPE

[ACL 2024] Long-Context Language Modeling with Parallel Encodings

Language: Python · MIT · 141 5 5

SCLIP

Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference

Language: Python · 126 4 16

amazon-sagemaker-ground-truth-task-uis

Example task UIs for Amazon SageMaker Ground Truth

Language: HTML · MIT-0 · 108 10 13

ControlMLLM

[NeurIPS2024] Repo for the paper `ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models'

Language: Python · Apache-2.0 · 85 3 5

HA-DPO

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization

Language: Python · Apache-2.0 · 63 4 8

PAI

[ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs

Language: Python · MIT · 62 2 4

hallucination-foundation-model-survey

A Survey of Hallucination in Large Foundation Models

50 10

Q-LLM

This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"

Language: Python · 38 1 3

VideoHallucer

VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)

Language: Python · MIT · 22 5 2

gist-icl

Repository for "GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks", NAACL'25 Best Student Paper.

Language: Python · 3 1 2

StadiumsTHU

Project

Language: JavaScript · 1 2 28