zhanghaonan777's starred repositories
tensorrtllm_backend
The Triton TensorRT-LLM Backend
llm-inference-benchmark
LLM Inference benchmark
VideoMamba
VideoMamba: State Space Model for Efficient Video Understanding
ABigSurveyOfLLMs
A collection of 150+ surveys on LLMs
Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
gemma_pytorch
The official PyTorch implementation of Google's Gemma models
GroundingGPT
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
SoraReview
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
jsonformer
A Bulletproof Way to Generate Structured JSON from Language Models
OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
TAAC-2021-Task2-Rank6
2021 Tencent Advertising Algorithm Competition, Track 2, 6th place in the finals
multimodal-knowledge-graph
A collection of resources on multimodal knowledge graph, including datasets, papers and contests.
pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Awesome-Multimodality
A survey of multimodal learning research.
Multimodal-AND-Large-Language-Models
A paper list on multimodal and large language models, used only to record papers I read from the daily arXiv for personal reference.
Visual-Chinese-LLaMA-Alpaca
Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)