whai362

Wenhai Wang's starred repositories

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | 24659 208 208

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance

Language: Python | License: MIT | 4421 43 376

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | 3882 114 73

kimi-free-api

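🚀 Reverse-engineered API for the KIMI AI long-context large model, for free testing [specialty: long-text interpretation and summarization]. Supports high-speed streaming output, agent conversations, web search, long-document interpretation, image OCR, and multi-turn dialogue, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces.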

Language: TypeScript | License: GPL-3.0 | 3466 30 101

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python | License: Apache-2.0 | 3451 33 457

chatgpt-prompts-for-academic-writing

This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.

2656 34 3

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | License: Apache-2.0 | 2294 41 349

T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Language: Python | License: NOASSERTION | 2028 36 76

Monkey

[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Language: Python | License: MIT | 1596 22 98

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | 1576 21 85

MaskDINO

[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"

Language: Python | License: Apache-2.0 | 1108 34 106

automatic_prompt_engineer

Language: Python | License: MIT | 1063 16 17

draw-fast

Language: TypeScript | License: NOASSERTION | 1009 70

UniRepLKNet

[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

Language: Python | License: Apache-2.0 | 875 12 18

SAM-Med2D

Official implementation of SAM-Med2D

Language: Jupyter Notebook | License: Apache-2.0 | 808 13 63

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks

Language: Python | License: Apache-2.0 | 757 11 104

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Language: Jupyter Notebook | License: Apache-2.0 | 598 11 46

APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Language: Python | License: Apache-2.0 | 464 6 53

DCNv4

[CVPR 2024] Deformable Convolution v4

Language: Python | License: MIT | 429 7 68

MultimodalOCR

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

Language: Python | 377 13 23

DCI-VTON-Virtual-Try-On

[ACM Multimedia 2023] Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow.

Language: Python | License: MIT | 374 30 43

Vision-RWKV

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Language: Python | License: Apache-2.0 | 303 4 32

Mini-DALLE3

Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models

Language: Python | 298 4 9

LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Language: Jupyter Notebook | License: MIT | 293 4 20

UniPose

[ECCV 2024] Official implementation of the paper "UniPose: Detecting Any Keypoints"

Language: Python | License: NOASSERTION | 268 18 17

HIMLoco

Learning-based locomotion control from OpenRobotLab, including Hybrid Internal Model & H-Infinity Locomotion Control

Language: Python | License: NOASSERTION | 221 12 9

ControlLLM

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Language: Python | 177 8 5

chug

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Language: Python | License: Apache-2.0 | 139 11 3

coconut_cvpr2024

Language: Jupyter Notebook | License: Apache-2.0 | 136 4 16

StyleRF

[CVPR 2023] StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields

Language: Python | 135 4 27