wondervictor

Huazhong University of Science and Technology

China

https://scholar.google.com/citations?user=PH8rJHYAAAAJ&hl

Organizations

HRNet

hustvl

msra-alumni

Tianheng Cheng's starred repositories

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Jupyter Notebook | License: NOASSERTION | 20466 132 220

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | 9864 64 11

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | License: MIT | 8481 80 33

yolov10

YOLOv10: Real-Time End-to-End Object Detection

Language: Python | License: AGPL-3.0 | 6859 38 168

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | 5789 67 183

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python | License: NOASSERTION | 2087 26 62

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python | License: Apache-2.0 | 1769 6 236

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Language: Python | License: Apache-2.0 | 1199 22 73

Chinese-LLaMA-Alpaca-3

Chinese LLaMA-Alpaca large language model project, phase 3 (Chinese Llama-3 LLMs), developed from Meta Llama 3

Language: Python | License: Apache-2.0 | 1040 16 41

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language: Python | 698 10 30

CMMLU

CMMLU: Measuring massive multitask language understanding in Chinese

Language: Python | 598 12 31

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | 555 18 14

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | License: Apache-2.0 | 541 16 5

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | 47200

APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Language: Python | License: Apache-2.0 | 444 6 46

LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

Language: Python | License: Apache-2.0 | 417 21 27

megalodon

Reference implementation of Megalodon 7B model

Language: Cuda | License: MIT | 397 9 6

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language: Python | 369 7 14

Vista

A Generalizable World Model for Autonomous Driving

Language: Python | License: Apache-2.0 | 26900

MM-Vet

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)

Language: Python | License: Apache-2.0 | 190 2 5

HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Language: Python | License: BSD-3-Clause | 185 4 10

DeLVM

Language: Python | 93 1 9

gated_linear_attention

Language: Python | License: MIT | 84 6 8

DiG

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Language: Python | License: MIT | 7500

tiny-flash-attention

Flash attention tutorial written in Python, Triton, CUDA, and CUTLASS

Language: Cuda | 6500

GR-1

Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"

Language: Python | License: Apache-2.0 | 57 3 7

ViG

License: MIT | 5300

CCoT

[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"

Language: Python | License: MIT | 2400

OnnxSlim

A Toolkit to Help Optimize Large ONNX Models

Language: Python | License: MIT | 1400

Linearized-LLM

[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models

License: Apache-2.0 | 200