Tianheng Cheng's starred repositories
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Chinese-LLaMA-Alpaca-3
Phase three of the Chinese LLaMA-Alpaca large language model project (Chinese Llama-3 LLMs), developed from Meta Llama 3
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
Grounding-DINO-1.5-API
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
tiny-flash-attention
Flash attention tutorial written in Python, Triton, CUDA, and CUTLASS
Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models