wuziheng

Alibaba

Beijing

ZihengWu's starred repositories

grok-1

Grok open release

Language: Python | Apache-2.0 | 49208 561 202

professional-programming

A collection of learning resources for curious software engineers

Language: Python | MIT | 45985 985 28

MockingBird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | NOASSERTION | 34622 309 877

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | NOASSERTION | 23346 265 64

MediaCrawler

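Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments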

Language: Python | NOASSERTION | 15426 88 245

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python | NOASSERTION | 9912 131 48

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | MIT | 8803 82 36

yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Language: Python | GPL-3.0 | 8653 57 480

dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Language: Jupyter Notebook | Apache-2.0 | 8480 95 374

Firefly

Firefly: a training tool for large models, supporting training of Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Language: Python | 5343 54 267

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | Apache-2.0 | 5190 38 37

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | MIT | 3882 114 73

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | NOASSERTION | 2566 37 50

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | Apache-2.0 | 2287 41 349

data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python | Apache-2.0 | 1802 17 149

Monkey

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Language: Python | MIT | 1596 22 98

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | Apache-2.0 | 1576 21 85

Awesome-Video-Diffusion-Models

[Arxiv] A Survey on Video Diffusion Models

FeatUp

Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024

Language: Jupyter Notebook | MIT | 1307 19 58

EasyAnimate

📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion

Language: Python | Apache-2.0 | 867 14 53

history_rag

Language: Python | 795 3 59

Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

Language: Python | MIT | 725 71 13

animate-anything

Fine-Grained Open Domain Image Animation with Motion Guidance

Language: Python | MIT | 668 16 54

HuggingFace-Download-Accelerator

High-speed downloads from a mirror site using HuggingFace's official download tool.

Language: Python | 654 2 26

distrifuser

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Language: Python | MIT | 513 8 18

video2numpy

Optimized library for large-scale extraction of frames and audio from video.

Language: Python | MIT | 199 3 26

SMT

This is an official implementation for "Scale-Aware Modulation Meet Transformer".

Language: Python | MIT | 176 2 24

coze-beautify

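A Chrome extension that improves the bot interface of coze (which currently offers free GPT-4 access) on https://www.coze.com (international version) and https://www.coze.cn (mainland China version)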

Language: TypeScript | MIT | 6300

llm-scheduling-artifact

Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving"

Language: Python | Apache-2.0 | 46 4 1

blog

Zhang Zhenhu's blog

Language: Python | 18 3 16