zhangshushu15's starred repositories
Tune-A-Video
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
AnimateDiff
Official implementation of AnimateDiff.
Mr.-Ranedeer-AI-Tutor
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
consistency_models
Official repo for consistency models.
conceptual-12m
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
style2paints
sketch + style = paints :art: (ACM TOG 2018 / SIGGRAPH Asia 2018)
ControlNet-v1-1-nightly
Nightly release of ControlNet 1.1
distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
latent-consistency-model
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
Make-An-Audio
PyTorch implementation of Make-An-Audio (ICML 2023), a text-to-audio generative model.
Mubert-Text-to-Music
A simple notebook demonstrating prompt-based music generation via the Mubert API.
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
generative-models
Generative Models by Stability AI
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ.
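The idea behind 4-bit weight quantization can be illustrated with a minimal round-to-nearest sketch in NumPy. Note this is only an illustration of storing weights in the signed int4 range with a per-group scale; GPTQ itself uses second-order (Hessian-based) error correction rather than plain rounding, and the `group_size` value here is an arbitrary choice for the example.

```python
import numpy as np

def quantize_4bit(w, group_size=4):
    # Round-to-nearest symmetric quantization to the int4 range [-8, 7],
    # with one float scale per group of `group_size` weights.
    w = np.asarray(w, dtype=np.float32).reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    # Recover approximate float weights from int4 codes and per-group scales.
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.array([0.12, -0.5, 0.33, 0.07, 1.2, -0.9, 0.0, 0.45], dtype=np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
```

Each group's maximum reconstruction error is bounded by half its scale, which is why larger groups (GPTQ commonly uses 128) trade a little accuracy for less scale-storage overhead.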
openai-python
The official Python library for the OpenAI API
ColossalAI
Making large AI models cheaper, faster, and more accessible.