zhangshushu15's starred repositories

ollama

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | License: Apache-2.0 | Stargazers: 10346 | Issues: 123 | Issues: 200

magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Language: Python | License: BSD-3-Clause | Stargazers: 10069 | Issues: 103 | Issues: 140

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9114 | Issues: 95 | Issues: 624

PhotoMaker

PhotoMaker

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 8577 | Issues: 98 | Issues: 122

tutorials

PyTorch tutorials.

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 7938 | Issues: 177 | Issues: 768

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7273 | Issues: 84 | Issues: 1488

gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.

Language: C++ | License: Apache-2.0 | Stargazers: 5670 | Issues: 37 | Issues: 72

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5539 | Issues: 46 | Issues: 73

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4139 | Issues: 55 | Issues: 121

highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Language: C++ | License: Apache-2.0 | Stargazers: 3919 | Issues: 45 | Issues: 362

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook | License: MIT | Stargazers: 3524 | Issues: 71 | Issues: 94

T2I-Adapter

T2I-Adapter

Language: Python | License: Apache-2.0 | Stargazers: 3273 | Issues: 40 | Issues: 107

Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)

Language: Python | License: Apache-2.0 | Stargazers: 2879 | Issues: 34 | Issues: 136

DeepDanbooru

AI-based multi-label girl image classification system, implemented using TensorFlow.

Language: Python | License: MIT | Stargazers: 2524 | Issues: 37 | Issues: 90

swift-coreml-diffusers

Swift app demonstrating Core ML Stable Diffusion

Language: Swift | License: Apache-2.0 | Stargazers: 2429 | Issues: 40 | Issues: 62

mixtral-offloading

Run Mixtral-8x7B models in Colab or on consumer desktops

Language: Python | License: MIT | Stargazers: 2265 | Issues: 30 | Issues: 26

gemma

Open weights LLM from Google DeepMind.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2175 | Issues: 33 | Issues: 25

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | Stargazers: 1793 | Issues: 19 | Issues: 77

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 1578 | Issues: 25 | Issues: 48

FreeU

FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1279 | Issues: 23 | Issues: 6

MOSS-RLHF

MOSS-RLHF

Language: Python | License: Apache-2.0 | Stargazers: 1200 | Issues: 33 | Issues: 50

summarize-from-feedback

Code for "Learning to summarize from human feedback"

Language: Python | License: NOASSERTION | Stargazers: 964 | Issues: 147 | Issues: 21

improved-aesthetic-predictor

CLIP+MLP Aesthetic Score Predictor

Language: Python | License: Apache-2.0 | Stargazers: 766 | Issues: 6 | Issues: 10

Language: Python | License: Apache-2.0 | Stargazers: 765 | Issues: 12 | Issues: 34

tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Language: Python | License: MIT | Stargazers: 682 | Issues: 15 | Issues: 58

StyleSelectorXL

This repository contains an Automatic1111 extension that allows users to select and apply different styles to their inputs using SDXL 1.0.

aesthetic-predictor

A linear estimator on top of CLIP to predict the aesthetic quality of pictures

Language: Jupyter Notebook | License: MIT | Stargazers: 403 | Issues: 13 | Issues: 6

ava_downloader

⏬ Download AVA dataset (A Large-Scale Database for Aesthetic Visual Analysis)