zhangshushu15's starred repositories
magic-animate
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
PhotoMaker
PhotoMaker
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
T2I-Adapter
T2I-Adapter
Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
DeepDanbooru
AI-based multi-label girl image classification system, implemented using TensorFlow.
swift-coreml-diffusers
Swift app demonstrating Core ML Stable Diffusion
mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
summarize-from-feedback
Code for "Learning to summarize from human feedback"
improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
StyleSelectorXL
This repository contains an Automatic1111 extension that allows users to select and apply different styles to their inputs using SDXL 1.0.
aesthetic-predictor
A linear estimator on top of CLIP to predict the aesthetic quality of pictures
ava_downloader
Download the AVA dataset (A Large-Scale Database for Aesthetic Visual Analysis)