Adam (adamfils)

Company: @Google

Location: San Francisco, CA

Adam's starred repositories

OpenVoice

Instant voice cloning by MIT and MyShell.

Language: Python | License: MIT | Stargazers: 28244 | Issues: 211 | Issues: 227

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | License: Apache-2.0 | Stargazers: 10741 | Issues: 125 | Issues: 217

yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Language: Python | License: GPL-3.0 | Stargazers: 8788 | Issues: 55 | Issues: 502

EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

mmagic

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models for text-to-image generation, image/video restoration/enhancement, etc.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6842 | Issues: 97 | Issues: 707

dejavu

Audio fingerprinting and recognition in Python

Language: Python | License: MIT | Stargazers: 6397 | Issues: 262 | Issues: 242

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Language: Python | License: NOASSERTION | Stargazers: 5354 | Issues: 74 | Issues: 197

MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

Language: Python | License: MIT | Stargazers: 4342 | Issues: 39 | Issues: 158

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | Stargazers: 4337 | Issues: 45 | Issues: 189

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | Stargazers: 4252 | Issues: 39 | Issues: 423

invisible-watermark

Python library for invisible image watermark (blind image watermark)

Language: Python | License: MIT | Stargazers: 1563 | Issues: 16 | Issues: 29

rq-scheduler

A lightweight library that adds job scheduling capabilities to RQ (Redis Queue)

Language: Python | License: MIT | Stargazers: 1424 | Issues: 42 | Issues: 180

Speech-Emotion-Analyzer

The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)

Language: Jupyter Notebook | License: MIT | Stargazers: 1287 | Issues: 36 | Issues: 62

minisora

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Language: Python | License: Apache-2.0 | Stargazers: 1153 | Issues: 18 | Issues: 62

Real3DPortrait

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code

Language: Python | License: MIT | Stargazers: 870 | Issues: 24 | Issues: 74

NeuS2

[ICCV 2023] Official code for NeuS2

Language: Cuda | License: NOASSERTION | Stargazers: 612 | Issues: 22 | Issues: 80

AnimateLCM

AnimateLCM: Let's Accelerate the Video Generation within 4 Steps!

Language: Python | License: MIT | Stargazers: 566 | Issues: 29 | Issues: 33

cp-vton

Reimplemented code for "Toward Characteristic-Preserving Image-based Virtual Try-On Network"

Language: Python | License: MIT | Stargazers: 474 | Issues: 16 | Issues: 44

M2UGen

This is the official repository for M2UGen

Language: Jupyter Notebook | License: MIT | Stargazers: 436 | Issues: 10 | Issues: 11

OpenGPT

A framework for creating grounded instruction based datasets and training conversational domain expert Large Language Models (LLMs).

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 328 | Issues: 9 | Issues: 6

vampnet

music generation with masked transformers!

Language: Jupyter Notebook | License: MIT | Stargazers: 288 | Issues: 8 | Issues: 34

frechet-audio-distance

A lightweight library for Frechet Audio Distance calculation.

Language: Python | License: MIT | Stargazers: 229 | Issues: 2 | Issues: 13

stable-audio-metrics

Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.

Language: Python | License: MIT | Stargazers: 135 | Issues: 3 | Issues: 0

Diffstyler

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization

Language: Jupyter Notebook | Stargazers: 129 | Issues: 1 | Issues: 8

MORPHEUS-1

Implementation of "MORPHEUS-1" from Prophetic AI and "The world’s first multi-modal generative ultrasonic transformer designed to induce and stabilize lucid dreams."

Language: Python | License: MIT | Stargazers: 126 | Issues: 7 | Issues: 2

Self-Cascade

[ECCV2024] Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | Stargazers: 54 | Issues: 1 | Issues: 0

music-text-representation-pp

Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR++) [ICASSP24]

invisible-watermark

Python library for invisible image watermark (blind image watermark)

Language: Python | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0