Florian Stègre's starred repositories

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | Stargazers: 1814 | Issues: 0

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | Stargazers: 1125 | Issues: 0

pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

Language: Python | License: Apache-2.0 | Stargazers: 27420 | Issues: 0

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | Stargazers: 51443 | Issues: 0

screenshot-to-code

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Language: Python | License: MIT | Stargazers: 54392 | Issues: 0

metavoice-src

Foundational model for human-like, expressive TTS

Language: Python | License: Apache-2.0 | Stargazers: 3450 | Issues: 0

Language: TypeScript | License: Apache-2.0 | Stargazers: 7 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

License: BSD-4-Clause | Stargazers: 26 | Issues: 0

Fooocus

Focus on prompting and generating

Language: Python | License: GPL-3.0 | Stargazers: 37894 | Issues: 0

llamafile

Distribute and run LLMs with a single file.

Language: C++ | License: NOASSERTION | Stargazers: 16763 | Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10509 | Issues: 0

deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Language: Python | License: MIT | Stargazers: 10823 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 21935 | Issues: 0

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 31699 | Issues: 0

Deepfake-using-Wave2Lip

A deep learning model to lip-sync a given video with any given audio. It uses GAN architecture to orchestrate loss reconstruction or training.

Language: Jupyter Notebook | License: MIT | Stargazers: 100 | Issues: 0

sd-wav2lip-uhq

Wav2Lip UHQ extension for Automatic1111

Language: Python | License: Apache-2.0 | Stargazers: 1156 | Issues: 0

DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes.

Language: Python | License: GPL-3.0 | Stargazers: 46210 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 5455 | Issues: 0

inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python | License: MIT | Stargazers: 717 | Issues: 0

Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

Language: Python | License: MIT | Stargazers: 20705 | Issues: 0

GFPGAN

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Language: Python | License: NOASSERTION | Stargazers: 35043 | Issues: 0

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python | License: MIT | Stargazers: 16446 | Issues: 0

so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Language: Python | License: NOASSERTION | Stargazers: 8520 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.

Language: Python | Stargazers: 9702 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-4-Clause | Stargazers: 9914 | Issues: 0

MaskFreeVIS

Mask-Free Video Instance Segmentation [CVPR 2023]

Language: Python | License: Apache-2.0 | Stargazers: 354 | Issues: 0

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | Stargazers: 887 | Issues: 0

GHunt

🕵️‍♂️ Offensive Google framework.

Language: Python | License: NOASSERTION | Stargazers: 15194 | Issues: 0

pysot

SenseTime Research platform for single object tracking, implementing algorithms like SiamRPN and SiamMask.

Language: Python | License: Apache-2.0 | Stargazers: 4395 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 63888 | Issues: 0