virtualflo

Florian Stègre's starred repositories

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | 181400

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | 112500

pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

Language: Python | License: Apache-2.0 | 2742000

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | 5144300

screenshot-to-code

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Language: Python | License: MIT | 5439200

metavoice-src

Foundational model for human-like, expressive TTS

Language: Python | License: Apache-2.0 | 345000

homebridge-panasonic-heat-pump

Language: TypeScript | License: Apache-2.0 | 700

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

License: BSD-4-Clause | 2600

Fooocus

Focus on prompting and generating

Language: Python | License: GPL-3.0 | 3789400

llamafile

Distribute and run LLMs with a single file.

Language: C++ | License: NOASSERTION | 1676300

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | 1050900

deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Language: Python | License: MIT | 1082300

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | 2193500

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | 3169900

Deepfake-using-Wave2Lip

A deep learning model to lip-sync a given video with any given audio. It uses GAN architecture to orchestrate loss reconstruction or training.

Language: Jupyter Notebook | License: MIT | 10000

sd-wav2lip-uhq

Wav2Lip UHQ extension for Automatic1111

Language: Python | License: Apache-2.0 | 115600

DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes.

Language: Python | License: GPL-3.0 | 4621000

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | 545500

inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python | License: MIT | 71700

Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

Language: Python | License: MIT | 2070500

GFPGAN

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Language: Python | License: NOASSERTION | 3504300

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python | License: MIT | 1644600

so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Language: Python | License: NOASSERTION | 852000

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, try Sync Labs.

Language: Python | 970200

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-4-Clause | 991400

MaskFreeVIS

Mask-Free Video Instance Segmentation [CVPR 2023]

Language: Python | License: Apache-2.0 | 35400

diart

A python package to build AI-powered real-time audio applications

Language: Python | License: MIT | 88700

GHunt

🕵️‍♂️ Offensive Google framework.

Language: Python | License: NOASSERTION | 1519400

pysot

SenseTime Research platform for single object tracking, implementing algorithms like SiamRPN and SiamMask.

Language: Python | License: Apache-2.0 | 439500

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 6388800