Dan's repositories

pawlyglot

Service-based Multilingual TTS and Lip-Syncing Pipeline

squim-report

Using TorchAudio-SQUIM to create dataset quality reports

Language: Python | Stargazers: 3 | Issues: 1 | Issues: 0

speech_explorer

A minimal repository for NVIDIA NeMo's Speech Explorer

Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

VoiceCraft-TTS

Minimal repository for VoiceCraft-TTS inference, built with Docker

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

chatbot-ui

An open-source ChatGPT UI.

Language: TypeScript | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

DiscreteSpeechMetrics

Reference-aware automatic speech evaluation toolkit

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

nllb-with-hqq

NLLB inference with HQQ Optimization

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

SadTalker

(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Talking-Face_PC-AVS

Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)

Language: Python | License: CC-BY-4.0 | Stargazers: 0 | Issues: 1 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

Wav2Lip-GFPGAN

High-quality lip sync

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

zeno-build

Build, evaluate, understand, and fix LLM-based apps

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

zeno-hub

AI Evaluation Platform

Language: CSS | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0