jack139

Amoy, China

https://jack139.top

比你笨's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 65498 5480

ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | 44099 348 2620

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | 35998 348 1736

MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

Language: Python | License: NOASSERTION | 34675 310 877

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | 34077 318 425

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python | License: AGPL-3.0 | 25000 174 130

Wav2Lip

This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs

Language: Python | 9915 162 641

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | 6202 70 234

LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

Time-Series-Library

A Library for Advanced Deep Time Series Models.

Language: Python | License: MIT | 5469 63 425

DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Language: Python | License: MIT | 4208 43 100

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | 3979 91 1007

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | 3955 48 841

GeneFace

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code

Language: Python | License: MIT | 2476 50 281

protocol

Specification of the Farcaster Protocol

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 1301 25 62

fastsdcpu

Fast stable diffusion on CPU

Language: Python | License: MIT | 1040 19 138

deep_learning_and_the_game_of_go

Code and other material for the book "Deep Learning and the Game of Go"

Language: Python | 959 75 87

ER-NeRF

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Language: Python | License: MIT | 930 16 152

RAD-NeRF

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition

Language: Python | License: MIT | 866 30 94

awesome_talking_face_generation

betago

BetaGo: AlphaGo for the masses, live on GitHub.

Language: Python | License: MIT | 675 56 28

Yuan-2.0

Yuan 2.0 Large Language Model

Language: Python | License: NOASSERTION | 674 5 91

representation-engineering

Representation Engineering: A Top-Down Approach to AI Transparency

Language: Jupyter Notebook | License: MIT | 661 29 41

Lipreading-DenseNet3D

DenseNet3D Model In "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild", https://arxiv.org/abs/1810.06990

Language: Python | 118 6 12

copycat

Modern port of Melanie Mitchell's and Douglas Hofstadter's Copycat

Language: Python | License: MIT | 113 140

FARGonautica

Language: Scheme | License: MIT | 112 24 8

co.py.cat

co.py.cat extends Hofstadter's, pythonically

Language: Python | License: MIT | 53 8 7

copycat

A translation of Melanie Mitchell's original Copycat project from Lisp to Python.

Language: Python | License: GPL-2.0 | 41 70

Fluid-Concepts-and-Creative-Analogies

Language: Scheme | 11 30