edwin-19

edwincheong's starred repositories

vscode

Visual Studio Code

Language: TypeScript · License: MIT · 162722 3280 183366

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Language: TypeScript · License: NOASSERTION · 46259 345 3866

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python · License: MIT · 32939 200 1207

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · 27221 225 4540

fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.

Language: Go · License: MIT · 22985 310 448

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python · License: Apache-2.0 · 15898 105 818

fish-speech

Brand new TTS solution

Language: Python · License: NOASSERTION · 12352 89 352

txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python · License: Apache-2.0 · 8738 85 750

FastUI

Build better UIs faster.

Language: Python · License: MIT · 8102 64 211

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python · License: Apache-2.0 · 7238 63 150

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python · License: Apache-2.0 · 6798 49 211

gemma.cpp

lightweight, standalone C++ inference engine for Google's Gemma models.

Language: C++ · License: Apache-2.0 · 5925 40 85

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Language: Python · License: BSD-3-Clause · 5536 63 98

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python · License: Apache-2.0 · 5241 39 37

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python · License: MIT · 4473 58 152

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook · License: MIT · 3786 76 103

WhisperKit

On-device Speech Recognition for Apple Silicon

Language: Swift · License: MIT · 3154 28 114

HierSpeechpp

The official implementation of HierSpeech++

Language: Python · License: MIT · 1167 56 52

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Language: Python · License: MIT · 767 33 46

Conv-TasNet

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation (PyTorch implementation)

Language: Python · 415 6 54

Awesome-Document-Image-Rectification

A comprehensive list of awesome document image rectification papers.

351 14 4

VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Language: Python · 299 15 15

dataspeech

Language: Python · License: MIT · 274 13 15

nanoowl

A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.

Language: Python · License: Apache-2.0 · 232 4 27

torchseg

Segmentation models with pretrained backbones. PyTorch.

Language: Python · License: MIT · 98 6 21

ppgs

High-Fidelity Neural Phonetic Posteriorgrams

Language: Python · License: MIT · 76 8 13

OpenPhonemizer

An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.

Language: Python · License: BSD-3-Clause-Clear · 75 4 6

appjsonify

A handy PDF-to-JSON conversion tool for academic papers implemented in Python.

Language: Python · License: MIT · 52 3 3

Aty-TTS

Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech

Language: Python · 10 2 1

SpeechEmotionAVLearning

Language: HTML · 9 1 1