edwincheong (edwin-19)

Location: Malaysia

edwincheong's starred repositories

vscode

Visual Studio Code

Language: TypeScript | License: MIT | Stargazers: 162722 | Issues: 3280 | Issues: 183366

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Language: TypeScript | License: NOASSERTION | Stargazers: 46259 | Issues: 345 | Issues: 3866

GPT-SoVITS

One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python | License: MIT | Stargazers: 32939 | Issues: 200 | Issues: 1207

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 27221 | Issues: 225 | Issues: 4540

fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 15898 | Issues: 105 | Issues: 818

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | Stargazers: 12352 | Issues: 89 | Issues: 352

txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python | License: Apache-2.0 | Stargazers: 8738 | Issues: 85 | Issues: 750

FastUI

Build better UIs faster.

Language: Python | License: MIT | Stargazers: 8102 | Issues: 64 | Issues: 211

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 7238 | Issues: 63 | Issues: 150

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 6798 | Issues: 49 | Issues: 211

gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.

Language: C++ | License: Apache-2.0 | Stargazers: 5925 | Issues: 40 | Issues: 85

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Language: Python | License: BSD-3-Clause | Stargazers: 5536 | Issues: 63 | Issues: 98

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5241 | Issues: 39 | Issues: 37

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4473 | Issues: 58 | Issues: 152

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook | License: MIT | Stargazers: 3786 | Issues: 76 | Issues: 103

WhisperKit

On-device Speech Recognition for Apple Silicon

Language: Swift | License: MIT | Stargazers: 3154 | Issues: 28 | Issues: 114

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | Stargazers: 1167 | Issues: 56 | Issues: 52

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Language: Python | License: MIT | Stargazers: 767 | Issues: 33 | Issues: 46

Conv-TasNet

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation (PyTorch implementation)

Awesome-Document-Image-Rectification

A comprehensive list of awesome document image rectification papers.

VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

nanoowl

A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.

Language: Python | License: Apache-2.0 | Stargazers: 232 | Issues: 4 | Issues: 27

torchseg

Segmentation models with pretrained backbones. PyTorch.

Language: Python | License: MIT | Stargazers: 98 | Issues: 6 | Issues: 21

ppgs

High-Fidelity Neural Phonetic Posteriorgrams

Language: Python | License: MIT | Stargazers: 76 | Issues: 8 | Issues: 13

OpenPhonemizer

An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.

Language: Python | License: BSD-3-Clause-Clear | Stargazers: 75 | Issues: 4 | Issues: 6

appjsonify

A handy PDF-to-JSON conversion tool for academic papers implemented in Python.

Language: Python | License: MIT | Stargazers: 52 | Issues: 3 | Issues: 3

Aty-TTS

Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech