Samsul Rahmadani (munggok)'s repositories
baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
dps
Data processing system for polyglot
gensim
Topic Modelling for Humans
MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
MyGirlGPT
MyGirlGPT is a project to build your own AI girlfriend, running on your personal server with a local LLM.
NeMo
NeMo: a toolkit for conversational AI
nusa-crowd
A collaborative project to collect datasets in Indonesian languages.
olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
sailor-llm
Sailor: Open Language Models for South-East Asia
so-vits-svc-5.0
Core Engine of Singing Voice Conversion & Singing Voice Clone
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
VocalForge
Your one-stop solution for voice dataset creation
voice-cloning-collab
An improved version of Real-time-voice-cloning
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)