mamezy's starred repositories
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
2021-ISMIR-MSS-Challenge-CWS-PResUNet
Music source separation: training, evaluation, and inference pipelines and the pretrained models we used for the 2021 ISMIR MDX Challenge.
song-solver
A Python application that allows users to sing in front of their laptop's microphone, processes the recording using the Whisper API, and then leverages a Large Language Model (LLM) to recognize the song.
basic-pitch
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
python-speech-recognition-course
Python Speech Recognition Course
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
audio-transcription-bot
Audio Transcription WhatsApp Bot using Whisper
BasicAutoTranscriptionRepo
Basics of Pitch Estimation and Automatic Music Transcription
dtw-python
Python port of R's comprehensive Dynamic Time Warping (DTW) algorithms package