Sutirtha Chakraborty's repositories
python-audio-separator
Easy-to-use vocal separation from the CLI or as a Python package, using the amazing MDX-Net models from UVR trained by @Anjok07
AI-Dance-based-on-Human-Pose-Estimation
Human Pose Estimation using Deep Learning.
AICoverGen
A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
Banging-interaction
Banging interaction: A ubimus-design strategy for the musical internet
BlurFaceRealtime
Real-time face blurring for multiple people
doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Funny-Application
This is a repository to try out funny things.
GFPGAN
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for rapid and reproducible ML experimentation with best practices. ⚡🔥⚡
Melody-extraction-with-melodic-segnet
The source code of "A Streamlined Encoder/Decoder Architecture for Melody Extraction"
mind-vis
Code base for MinD-Vis
Multimodal-Synchronization-in-Musical-Ensembles
Investigating Audio and Visual Cues
Music-Source-Separation-Training
Repository for training models for music source separation.
musicinformationretrieval.com
Instructional notebooks on music information retrieval.
PaddleDetection
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
stable-ts
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
SutirthaChakraborty
My CV website
VirtualConductor
Competition data for the action cognition track of the first international "Yuanjian Cup" (远见杯) Meta-Intelligence Data Challenge
Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages