jkoprax's starred repositories
UdioWrapper
UdioWrapper is a Python package that generates music tracks from textual prompts using Udio's API. The package is based on reverse engineering the Udio API (https://www.udio.com/) and is not officially endorsed by Udio.
SpectralCluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
DeepSpeech
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
accent_rating
A collection of scripts and data I used when working on my dissertation
idvoice-gpt-android-demo
IDVoice + ChatGPT Android demo app
FishBoardMix
The FishBoardMix corpus is designed to explore speaker age estimation technology.
idvoice-gpt-ios-demo
IDVoice + ChatGPT iOS demo app
sr_labs_book
Labs developed for the ITMO Speaker Recognition course.
voiceprint
Voice biometric authentication PAM module for Linux
Voice-Authentication-CNN
Voice authentication system implementation using Python
semantic-kernel
Integrate cutting-edge LLM technology quickly and easily into your apps
signal-cli-rest-api
Dockerized Signal Messenger REST API
voice-overlay-android
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
SISinusWaveView
A Siri-like voice input visualizer using EZAudio.
wearable-reply
Simplify text input for Android Wear 2.0, by voice, keyboard, or canned response.