Beast code in Giters

jkoprax's starred repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | 3352500

suno-api

Use an API to call the music generation AI of suno.ai and easily integrate it into agents such as GPTs.

Language: TypeScript | License: LGPL-3.0 | 83000

UdioWrapper

UdioWrapper is a Python package that generates music tracks from textual prompts through Udio's API. The package is based on reverse engineering of the Udio API (https://www.udio.com/) and is not officially endorsed by Udio.

Language: Python | License: MIT | 6900

EEND

End-to-End Neural Diarization

Language: Python | License: MIT | 35900

SpectralCluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

Language: Python | License: Apache-2.0 | 49800

transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

Language: Python | License: GPL-3.0 | 61000

3D-Speaker

A repository for single- and multi-modal speaker verification, speaker recognition, and speaker diarization

Language: Python | License: Apache-2.0 | 87400

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | 87500

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | 541600

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | 804900

DeepSpeech

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

Language: C++ | License: MPL-2.0 | 2463400

opensmile

The Munich Open-Source Large-Scale Multimedia Feature Extractor

Language: C++ | License: NOASSERTION | 53800

pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Language: Python | 235900