jkoprax

jkoprax's starred repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 33525 | Issues: 0

suno-api

Use an API to call suno.ai's music generation AI and easily integrate it into agents like GPTs.

Language: TypeScript | License: LGPL-3.0 | Stargazers: 830 | Issues: 0

UdioWrapper

UdioWrapper is a Python package that enables the generation of music tracks using Udio's API through textual prompts. This package is based on the reverse engineering of the Udio API (https://www.udio.com/) and is not officially endorsed by Udio.

Language: Python | License: MIT | Stargazers: 69 | Issues: 0

EEND

End-to-End Neural Diarization

Language: Python | License: MIT | Stargazers: 359 | Issues: 0

SpectralCluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

Language: Python | License: Apache-2.0 | Stargazers: 498 | Issues: 0

transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

Language: Python | License: GPL-3.0 | Stargazers: 610 | Issues: 0

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | Stargazers: 874 | Issues: 0

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | Stargazers: 875 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 5416 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8049 | Issues: 0

DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Language: C++ | License: MPL-2.0 | Stargazers: 24634 | Issues: 0

opensmile

The Munich Open-Source Large-Scale Multimedia Feature Extractor

Language: C++ | License: NOASSERTION | Stargazers: 538 | Issues: 0

pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Language: Python | Stargazers: 2359 | Issues: 0

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 13883 | Issues: 0

accent_rating

A collection of scripts and data I used when working on my dissertation

Language: Python | License: GPL-3.0 | Stargazers: 1 | Issues: 0

idvoice-gpt-android-demo

IDVoice + ChatGPT Android demo app

Language: Kotlin | License: MIT | Stargazers: 1 | Issues: 0

FishBoardMix

The FishBoardMix corpus is designed to explore Speaker-Age estimation technology.

Language: Shell | License: Apache-2.0 | Stargazers: 1 | Issues: 0

Language: Python | Stargazers: 3 | Issues: 0

idvoice-gpt-ios-demo

IDVoice + ChatGPT iOS demo app

Language: Swift | License: MIT | Stargazers: 6 | Issues: 0

sr_labs_book

The project is related to the development of labs for the ITMO Speaker Recognition Course.

Language: Jupyter Notebook | Stargazers: 10 | Issues: 0

voiceprint

Voice biometric authentication PAM module for Linux

Language: Python | License: AGPL-3.0 | Stargazers: 40 | Issues: 0

Voice-Authentication-CNN

Voice authentication system implementation using Python

Language: Python | Stargazers: 31 | Issues: 0

VoiceSens

A Voice Biometric Application using Watson Speech to Text

Language: JavaScript | License: Apache-2.0 | Stargazers: 71 | Issues: 0

semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps

Language: C# | License: MIT | Stargazers: 19242 | Issues: 0

extension

WIP: Open WebUI Chrome Extension (Requires Open WebUI v0.2.0+)

Language: Svelte | Stargazers: 28 | Issues: 0

assistant

No longer actively being worked on; please use https://github.com/open-webui/extension instead.

Language: TypeScript | Stargazers: 25 | Issues: 0

signal-cli-rest-api

Dockerized Signal Messenger REST API

Language: Go | License: MIT | Stargazers: 1163 | Issues: 0

voice-overlay-android

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

Language: Kotlin | License: MIT | Stargazers: 248 | Issues: 0

SISinusWaveView

A Siri-like voice input visualizer using EZAudio.

Language: Objective-C | Stargazers: 274 | Issues: 0

wearable-reply

Simplify text input for Android Wear 2.0, by voice, keyboard, or canned response.

Language: Java | License: NOASSERTION | Stargazers: 120 | Issues: 0