Repositories under the voice-processing topic:
A library for real-time voice processing in web browsers
AI-powered disaster response platform with offline-first architecture using Gemma 3n. Provides computer vision hazard detection, voice analysis with emergency keywords, PDF report generation, and multi-user coordination - all working without internet access.
A comprehensive AI companion leveraging advanced semantic analysis, sentiment detection, and voice processing to provide personalized and context-aware interactions using Autogen, semantic-router, and VoiceProcessingToolkit.
A cutting-edge AI-powered phone agent designed for seamless voice interactions, dynamic data handling, and scalable communication. Perfect for modern sales and customer engagement solutions.
Live microphone quality detection system for the browser, written in JavaScript
The VoiceProcessingToolkit is an all-encompassing suite designed for sophisticated voice detection, wake word recognition, text-to-speech synthesis, and advanced audio processing. It offers intuitive interfaces to streamline the integration of voice processing capabilities into your applications
A Telegram bot that processes voice messages using Sber's speech recognition API. This bot converts audio formats, generates authentication tokens, and transcribes voice messages into text, enabling seamless communication via Telegram.
This repository is a submission for problem statement 2 of the OPEN AI NLP hackathon. The objective is to classify voice recordings of call center proceedings, treated as consumer complaints, into the given categories of the automotive industry.
An audio signal processing project that detects speaker gender from recorded voice samples and enhances speech using spectral subtraction techniques in MATLAB.
Timbre transfer that turns an R2D2-like robot voice into an instrument sound using a diffusion model
🖼️ Framed Picture: a cloud-based smart photo frame with voice activation, paired with an Android app
Coursework 1 of the Voice Signal Processing course at ITBA. Real-time LPC Vocoder written in Python
Final project of Signals & Systems, Spring 1401
Web application that identifies animals from their sounds. Currently restricted to binary classification between cat and dog sounds.
🎧 Transcribe any audio to text in seconds using OpenAI Whisper — right in Google Colab. No setup needed! Upload your MP3, WAV, M4A, or FLAC file and get accurate, multilingual transcriptions powered by Whisper’s medium model — all free in the cloud. ☁️
Digital signal processing course (24-25): application to voice processing.
This repository presents a comprehensive PyTorch implementation of an end-to-end Speaker Verification system, incorporating state-of-the-art deep learning architectures and language models. The system features robust speaker recognition capabilities, with specialized support for the Vietnamese
AI-powered platform for creative content generation and management, featuring advanced AI integrations, seamless accessibility, and community collaboration.
An algorithm that identifies human voice and segments it automatically. The result is compared with manual segmentation data, and an accuracy report is generated based on match rate, insertion rate, and omission rate.
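
The last entry above names three metrics without defining them. As a rough illustration only, here is one way such a report could be computed in Python. The definitions are assumptions, not taken from that repository: a manual segment counts as matched when automatic segments cover at least `min_overlap` of its length, unmatched manual segments are omissions, and automatic segments with no such overlap against the manual data are insertions. The names `segmentation_report` and `min_overlap` are hypothetical.

```python
from typing import Dict, List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def overlap(a: Segment, b: Segment) -> float:
    """Length in seconds of the intersection of two segments (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def is_hit(seg: Segment, others: List[Segment], min_overlap: float) -> bool:
    """True if some segment in `others` covers at least `min_overlap` of `seg`."""
    length = seg[1] - seg[0]
    if length <= 0:
        return False
    return any(overlap(seg, o) / length >= min_overlap for o in others)


def segmentation_report(
    auto: List[Segment], manual: List[Segment], min_overlap: float = 0.5
) -> Dict[str, float]:
    """Compare automatic segments with manual ones (assumed metric definitions)."""
    matched = sum(is_hit(m, auto, min_overlap) for m in manual)
    inserted = sum(not is_hit(a, manual, min_overlap) for a in auto)
    n_manual, n_auto = len(manual), len(auto)
    return {
        # Share of manual segments that the algorithm found.
        "match_rate": matched / n_manual if n_manual else 0.0,
        # Share of manual segments that the algorithm missed.
        "omission_rate": (n_manual - matched) / n_manual if n_manual else 0.0,
        # Share of automatic segments with no manual counterpart.
        "insertion_rate": inserted / n_auto if n_auto else 0.0,
    }


if __name__ == "__main__":
    manual = [(0.0, 1.0), (2.0, 3.0), (5.0, 6.0)]
    auto = [(0.1, 0.9), (2.4, 3.5), (8.0, 9.0)]
    print(segmentation_report(auto, manual))
```

In this made-up example two of the three manual segments are found, one is missed, and one automatic segment has no manual counterpart. Other definitions (for example, boundary-based matching with a time tolerance) are equally plausible and would give different numbers.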