Muhammad Wasim's repositories
DEEPFAKE-AUDIO
An audio deepfake is synthetic audio produced with a “cloned” voice that is potentially indistinguishable from the real person’s.
detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
gpt-discord-bot
Example Discord bot written in Python that uses the completions API to have conversations with the `text-davinci-003` model, and the moderations API to filter the messages.
mint
Multi-modal Content Creation Model Training Infrastructure including the FACT model (AI Choreographer) implementation.
Multi-Tacotron-Voice-Cloning
Phoneme multilingual (Russian-English) voice cloning based on
Neural_Voice_Cloning
Open Source Implementation of Neural Voice Cloning with Few Audio Samples (Baidu Research)
openai-openapi
OpenAPI specification for the OpenAI API
openai-python
The OpenAI Python library provides convenient access to the OpenAI API from applications written in the Python language.
ParlAI
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
PowerApps-Samples
Sample code for Power Apps, including Dataverse, model-driven apps, canvas apps, Power Apps component framework, portals, and AI Builder.
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
RealTimeVoiceCloning
Real-time voice cloning
resemble-alexa
Sample code for an Alexa skill that uses realistic voice cloning powered by Resemble AI's text-to-speech API and OpenAI’s GPT-3 engine.
resemble-unity-text-to-speech
Resemble's voice cloning engine within Unity
triton
Development repository for the Triton language and compiler
voice-changer
Time-Domain Pitch and Time Scale Modification of Speech Signal
voicemailtool
Tool for processing uncompressed voicemail attachments
Whatsapp_Clone
WhatsApp is an American freeware, cross-platform, centralized messaging and voice-over-IP service owned by Facebook, Inc. It allows users to send text messages and voice messages, make voice and video calls, and share images, documents, user locations, and other content.
whisper
Robust Speech Recognition via Large-Scale Weak Supervision