Haider Asad (haiderasad)

Location: Pakistan

Haider Asad's repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

camelot

Camelot: PDF Table Extraction for Humans

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

CodeFormer

[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

deepdoctection

A Repo For Document AI

License: Apache-2.0 | Stargazers: 0 | Issues: 0

doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

faster-whisper

Faster Whisper transcription with CTranslate2

License: MIT | Stargazers: 0 | Issues: 0

google-research

Google Research

License: Apache-2.0 | Stargazers: 0 | Issues: 0

image-matching-webui

🤗 image matching toolbox webui

Stargazers: 0 | Issues: 0

Lip_Wise

LipWise is a video dubbing tool that uses optimized inference for Wav2Lip, together with models such as GFPGAN and CodeFormer. These models integrate the new audio with the lip movements of the reference video to produce a natural, realistic final output.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

modal-examples

Examples of programs built using Modal

License: MIT | Stargazers: 0 | Issues: 0

multilingual_kws

Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus

Stargazers: 0 | Issues: 0

mycroft-precise

A lightweight, simple-to-use, RNN wake word listener

License: Apache-2.0 | Stargazers: 0 | Issues: 0

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded and IoT devices)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

PronouncUR

PronouncUR: An Urdu Pronunciation Lexicon Generator

License: MIT | Stargazers: 0 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

License: MIT | Stargazers: 0 | Issues: 0

quillman

A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.

License: MIT | Stargazers: 0 | Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

License: MIT | Stargazers: 0 | Issues: 0

speaker-transcription

Transcription with speaker diarization pipeline

License: MIT | Stargazers: 0 | Issues: 0

tabular_data_extraction

A repo that uses document table extraction models and serves them as a standalone API

Language: Python | Stargazers: 0 | Issues: 0

text2speech

Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023

Stargazers: 0 | Issues: 0

Turkish-Text-to-Speech

Speech synthesis (TTS) in low-resource languages by training from scratch with Fastpitch and fine-tuning with HifiGan

Stargazers: 0 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

License: MIT | Stargazers: 0 | Issues: 0

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

License: NOASSERTION | Stargazers: 0 | Issues: 0

vall-e

An unofficial PyTorch implementation of the audio LM VALL-E

License: MIT | Stargazers: 0 | Issues: 0

VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

License: MIT | Stargazers: 0 | Issues: 0

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

License: BSD-2-Clause | Stargazers: 0 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

License: MIT | Stargazers: 0 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

License: BSD-4-Clause | Stargazers: 0 | Issues: 0