Faris Alasmary's repositories
psu-language-modeling-session
The code of the "Language Models and Their Applications" session
psu-sentiment-analysis-session
PSU Sentiment Analysis Session Code
adversarial-machine-learning-example
Train a CNN model on the MNIST dataset and use it to craft an adversarial example that fools the model
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
CLIP-ViL
[ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383
ctcdecode
PyTorch CTC Decoder bindings
CTDNN
MMM 2021: Crossed-Time Delay Neural Network for Speaker Recognition
DeepFilterNet
Noise suppression using deep filtering
deepspeech.pytorch
Speech Recognition using DeepSpeech2.
Face-Transformer
Face Transformer for Recognition
kaldi-serve
Server framework for Kaldi ASR Toolkit
Listen-Attend-and-Spell
PyTorch implementation of Listen, Attend and Spell (LAS) speech recognition paper
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
NeMo
NeMo: a toolkit for conversational AI
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
pydub
Manipulate audio with a simple and easy high-level interface
recurrent-memory-transformer-pytorch
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch
RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
sequitur-g2p
This is a GitHub repository of the abandonware Sequitur G2P by Bisani & Ney
Speech-Transformer
PyTorch re-implementation of Speech-Transformer
train-transformer-xl-huggingface
This repo contains a notebook that illustrates how to train Transformer-XL with the 🤗 Transformers library
transformer
PyTorch Implementation of "Attention Is All You Need"
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
vinvl-visualbackbone
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, and object detections in a few lines of Python code.
VQA-AttReg
This is an official PyTorch implementation of “Answer Questions with Right Image Regions: A Visual Attention Regularization Approach” (https://arxiv.org/abs/2102.01916).
VQVAE-Pytorch
This repo implements VQVAE on MNIST as well as on a colored version of MNIST images. It also implements a simple LSTM for generating sample numbers using the encoder outputs of the trained VQVAE