farisalasmary

Faris Alasmary's repositories

psu-language-modeling-session

The code of the "Language Models and Their Applications" session

Language: Jupyter Notebook | License: MIT | 9 10

psu-sentiment-analysis-session

PSU Sentiment Analysis Session Code

Language: Jupyter Notebook | License: MIT | 4 10

sbvqa2.0

The official implementation of the paper: SBVQA 2.0: Robust End-to-End Speech-Based Visual Question Answering for Open-Ended Questions

Language: Python | License: MIT | 4 20

shieldrnn

The implementation of ShieldRNN

Language: Python | License: MIT | 3 10

adversarial-machine-learning-example

Train a CNN model on the MNIST dataset and use it to craft an adversarial example that fools the model

Language: Jupyter Notebook | License: MIT | 2 10

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | License: BSD-3-Clause | 000

bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Language: Jupyter Notebook | License: MIT | 000

CLIP-ViL

[ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383

Language: Python | License: MIT | 000

ctcdecode

PyTorch CTC Decoder bindings

Language: C++ | License: MIT | 000

CTDNN

MMM 2021: Crossed-Time Delay Neural Network for Speaker Recognition

Language: Python | License: MIT | 000

DeepFilterNet

Noise suppression using deep filtering

Language: Python | License: NOASSERTION | 000

deepspeech.pytorch

Speech Recognition using DeepSpeech2.

Language: Python | License: MIT | 000

Face-Transformer

Face Transformer for Recognition

Language: Python | License: MIT | 000

kaldi-serve

Server framework for Kaldi ASR Toolkit

Language: C++ | License: Apache-2.0 | 000

Listen-Attend-and-Spell

PyTorch implementation of Listen, Attend and Spell (LAS) speech recognition paper

Language: Python | 000

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

License: MIT | 000

NeMo

NeMo: a toolkit for conversational AI

Language: Python | License: Apache-2.0 | 000

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Python | License: MIT | 000

pydub

Manipulate audio with a simple and easy high level interface

Language: Python | License: MIT | 000

recurrent-memory-transformer-pytorch

Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch

Language: Python | License: MIT | 000

RegionCLIP

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

Language: Python | License: Apache-2.0 | 000

sequitur-g2p

This is a GitHub repository of the abandonware Sequitur G2P by Bisani & Ney

License: GPL-2.0 | 000

Speech-Transformer

PyTorch re-implementation of Speech-Transformer

Language: Python | License: MIT | 000

train-transformer-xl-huggingface

This repo contains a notebook that illustrates how to train Transformer-XL with the 🤗 Transformers library

Language: Jupyter Notebook | License: MIT | 010

transformer

PyTorch Implementation of "Attention Is All You Need"

Language: Python | 000

transformer-xl

Language: Python | License: Apache-2.0 | 000

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

License: MPL-2.0 | 000

vinvl-visualbackbone

Original VinVL visual backbone with simplified APIs to easily extract features, boxes, object detections, in a few lines of Python code.

Language: Python | 000

VQA-AttReg

This is an official PyTorch implementation of “Answer Questions with Right Image Regions: A Visual Attention Regularization Approach” (https://arxiv.org/abs/2102.01916).

Language: Python | License: MIT | 000

VQVAE-Pytorch

This repo implements VQVAE on MNIST as well as on a colored version of the MNIST images. It also implements a simple LSTM for generating sample numbers using the encoder outputs of the trained VQVAE

000