jtang-asapp's starred repositories

annotated_deep_learning_paper_implementations

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Python | License: MIT | Stargazers: 55516 | Issues: 456 | Issues: 132

alt-tab-macos

Windows alt-tab on macOS

Language: Swift | License: GPL-3.0 | Stargazers: 10857 | Issues: 43 | Issues: 3420
Language: Python | License: Apache-2.0 | Stargazers: 1248 | Issues: 22 | Issues: 48

Multimodal-Transformer

[ACL'19] [PyTorch] Multimodal Transformer

Language: Python | License: MIT | Stargazers: 812 | Issues: 15 | Issues: 49

neuspell

NeuSpell: A Neural Spelling Correction Toolkit

Language: Python | License: MIT | Stargazers: 667 | Issues: 10 | Issues: 74

voxpopuli

A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation

Language: Python | License: NOASSERTION | Stargazers: 509 | Issues: 18 | Issues: 22

Graph2Seq

Graph2Seq is a simple codebase for building a graph encoder and sequence decoder for NLP and other AI/ML/DL tasks.

Language: Python | License: Apache-2.0 | Stargazers: 238 | Issues: 15 | Issues: 10

PLOME

Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

Language: Python | License: Apache-2.0 | Stargazers: 228 | Issues: 3 | Issues: 31

bidirectional-cross-attention

A simple cross attention that updates both the source and target in one step

Language: Python | License: MIT | Stargazers: 147 | Issues: 4 | Issues: 2

speech-datasets

Various speech datasets made available to the public

Language: Jupyter Notebook | Stargazers: 98 | Issues: 15 | Issues: 12

confidence-aware-learning

Confidence-Aware Learning for Deep Neural Networks (ICML2020)

Language: Python | License: MIT | Stargazers: 72 | Issues: 6 | Issues: 4

CRASpell

The code for our ACL2022 findings paper: CRASpell: A Contextual Typo Robust Approach with Copy Mechanism to Improve Chinese Spelling Correction

Language: Python | License: MIT | Stargazers: 72 | Issues: 2 | Issues: 5

awesome-asr-contextualization

A curated list of awesome papers on contextualizing E2E ASR outputs

Language: Jupyter Notebook | License: MIT | Stargazers: 63 | Issues: 2 | Issues: 9

sentence-doctor

Many natural language processing tasks rely on sentence boundary detection (SBD). Although excellent libraries like spaCy provide state-of-the-art SBD, they often depend on text extractors (e.g., PDF text extractors or OCR). The quality of these extractors greatly influences the quality of SBD libraries and, as a consequence, the performance of downstream models. To help address this problem, we fine-tuned a T5 model from the Hugging Face Hub that attempts to reconstruct "broken sentences".

gridspace-stanford-harper-valley

The Gridspace-Stanford Harper Valley speech dataset. Created in support of CS224S.

Language: Python | License: CC-BY-4.0 | Stargazers: 41 | Issues: 10 | Issues: 2

Awesome-Failure-Detection

A list of papers that study out-of-distribution (OOD) detection and misclassification detection (MisD)

emoASR

End-to-end MOdeling of ASR (Automatic Speech Recognition)

OpenMix

PyTorch implementation of our CVPR2023 paper "OpenMix: Exploring Out-of-Distribution samples for Misclassification Detection"

Language: Python | License: MIT | Stargazers: 23 | Issues: 2 | Issues: 6

fbai-speech

Repo for the FB AI Speech team.

Language: Python | License: MIT | Stargazers: 22 | Issues: 6 | Issues: 1

aligned-cross-entropy

Test implementation of "Aligned Cross Entropy for Non-Autoregressive Machine Translation" https://arxiv.org/abs/2004.01655

Language: Jupyter Notebook | License: MIT | Stargazers: 21 | Issues: 4 | Issues: 1

contextual-attention-nlm

Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021.

Language: Python | License: NOASSERTION | Stargazers: 14 | Issues: 2 | Issues: 0

SoundsLike

A Python package for finding words that sound like other words. Useful for entity resolution and poetry, among other things.

Language: Python | License: Apache-2.0 | Stargazers: 13 | Issues: 1 | Issues: 0

Contextual-Biasing-Dataset

Open-source Mandarin biased-word dataset

Language: Python | License: MIT | Stargazers: 5 | Issues: 1 | Issues: 0

NeMo

NeMo: a toolkit for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

tapir

Code for "TAPIR: Learning Adaptive Revision for Incremental Natural Language Understanding with a Two-Pass Model", Findings of ACL 2023

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0