Tracy Shen's starred repositories
edm_main_algorithms
Fit EDM main algorithms on Lalilo data
PlotNeuralNet
LaTeX code for making neural network diagrams
Tools-to-Design-or-Visualize-Architecture-of-Neural-Network
Tools to Design or Visualize Architecture of Neural Network
mlm-scoring
Python library & examples for Masked Language Model Scoring (ACL 2020)
vokenization
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
TextAttack
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
kaggle-quest
TPU-Ready TF 2.1 Solution to Google QUEST Q&A Labeling using Siamese RoBERTa Encoder Model
ML-and-Data-Analysis
Collection of Jupyter notebooks for various data analysis tasks.
dont-stop-pretraining
Code associated with the Don't Stop Pretraining ACL 2020 paper
aws-support-tools
Tools and sample code provided by AWS Premium Support.
coursera-dl
Script for downloading Coursera.org videos and naming them.
TextGAN-PyTorch
TextGAN is a PyTorch framework for Generative Adversarial Networks (GANs) based text generation models.
clinicalBERT
Repository for Publicly Available Clinical BERT Embeddings
bert-vocab-builder
Builds a WordPiece (subword) vocabulary compatible with Google Research's BERT
Albert_Finetune_with_Pretrain_on_Custom_Corpus
1. Pretrain ALBERT on a custom corpus. 2. Fine-tune the pretrained ALBERT model on a downstream task.
AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
tensor2tensor
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.