Sashank Santhanam's starred repositories
DEPRECATED-data-structures
A collection of powerful data structures
vokenization
PyTorch code for the EMNLP 2020 paper "Vokenization: Improving Language Understanding with Visual Supervision"
dm_memorytasks
A set of 13 diverse machine-learning tasks that require memory to solve.
haystack
LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
fastformers
FastFormers - highly efficient transformer models for NLU
lambda-networks
Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute
lambda-bert
A 🤗-style implementation of BERT using lambda layers instead of self-attention
stay-hungry-stay-focused
This repository hosts the authors' implementation of the paper "Stay Hungry, Stay Focused: Generating Informative and Specific Questions in Information-Seeking Conversations", published in Findings of EMNLP 2020.
wiki-reading
This repository contains the three WikiReading datasets as used and described in "WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia", Hewlett et al., ACL 2016 (the English WikiReading dataset) and "Byte-level Machine Reading across Morphologically Varied Languages", Kenter et al., AAAI-18 (the Turkish and Russian datasets).
SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in combination with self-training and knowledge-distillation, or for retrieving paraphrases.
electra_pytorch
Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!)
DeepPavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
task_oriented_dialogue_as_dataflow_synthesis
Code to reproduce experiments in the paper "Task-Oriented Dialogue as Dataflow Synthesis" (TACL 2020).
ContrastiveLearning4Dialogue
The codebase for "Group-wise Contrastive Learning for Neural Dialogue Generation" (Cai et al., Findings of EMNLP 2020)