Sumanth Doddapaneni's starred repositories
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
nlp-phd-global-equality
A repo of open resources & information to help people succeed in a CS PhD & a career in AI / NLP
fastformers
FastFormers - highly efficient transformer models for NLU
aclpubcheck
Tools for checking ACL paper submissions
indicnlp_catalog
A collaborative catalog of NLP resources for Indic languages
llm-seminar
Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022)
Indic-BERT-v1
Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For the latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
IndicTrans2
Translation models for 22 scheduled languages of India
indicTrans
indicTranslate v1 - Machine Translation for 11 Indic languages. For the latest v2, check: https://github.com/AI4Bharat/IndicTrans2
IndicWav2Vec
Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2
Just-Another-Research-CV
📝 A not-so-fancy but still pretty research CV 🎆 🎉
Lightweight-Low-Resource-NMT
Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.
MMLMCalibration
Code for the EMNLP 2022 paper "On the Calibration of Massively Multilingual Language Models"
masakhane-reading-group
Agile reading group that works
indicnlp.ai4bharat.org
Archived old website for AI4Bhārat Indic-NLP