Sumanth Doddapaneni (sumanthd17)

Company: IIT Madras, AI4Bharat

Location: Hyderabad

Home Page: https://sumanthd17.github.io

Twitter: @sumanthd17


Organizations
AI4Bharat

Sumanth Doddapaneni's starred repositories

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 29755 | Issues: 425 | Issues: 4161

NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Language: Python | License: MIT | Stargazers: 22415 | Issues: 1271 | Issues: 100

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | License: Apache-2.0 | Stargazers: 18705 | Issues: 275 | Issues: 2813

RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Language: Python | License: Apache-2.0 | Stargazers: 4431 | Issues: 76 | Issues: 87

Language: Jupyter Notebook | License: MIT | Stargazers: 4217 | Issues: 71 | Issues: 17

EasyNMT

Easy to use, state-of-the-art Neural Machine Translation for 100+ languages

Language: Python | License: Apache-2.0 | Stargazers: 1098 | Issues: 19 | Issues: 90

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python | License: MIT | Stargazers: 1091 | Issues: 24 | Issues: 76

nlp-phd-global-equality

A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

fastformers

FastFormers - highly efficient transformer models for NLU

Language: Python | License: NOASSERTION | Stargazers: 696 | Issues: 19 | Issues: 18

flores

Facebook Low Resource (FLoRes) MT Benchmark

Language: Python | License: NOASSERTION | Stargazers: 671 | Issues: 67 | Issues: 44

aclpubcheck

Tools for checking ACL paper submissions

Language: Python | License: MIT | Stargazers: 550 | Issues: 5 | Issues: 45

indicnlp_catalog

A collaborative catalog of NLP resources for Indic languages

llm-seminar

Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022)

Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT

Language: Python | License: MIT | Stargazers: 272 | Issues: 18 | Issues: 25

terashuf

terashuf shuffles multi-terabyte text files using limited memory

Language: C++ | License: MIT | Stargazers: 198 | Issues: 5 | Issues: 9

IndicTrans2

Translation models for 22 scheduled languages of India

Language: Python | License: MIT | Stargazers: 193 | Issues: 9 | Issues: 76

impact

ML has an impact on the climate, but not all models are born equal. Compute your model's emissions with our calculator and add the results to your paper with our generated LaTeX template.

Language: HTML | License: MIT | Stargazers: 187 | Issues: 6 | Issues: 16

yanmtt

Yet Another Neural Machine Translation Toolkit

Language: Python | License: MIT | Stargazers: 166 | Issues: 6 | Issues: 58

indicTrans

indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2

Language: Jupyter Notebook | License: MIT | Stargazers: 111 | Issues: 10 | Issues: 45

IndicWav2Vec

Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2

Language: Jupyter Notebook | License: MIT | Stargazers: 73 | Issues: 10 | Issues: 44

Just-Another-Research-CV

📝 A not-so-fancy but still a pretty research CV 🎆 🎉

Litmus

AI Assistant for Building Reliable, High-performing and Fair Multilingual NLP Systems

Language: Python | License: MIT | Stargazers: 46 | Issues: 4 | Issues: 0

bifixer

Tool to fix bitexts and tag near-duplicates for removal

Language: Python | License: GPL-3.0 | Stargazers: 27 | Issues: 7 | Issues: 11

Lightweight-Low-Resource-NMT

Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.

Language: Python | License: MIT | Stargazers: 17 | Issues: 3 | Issues: 0

MMLMCalibration

Code for EMNLP 2022 Paper: On the Calibration of Massively Multilingual Language Models

Language: Python | License: MIT | Stargazers: 14 | Issues: 3 | Issues: 0

webcorpus

Generate large textual corpora for almost any language by crawling the web

Language: Python | License: NOASSERTION | Stargazers: 10 | Issues: 1 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 2 | Issues: 3

indicnlp.ai4bharat.org

Archived old website for AI4Bhārat Indic-NLP