Rifki Afina Putri's starred repositories
RAGatouille
Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.
Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
warta-scrap
Indonesian news index crawler covering 10 online media outlets
nlp-phd-global-equality
A repo of open resources and information to help people succeed in a CS PhD and a career in AI/NLP
expand-via-lexicon-based-adaptation
Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"
kbbi-python
A Python module that fetches a page of a word/phrase from the Online Indonesian Dictionary (https://kbbi.kemdikbud.go.id).
indonesian-nlp
A curated list of research papers and resources on Indonesian languages
id-multi-label-hate-speech-and-abusive-language-detection
The dataset for multi-label hate speech and abusive language detection in Indonesian Twitter
deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
malaysian-dataset
We gather Malaysian datasets! https://malaysian-dataset.readthedocs.io/
question-generator
An NLP research project mainly exploring sequence-to-sequence (s2s) architectures to build an Indonesian Automatic Question Generator (AQG). The associated paper is referenced in the README.
qa-dataset-converter
Code from the paper "What do Models Learn from Question Answering Datasets?" (EMNLP 2020)
knockknock
🚪✊ Knock Knock: Get notified when your training ends with only two additional lines of code
NL-Augmenter
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations
KoBERT-NER
NER Task with KoBERT (with Naver NLP Challenge dataset)
pytorch-balanced-sampler
PyTorch implementations of `BatchSampler` that under- or over-sample according to a chosen parameter alpha, in order to create a balanced training distribution.
py-googletrans
(unofficial) Googletrans: Free and Unlimited Google translate API for Python. Translates totally free of charge.
spacy-clausie
Implementation of the ClausIE information extraction system for Python + spaCy
Question-Generation-Paper-List
A summary of must-read papers for Neural Question Generation (NQG)