Alham Fikri Aji's repositories
summerschool-KD-PEFT
Mexican NLP 2024 Summer School Tutorial on Knowledge Distillation and Parameter-Efficient Fine-Tuning
acl-anthology
Data and software for building the ACL Anthology.
Semantic_Relatedness_SemEval2024
SemEval 2024 Task 1: Textual Semantic Relatedness
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
DataLab
The unified platform for data-related resources.
xmtf
Crosslingual Generalization through Multitask Finetuning
karonese
Karonese dataset
nusa-catalogue
Dataset Catalogue Homepage for Indonesian Languages
promptsource
Toolkit for creating, sharing and using natural language prompts.
ARBML
Implementation of many Arabic NLP and ML projects, providing a real-time experience through interfaces such as web, command line, and notebooks.
evaluation-robustness-consistency
Tools for evaluating model robustness and consistency
data_tooling
Tools for managing datasets for governance and training.
variant-lite
variant lite - A C++17-like variant, a type-safe union for C++98, C++11 and later in a single-file header-only library
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
paracotta-paraphrase
Synthetic multilingual paraphrase data
id-nlp-resource
A list of Indonesian NLP resources.
mosesdecoder
Moses, the machine translation system
indonesian-mt-data
Benchmarking Multidomain English-Indonesian Machine Translation
indolem
IndoLEM is a comprehensive Indonesian NLU benchmark comprising three pillars of NLP tasks: morpho-syntax, semantics, and discourse. Presented at COLING 2020.
stif-indonesia
Implementation of "Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation". TBD
Marian-transfer
Transfer learning experiment demo with Marian
rosie
Base content for AIML 2.0 chatbot
intgemm
int8_t and int16_t matrix multiply based on https://arxiv.org/abs/1705.01991