CAMeL Lab's repositories
camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Arabic_ALA-LC_Romanization
Romanizing Arabic bibliographic records in the ALA-LC standard.
arabic-gec
Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.
arabic_error_type_annotation
The Arabic Error Type Annotation tool aims to annotate Arabic error types following the ALC tagset.
arafix_ocr
A tool for improving the output of generic Arabic OCR systems using an n-gram-based post-correction approach.
camel_morph
Camel Morph’s goal is to build large open-source morphological models for Arabic and its dialects across many genres and domains.
gender-reinflection
Code, models, and data for "Gender-Aware Reinflection using Linguistically Enhanced Neural Models". COLING 2020, GeBNLP.
camel-tools-data
Repo containing data packages and catalogues used by CAMeL Tools.
ced_word_alignment
A word aligner based on character edit distance.
camel-kenlm
KenLM: Faster and Smaller Language Model Queries
CAMeLBERT_morphosyntactic_tagger
Code, models, and data for "Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects". Findings of ACL, 2022.
gender-rewriting
Code, models, and data for "User-Centric Gender Rewriting". NAACL 2022.
Arabic-ATB-closed-class-list
A Modern Standard Arabic Closed-Class Word List
Camel_Arabic_Frequency_Lists
The repository for the CAMeL Arabic Frequency Lists dataset
gender-rewriting-shared-task
Evaluation code and data for the gender rewriting shared task
conllx_evaluation
Evaluates the accuracy of CoNLL-X annotations produced by annotators
palmyra_server
A server that adds extra functionality to Palmyra
camel_tools_updates
This page will have the latest updates on the different components of CAMeL Tools.
emad
Unify Arabic tagsets
wild_diacritics
Repository for the Wild Diacritics paper.