There are 3 repositories under the african-languages topic.
A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.
Yorùbá language training text for NLP, ASR and TTS tasks
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/
SemEval2024-task 11: Bridging the Gap in Text-Based Emotion Detection
Masakhane Web is a translation web application solely for African languages.
This is a repository for NaijaSenti, a Lacuna-funded project for the development of a sentiment corpus for four Nigerian languages: Igbo, Hausa, Yoruba and Pidgin.
AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
Automatic Diacritic Restoration of Yorùbá language Text
Cross-lingual Language Model (XLM) pretraining and Model-Agnostic Meta-Learning (MAML) for fast adaptation of deep networks
Ìrànlọ́wọ́ is a utility library for analysis & (pre)processing of Yorùbá text → https://pypi.org/project/iranlowo
stoplists for African languages generated from the ASP corpus
Introduction to "Tencent’s Multilingual Machine Translation System for WMT22 Large-Scale African Languages".
Website that hosts the African Voices projects. Users can download datasets and synthesizers, and synthesize speech in African languages
Sankofa Display is a typeface that draws inspiration from African art styles, with a focus on straight-line geometric designs.
The dataset contains editions of the South African government magazine Vuk'uzenzele. Data was scraped from PDFs that have been placed in the data/raw folder. The PDFs were obtained from the Vuk'uzenzele website.
AfricanWordNet: Implementation of WordNets for African languages. Citation paper "Practical Approach on Implementation of WordNets for South African Languages" https://www.aclweb.org/anthology/2021.gwc-1.3.pdf
Adinkra Symbols API - meanings of adinkra symbols, symbol images and synopsis around them
Code + data for the EMNLP'20 publication "Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages"
A Roberta-based language model specially designed for Setswana, using the new PuoData dataset.
URH-DIGITS is a connected digits speech recognition task
This repo contains the LUO corpus for Named Entity Recognition. The text comes from the news domain and was scraped from Radio Ramogi.
Fur language (poór'íŋ belé) [iso 639-3: fvr] resources, and computer aids.
Plain Swahili dataset, sourced from public repositories.
This is an open-source mobile application that augments the wazobia Automatic Voice Recognition System - AVRS. It is the interface between our voice donors and the wazobia core platform
[morph] Scrape business stories to be used on TaxClock KE accessible at https://taxclock.codeforkenya.org/
Lan_Tran is an app written in Kivy to make translations between Lantuosir and English easy! Lantuosir is a constructed language that I created based on Latin (and its variants) and Bantu languages. It is developed as a fantasy lingua franca for the African Diaspora. The main influences are Spanish, English, and Yoruba.
Auto-generated stopwords for South African Bantu Languages
Open source project to help people learn African languages
A browser extension built with the GhanaNLP platform for language translation.
A demo Flask app to showcase and provide information about the diverse languages spoken across the African continent.
Ntshob (Language in Mə̀dʉ̂mbɑ̀). The Number one African languages dictionary.
Final project for FRS159 at Princeton