There are 6 repositories under the urdu-nlp topic.
UTRNet: High-Resolution Urdu Text Recognition In Printed Documents (ICDAR'23)
Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources
Repository dedicated to a collection of resources and helper material for Urdu language processing tasks
Pashto Natural Language Processing Toolkit
Fake news detection using Naïve Bayes in Python, with a confusion matrix calculated using sklearn.
This is an Urdu word spell checker using the Noisy Channel Model, implemented in Python 3.
Generating Urdu poetry using spaCy in Python. Poetry is generated using unigrams, bigrams, and trigrams, as well as a bidirectional bigram model and a backward bigram model.
A simple Python-based Urdu stemmer that tries to find a word's stem by matching it against a list of affixes.
A sequence-to-sequence model implementation for Urdu natural language processing
High-quality synthetic text data generation for Urdu Text Recognition
A list of most frequently used Roman Urdu words with different spellings and usages to help make Roman Urdu text processing easier.
The first Urdu search engine crawler for the web.
We present a new dataset for question answering models. The dataset contains 27 different Urdu paragraphs taken from available sources such as Urdu Wikipedia, YouTube, and news articles. Each selected paragraph has roughly 3 to 7 questions, and each question has 1 to 3 possible answers. The data is mostly Urdu, with some English words.
AI-based train reservation system that users chat with in Urdu.
List of the most used words (stop words) in Urdu. Approximately 300 words.
UrduFeel: Deep Learning Sentiment Analysis for Emotional Insights
This project contains Urdu characters and some preprocessing functions
This repository contains code for preprocessing Urdu text for use in NLP applications.
Urdu Spell & Grammar Checker: A Python app for accurate spell-checking and grammar correction in Urdu text.
Dataset generation for Urdu OCR.
This project is a grapheme-to-phoneme (G2P) converter for the Urdu language. It can generate lexicons for Urdu words using a deep learning model.
A collection of natural language processing tasks.
This repository contains a Python script for calculating the Longest Common Subsequence (LCS) between tokenized Urdu sentences.
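For readers unfamiliar with the technique named in the last entry, the following is a minimal sketch of a token-level longest common subsequence, not the code from that repository. The function name `lcs_tokens`, the sample sentences, and the use of whitespace splitting as the tokenizer are all assumptions made for illustration.

```python
def lcs_tokens(a, b):
    """Return one longest common subsequence of two token lists.

    Hypothetical helper for illustration; uses the standard
    dynamic-programming table followed by a backtracking pass.
    """
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from the bottom-right corner to recover the tokens.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


# Assumption: plain whitespace splitting stands in for a real Urdu tokenizer.
s1 = "میں آج اسکول جا رہا ہوں".split()
s2 = "میں کل اسکول جا رہا تھا".split()
common = lcs_tokens(s1, s2)
print(len(common), common)
```

The table costs O(m·n) time and memory for sentences of m and n tokens, which is fine at sentence length. A real pipeline would likely normalize the text (for example, removing diacritics) before comparing tokens.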