iWAN Research Group (iwan-rg)


User data from GitHub: https://github.com/iwan-rg

Company: King Saud University

Location: Riyadh, Saudi Arabia

Home Page: http://iwan.ksu.edu.sa

GitHub: @iwan-rg

Twitter: @iwan_rg

iWAN Research Group's repositories

ArabicSurvey

A repository for survey and review papers in Arabic Natural Language Processing (ANLP).

Arabic-Topic-Modeling

BERT for Arabic Topic Modeling: An Experimental Study on BERTopic Technique

Language: Jupyter Notebook, Stargazers: 27, Issues: 2, Issues: 0

Saudi-Dialect-Irony-Dataset

The Saudi irony dataset was collected using the Twitter API and consists of 19,810 tweets, of which 8,089 are labeled as ironic.

License: CC0-1.0, Stargazers: 7, Issues: 1, Issues: 0

Arabic-Paraphrased-Dataset

The Arabic paraphrased parallel dataset is sourced from diverse origins and expanded through data augmentation. Its intended uses in NLP include education, search engines, content creation, social media and domain-specific applications, and language technology more broadly.

ArabicLLMs

This repository contains resources from the paper "A Survey of Large Language Models for Arabic Language and its Dialects".

Saudi-Bank-Sentiment-Dataset

This dataset contains customers' sentiments on Twitter toward four Saudi banks. Of a total of about 12k tweets, 8,669 are labeled "Negative", 2,143 are labeled "Positive", and 1,236 are labeled "Neutral".

License: GPL-3.0, Stargazers: 4, Issues: 1, Issues: 0

Arabic-Humor

The Arabic humor dataset was collected using Twint and Sketch Engine and consists of 10k tweets.

License: CC0-1.0, Stargazers: 2, Issues: 1, Issues: 0
Language: Jupyter Notebook, License: CC0-1.0, Stargazers: 2, Issues: 1, Issues: 0

ARC-WMI

Baseline results toward constructing the readability corpus ARC-WMI, a new Arabic collection of written medicine information annotated with readability levels.

License: NOASSERTION, Stargazers: 2, Issues: 1, Issues: 0

NLP-Patents

A repository for patents in the field of Natural Language Processing (NLP).

License: GPL-3.0, Stargazers: 2, Issues: 1, Issues: 0

CLEANANERCorp

CLEANANERCorp is a corrected version of the classic Arabic NER benchmark ANERcorp, with updated and more consistent NER labels.

License: GPL-3.0, Stargazers: 1, Issues: 1, Issues: 0

OpenTriviaQA

A Creative Commons dataset of trivia questions and answers.

Language: Ruby, License: CC-BY-SA-4.0, Stargazers: 1, Issues: 1, Issues: 0
Language: C#, Stargazers: 0, Issues: 1, Issues: 0
Language: HTML, Stargazers: 0, Issues: 1, Issues: 0
Language: HTML, License: MIT, Stargazers: 0, Issues: 1, Issues: 0
Language: Java, Stargazers: 0, Issues: 2, Issues: 0
Language: Jupyter Notebook, Stargazers: 0, Issues: 0, Issues: 0

Saudi_Privacy_policy

Saudi Arabic Privacy Policy Dataset

Stargazers: 0, Issues: 2, Issues: 0
Language: TeX, License: MIT, Stargazers: 0, Issues: 1, Issues: 0