jlmartin100

COVID-19-Arabic-Tweets-Dataset

The repository contains a collection of Arabic tweet IDs associated with the novel coronavirus COVID-19. The dataset covers tweet IDs from 2020-01-01 to 2020-04-30. The Twitter search API was used to gather real-time tweets that contained specific keywords in the Arabic language. The dataset contains almost four and a half million Arabic tweets.

Language: Jupyter Notebook | License: NOASSERTION

WIDH_2020_Arabic_Text_Analysis

Material for the Text Analysis of Arabic course taught at the NYU Abu Dhabi Winter Institute in Digital Humanities 2020.

Language: Jupyter Notebook

arabic-stop-words

Largest list of Arabic stop words on Github. أكبر قائمة لمستبعدات الفهرسة العربية على جيت هاب

License: MIT

dldiy-practicals

Slides, Jupyter Notebooks and scripts for the Deep Learning: Do-It-Yourself! lectures at ENS

Language: Jupyter Notebook

Topic-Modeling-of-Tweets-Related-to-NFL-and-National-Anthem

My fourth project at Metis, which uses topic modeling to detect structure in tweets related to the NFL and the national anthem.

Language: Jupyter Notebook

arabic_word_embeddings_CNN

Word Embeddings and Convolutional Neural Network for Arabic Sentiment Classification (Coling 2016)

Language: Python

twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

Language: Python | License: MIT

IntroNLP

Language: Jupyter Notebook

hULMonA

hULMonA (حلمنا): tHe first Universal Language MOdel iN Arabic

Language: Jupyter Notebook

Arabic-Image-Captioning

Generate Arabic captions for images using Deep Learning

Language: Jupyter Notebook

Arabic-Image-Captioning

Generate Arabic captions for images using Deep Learning

Language: Jupyter Notebook

Arabic-Empathetic-Chatbot

Seq2Seq-based open domain empathetic conversational model for Arabic: Dataset & Model

Language: Jupyter Notebook

hULMonA

hULMonA (حلمنا): tHe first Universal Language MOdel iN Arabic

Language: Jupyter Notebook

arabert

Pre-trained Transformers for Arabic Language Understanding and Generation (Arabic BERT, Arabic GPT2, Arabic ELECTRA)

Language: Python

Arabic-named-entity-recognition

Arabic named entity recognition using the AnerCorp corpus (location, organisation, person, miscellaneous)

Language: Jupyter Notebook

document_cluster

A guide to document clustering in Python

Language: Jupyter Notebook

Text-Scraping-Document-Clustering-Topic-modeling

The objective of this project is to scrape a corpus of news articles from a set of web pages, pre-process the corpus, and then to apply unsupervised clustering algorithms to explore and summarise the contents of the corpus.

Part 1. Text Data Scraping

This part of the project should be implemented as a Python script.

1. Identify the URLs for all news articles listed on the website: http://mlg.ucd.ie/modules/COMP41680/news/index.html
2. Retrieve all web pages corresponding to these article URLs.
3. From the web pages, extract the main body text containing the content of each news article. Save the body of each article as plain text.

Part 2. Corpus Exploration

Tasks to be completed in your IPython notebook:

1. Load the text corpus generated in Part 1. Apply any appropriate pre-processing steps and construct a document-term matrix representation of the corpus.
2. Summarise the overall corpus by identifying the most characteristic terms and phrases in the corpus.
3. Apply two alternative clustering algorithms of your choice to the document-term matrix to produce clusters of related documents. This might require applying each algorithm several times with different parameter values.
4. For each clustering generated in Step 3, summarise the contents of the clusters. Based on your summary, suggest a topic/theme for each cluster.

Language: Jupyter Notebook
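
The first two corpus-exploration tasks in the description above (building a document-term matrix, then finding characteristic terms) might be sketched as follows. This is a minimal standard-library illustration, not the repository's code. The tokenizer, the function names `document_term_matrix` and `top_tfidf_terms`, and the choice of a summed TF-IDF score are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(docs):
    """Return (vocabulary, rows), where rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


def top_tfidf_terms(vocabulary, rows, k=3):
    """Rank terms by corpus-wide term frequency times inverse document frequency."""
    n_docs = len(rows)
    scores = {}
    for j, term in enumerate(vocabulary):
        doc_freq = sum(1 for row in rows if row[j] > 0)
        idf = math.log(n_docs / doc_freq)
        scores[term] = sum(row[j] for row in rows) * idf
    # Sort by descending score, breaking ties alphabetically for a stable result.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    corpus = ["the cat sat", "the dog sat", "the cat ran"]
    vocab, matrix = document_term_matrix(corpus)
    print(vocab)
    print(top_tfidf_terms(vocab, matrix))
```

A term that appears in every document (here "the") receives a score of zero, which is why a weighting such as TF-IDF is often preferred over raw counts when summarising a corpus. The resulting matrix could then be passed to whichever clustering algorithms are chosen for the later tasks.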

Jessica Martin's starred repositories

corex_topic

topic_modelling_demo

tqdm

nlp-resources

EPIC

twitter-protest-analysis

metis-project4

Twitter_NLP

tweet-clustering

COVID-19-Arabic-Tweets-Dataset

WIDH_2020_Arabic_Text_Analysis

arabic-stop-words

dldiy-practicals

Topic-Modeling-of-Tweets-Related-to-NFL-and-National-Anthem

arabic_word_embeddings_CNN

twint

IntroNLP

hULMonA

Arabic-Image-Captioning

Arabic-Image-Captioning

Arabic-Empathetic-Chatbot

hULMonA

arabert

Arabic-named-entity-recognition

document_cluster

Text-Scraping-Document-Clustering-Topic-modeling

04_biden_election_tweets_NLP

nlp-in-python-tutorial

gt-nlp-class

open-data-registry