admarko / EmojiTweets

This project scrapes and geolocates tens of thousands of tweets to study linguistic data on the internet, specifically emoji usage.


Contains:
emoji_tweets.py -Run this.
  -Usage: python3 emoji_tweets.py [-h, --help] [-v, --version] [-n, --numtweets] [-q, --query] [-usa]
   where [-h] prints help, [-v] prints the version number, [-n] is the max number of tweets to scrape (100 by default), [-q] is the search query to test, and [-usa] restricts the search to the US rather than searching internationally
  -Sample usage: python3 emoji_tweets.py -n 10000 -q olympics
config.py -Stores API keys (not committed to GitHub)
reports -Folder containing all output files
  -Files named: "report_[-n]tweets_on[-q]"
README -this file
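
The flags listed above could be wired up roughly as follows. This is only a sketch using the standard argparse module, not the actual parser in emoji_tweets.py: the function names build_parser and report_name, the help strings, and the version string "1.0" are assumptions made for illustration. Only the flag names, the default of 100 tweets, and the report naming pattern come from this README.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the flags documented in this README."""
    parser = argparse.ArgumentParser(
        prog="emoji_tweets.py",
        description="Scrape tweets and report on emoji usage.",
    )
    # The version string is a placeholder; the real version is not stated here.
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s 1.0")
    parser.add_argument("-n", "--numtweets", type=int, default=100,
                        help="max number of tweets to scrape (default: 100)")
    parser.add_argument("-q", "--query", help="search query to test")
    # Single-dash, multi-letter flag, spelled as in the usage line.
    parser.add_argument("-usa", dest="usa", action="store_true",
                        help="search US tweets only")
    return parser


def report_name(numtweets, query):
    """Fill in the documented pattern report_[-n]tweets_on[-q]."""
    return f"report_{numtweets}tweets_on{query}"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(report_name(args.numtweets, args.query))
```

Under these assumptions, the sample usage above (-n 10000 -q olympics) would map to a report named report_10000tweets_onolympics.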


Using the user-supplied search query, this script searches through up to [-n] tweets and first filters out any statuses that do not contain an emoji. Depending on whether the user wants to look at tweets from the US or from around the globe, the script then attempts to geolocate each tweet (if possible) and sorts it into one of a few dictionaries created for the final report. If the script crashes for any reason (connection error, API error, etc.), the final report is still published for every tweet looked at so far. The config.py file holds my API keys and is kept out of GitHub with a .gitignore file.
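
The filter, sort, and report-on-crash flow described above can be sketched as follows. This is a minimal illustration, not the code in emoji_tweets.py: the names EMOJI_RE, extract_emojis, and build_report are hypothetical, tweets are modeled as plain (text, location) pairs rather than real API objects, and the emoji pattern is a rough approximation that covers only a few common Unicode blocks.

```python
import re
from collections import Counter, defaultdict

# Rough approximation of emoji code points; not the full Unicode emoji set.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"   # regional indicator symbols (flags)
    "\u2600-\u27BF]"          # miscellaneous symbols and dingbats
)


def extract_emojis(text):
    """Return every character in text that matches the approximate emoji set."""
    return EMOJI_RE.findall(text)


def build_report(tweets):
    """Count emojis per location for an iterable of (text, location) pairs.

    location may be None when a tweet could not be geolocated. If the
    iterable raises partway through (for example a connection error), the
    counts gathered so far are still returned.
    """
    by_location = defaultdict(Counter)
    seen = 0
    try:
        for text, location in tweets:
            seen += 1
            emojis = extract_emojis(text)
            if not emojis:
                continue  # drop statuses without any emoji
            by_location[location or "unknown"].update(emojis)
    except Exception as err:  # connection error, API error, etc.
        print(f"Stopped early after {seen} tweets: {err}")
    return seen, {loc: dict(counts) for loc, counts in by_location.items()}
```

Catching the exception around the whole loop is what lets a partial report survive a crash; a real run would write the returned dictionaries to a file in the reports folder instead of just returning them.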

The key finding from this project was that emojis tend to be used to share feelings of happiness, laughter, and excitement. I originally hypothesized that countries (or states) opposed to the topic given in the search query would produce more negative emojis. It turned out that the origin location of a tweet had less impact on the sentiment expressed through its emojis than the actual content of the search query. Or, more simply stated, emoji usage depends more on the topic of the status than on its location.

Also included in this repository is the final paper I wrote for the Linguistics class this project was originally for, along with the LaTeX source for that paper.
