Krystana / Docker-Tweetbot-Pipeline

Docker compose pipeline twitter slackbot

SPICED ACADEMY DATA SCIENCE BOOTCAMP - week 6: "The Data Pipeline" - Sentiment analysis of tweets, with the results posted to Slack by a Slackbot.

The main goal of the project was to get a Docker Compose pipeline running that works with both an SQL and a NoSQL database server, and to practice writing an ETL job.

  1. twitter_streamer.py

The Twitter API streams filtered tweets, which are stored in a MongoDB database.

Twitter API --> https://developer.twitter.com/en/use-cases/listen-and-analyze

The code in twitter_streamer.py is based on the code in this repository: https://github.com/pawlodkowski/twitter-mongoDB, which was provided by SPICED ACADEMY and which I adjusted.
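
As a rough illustration of this step (not the code from twitter_streamer.py), the sketch below keeps only the fields the later steps need and hands matching tweets to a collection object. The keyword list, the field names `user_id` and `text`, and the client-side filter are assumptions made for the example; the real streamer may filter on the API side instead.

```python
# Hypothetical filter terms; the original project's keywords are not stated.
KEYWORDS = ("python", "docker")


def to_document(status: dict) -> dict:
    """Reduce a raw tweet payload to the fields the ETL step uses later."""
    return {
        "user_id": status["user"]["id"],
        "text": status["text"],
    }


def matches(status: dict, keywords=KEYWORDS) -> bool:
    """Return True if the tweet text contains any keyword (case-insensitive)."""
    text = status.get("text", "").lower()
    return any(keyword in text for keyword in keywords)


def store_filtered(statuses, collection, keywords=KEYWORDS) -> int:
    """Insert matching tweets and return how many were stored.

    `collection` only needs an insert_one(doc) method, which pymongo
    collections provide.
    """
    stored = 0
    for status in statuses:
        if matches(status, keywords):
            collection.insert_one(to_document(status))
            stored += 1
    return stored
```

With pymongo installed, `collection` could be something like `pymongo.MongoClient(...)["twitter"]["tweets"]`, where the database and collection names are again only placeholders.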

  2. etl.py

The text (and user ID) of each tweet is then read from MongoDB and loaded into a pandas dataframe. Sentiment analysis is performed on the text, and the compound sentiment score is added to the dataframe.

For the sentiment analysis I used the VADER sentiment analysis tool --> https://github.com/cjhutto/vaderSentiment/blob/master/README.rst#installation, which is described in: Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
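
A minimal sketch of the scoring step, assuming each tweet is a dict with `user_id` and `text` keys (hypothetical field names). The analyzer is passed in as an argument, so the function works with any object that offers VADER's `polarity_scores(text)` method; this is an illustration, not the code in etl.py.

```python
def add_compound_scores(tweets, analyzer):
    """Return one row per tweet with its VADER compound score attached.

    `analyzer` is expected to behave like vaderSentiment's
    SentimentIntensityAnalyzer: polarity_scores(text) returns a dict
    that includes a "compound" value between -1 and +1.
    """
    rows = []
    for tweet in tweets:
        scores = analyzer.polarity_scores(tweet["text"])
        rows.append(
            {
                "user_id": tweet["user_id"],
                "text": tweet["text"],
                "sentiment": scores["compound"],
            }
        )
    return rows


# With the packages installed, typical usage would look like:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   import pandas as pd
#   df = pd.DataFrame(add_compound_scores(tweets, SentimentIntensityAnalyzer()))
```

Passing the analyzer in keeps the function easy to test with a stub and avoids creating a new analyzer for every batch.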

The dataframe is then transferred to a Postgres database.
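
The load step can be sketched with Python's DB-API. To keep the example runnable without a database server, it uses the standard-library `sqlite3` module as a stand-in for Postgres; the table name `tweets` and its columns are assumptions. With a Postgres driver such as psycopg2, the placeholders would be `%s` instead of `?`, and the id column would typically be `SERIAL`.

```python
import sqlite3


def load_rows(conn, rows):
    """Create the target table if needed and append the scored tweets."""
    cur = conn.cursor()
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS tweets (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- SQLite syntax (stand-in)
            user_id INTEGER,
            text TEXT,
            sentiment REAL
        )
        """
    )
    cur.executemany(
        "INSERT INTO tweets (user_id, text, sentiment) VALUES (?, ?, ?)",
        [(row["user_id"], row["text"], row["sentiment"]) for row in rows],
    )
    conn.commit()


# Example with an in-memory database:
#   conn = sqlite3.connect(":memory:")
#   load_rows(conn, [{"user_id": 1, "text": "nice", "sentiment": 0.4}])
```

If the data is already in a pandas dataframe, `DataFrame.to_sql` together with an SQLAlchemy engine is a common alternative to writing the insert statement by hand.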

  3. slackbot.py

The text and sentiment values of the last row in the Postgres table are then queried and posted to Slack by the Slackbot via a webhook URL.
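
Posting to a Slack incoming webhook comes down to sending a JSON body with a `text` field. The sketch below uses only the standard library; the message layout and the function names are assumptions for illustration, and the webhook URL has to come from your own Slack app configuration.

```python
import json
import urllib.request


def build_payload(text: str, sentiment: float) -> bytes:
    """Format a tweet and its compound score as a Slack webhook JSON body."""
    message = f"{text}\nSentiment (compound): {sentiment:+.2f}"
    return json.dumps({"text": message}).encode("utf-8")


def post_to_slack(webhook_url: str, payload: bytes) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the payload builder separate from the network call makes it possible to check the message format without contacting Slack.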

License: MIT License

Languages: Python 95.9%, Dockerfile 4.1%