to-schi / tweet-analyzer-pipeline

Dockerized data pipeline to analyze tweet sentiment



[Pipeline diagram]

1. Clone repository with:

git clone https://github.com/to-schi/tweet-analyzer-pipeline.git

2. Enter your credentials in "twitter_cred.py":

Get your API credentials from developer.twitter.com.

nano ./tweet-analyzer-pipeline/tweet_collector/src/twitter_cred.py
# Enter your credentials and save file with ctrl-x
# Limit is set to 200 tweets
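For reference, the credentials file holds nothing but the four API strings. The variable names below are an illustrative sketch; keep whatever names the repository's `twitter_cred.py` actually defines:

```python
# twitter_cred.py -- illustrative sketch; these variable names are assumptions,
# use the ones already present in the repository's file.
API_KEY = "YOUR_API_KEY"
API_SECRET = "YOUR_API_SECRET"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"
ACCESS_TOKEN_SECRET = "YOUR_ACCESS_TOKEN_SECRET"
```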

3. Enter your preferred tweet-query:

nano ./tweet-analyzer-pipeline/tweet_collector/src/tweet_collector.py
# Insert query-text in line 9: "query = '#SOME_PHRASE'" and save file with ctrl-x
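As an example, line 9 could be set to a hashtag query; `-filter:retweets` is Twitter's standard search operator for excluding retweets (the hashtag itself is just a placeholder, not the repo's default):

```python
# tweet_collector.py, line 9 -- example query; "-filter:retweets" excludes
# retweets in Twitter's standard search syntax.
query = "#python -filter:retweets"
```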

4. Insert the webhook URL for your Slack channel:

nano ./tweet-analyzer-pipeline/tweet_slack/src/conf.py
# Insert webhook-url and save file with ctrl-x
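The webhook URL is what the tweet_slack container posts messages to. Slack incoming webhooks only require an HTTP POST with a small JSON body, so a minimal stdlib sketch looks like this (function names here are made up for illustration and are not the repo's actual code):

```python
import json
import urllib.request


def build_slack_payload(text: str) -> bytes:
    """Build the JSON body that Slack incoming webhooks expect: {"text": "..."}."""
    return json.dumps({"text": text}).encode("utf-8")


def post_to_slack(webhook_url: str, text: str) -> int:
    """POST a plain-text message to a Slack incoming webhook.

    Returns the HTTP status code (200 on success).
    """
    request = urllib.request.Request(
        webhook_url,
        data=build_slack_payload(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```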

5. Change directory and start docker:

cd tweet-analyzer-pipeline
sudo docker-compose up
# Add '-d' for detached (background) mode

Docker will build 5 containers and start the pipeline automatically: tweet_mongodb, tweet_collector, tweet_postgres, tweet_etl, and tweet_slack.
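The compose file wires those five services together roughly as sketched below. This is only an outline using the service names listed above; build contexts, images, and settings are assumptions, and the repository's own docker-compose.yml is authoritative:

```yaml
# Sketch of a docker-compose.yml for the five services (illustrative only).
version: "3"
services:
  tweet_mongodb:
    image: mongo
  tweet_collector:
    build: ./tweet_collector
    depends_on:
      - tweet_mongodb
  tweet_postgres:
    image: postgres
    environment:
      POSTGRES_PASSWORD: example   # assumed; set your own
  tweet_etl:
    build: ./tweet_etl
    depends_on:
      - tweet_mongodb
      - tweet_postgres
  tweet_slack:
    build: ./tweet_slack
    depends_on:
      - tweet_postgres
```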


This project was created during the Spiced Academy Data Science Bootcamp Nov/2021.
