lcbm / cs-data-viz

📈 activity for my data visualization class

📈 Data Viz Activity

This repository contains the specification for the Data Visualization Activity 7 Stack. Currently, the services are organized as a Docker swarm stack in compose-like files (located in the cs-data-viz/docker directory).

Activity

  1. Export the Brazilian eCommerce dataset to one of the DBMS (Database Management Systems) below:
  2. Connect the chosen DBMS to Metabase;
  3. Create data exploration visualizations in Metabase.

Development

To install the development pre-requisites, please follow the instructions in the links below:

Installing development dependencies

First, change your current working directory to the project's root directory and bootstrap the project:

# change current working directory
$ cd <path/to/cs-data-viz>

# bootstraps development and project dependencies
$ make bootstrap

NOTE: By default, Poetry creates and manages a virtual environment for the project's dependencies, so the project runs isolated from your global Python installation. This avoids conflicts with other packages installed on your system.

After bootstrapping the project, run the ETL scripts to extract, load and transform data from the Brazilian eCommerce dataset:

# change current working directory
$ cd <path/to/cs-data-viz>

# runs scripts to extract, load and transform data
$ make run-etl

Now that the data is ready, you may proceed to the next step in order to deploy the stack.

Deploying the Stack

Requirements

Considering that the stack is organized as a Docker swarm stack in compose-like files (located in the cs-data-viz/docker directory), the following dependencies must be installed:

NOTE: If you're using a Linux system, please take a look at Docker's post-installation steps for Linux!

Setup

First, build the database Docker image. From the project's root directory:

# change current working directory
$ cd docker/database

# build the docker image
$ docker build . -f Dockerfile -t postgres:cs-data-viz

Now, pull the remaining docker images used by the stack:

# fetches services' docker images
$ make docker-pull

Finally, update the docker/env.d files for each service with the appropriate configurations, credentials, and any other necessary information.

Initialize Swarm mode

On your deployment machine, initialize Docker Swarm mode:

# initializes a new swarm, with this machine as its manager
$ docker swarm init

NOTE: For more information on what Swarm is and its key concepts, please refer to Docker's documentation.
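Running docker swarm init on a machine that is already part of a swarm fails, so it can help to check the node's state first. The following is a minimal sketch, not part of the project: the helper names needs_swarm_init and ensure_swarm are hypothetical, and the sketch assumes that only the "inactive" state should trigger an init.

```shell
# Decide whether `docker swarm init` is needed for a given swarm state.
# The state is the string printed by:
#   docker info --format '{{.Swarm.LocalNodeState}}'
# (for example "inactive" or "active").
needs_swarm_init() {
  [ "$1" = "inactive" ]
}

# Initialize the swarm only when this node is not part of one yet.
ensure_swarm() {
  state="$(docker info --format '{{.Swarm.LocalNodeState}}')"
  if needs_swarm_init "$state"; then
    docker swarm init
  else
    echo "swarm state is '$state', skipping init"
  fi
}
```

After sourcing these functions, call ensure_swarm in place of docker swarm init.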

Deploying services

After following the previous steps, ensure you are in the docker stack directory and then deploy the stack:

# change current working directory
$ cd docker

# deploys/updates the stack from the specified file
$ docker stack deploy -c compose.yml cs-data-viz

Verifying the Stack's Status

Check if all the services are running and have exactly one replica:

# list the services in the cs-data-viz stack
$ docker stack services cs-data-viz

You should see something like this:

ID                  NAME                    MODE                REPLICAS            IMAGE                          PORTS
acob7yl286jg        cs-data-viz_postgres    replicated          1/1                 postgres:cs-data-viz
vkzcj6n9t7tt        cs-data-viz_metabase    replicated          1/1                 metabase/metabase:v0.37.2      *:3000->3000/tcp
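The same check can be scripted instead of read off the table. This is a sketch under assumptions: the function name all_single_replica is hypothetical, and it expects one "NAME REPLICAS" pair per line, which is the shape produced by the --format option of docker stack services.

```shell
# Reads "NAME CURRENT/DESIRED" lines on stdin and succeeds only if at
# least one service is listed and every service reports 1/1 replicas.
all_single_replica() {
  rc=0
  seen=0
  while read -r name replicas; do
    [ -n "$name" ] || continue          # skip blank lines
    seen=$((seen + 1))
    if [ "$replicas" != "1/1" ]; then
      echo "not ready: $name ($replicas)" >&2
      rc=1
    fi
  done
  [ "$seen" -gt 0 ] || rc=1             # an empty listing is a failure too
  return "$rc"
}
```

Typical usage would be:

$ docker stack services --format '{{.Name}} {{.Replicas}}' cs-data-viz | all_single_replica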

At this point, the following resources will be available to you:

  • Metabase UI is available at http://localhost:3000

NOTE: In case localhost doesn't work, you may try http://0.0.0.0:<port> instead.
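Metabase can take a while to start, so the UI may not respond right after the service reports 1/1 replicas. The sketch below polls until it answers. The helper names retry and wait_for_metabase are hypothetical, the attempt count and delay are arbitrary, and the sketch assumes that the Metabase version in use exposes the /api/health endpoint and that curl is installed.

```shell
# Run a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command...>
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  while [ "$attempts" -gt 0 ]; do
    "$@" && return 0
    attempts=$((attempts - 1))
    # only sleep when another attempt will follow
    [ "$attempts" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# Poll Metabase's health endpoint (assumed: /api/health) for up to
# 30 attempts, 5 seconds apart. The base URL defaults to localhost:3000.
wait_for_metabase() {
  retry 30 5 curl --silent --fail --output /dev/null \
    "${1:-http://localhost:3000}/api/health"
}
```

Calling wait_for_metabase returns 0 once the endpoint responds successfully, and 1 if it never does within the allotted attempts.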

Logging

In order to check a service's logs, use the following command:

# fetch the logs of a service
$ docker service logs <service_name>

NOTE: You may also follow the log output in real time with the --follow option (e.g. docker service logs --follow cs-data-viz_postgres). For more information on service logs, refer to Docker's documentation.
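Services deployed with docker stack deploy are named <stack>_<service>, as the listing from docker stack services shows. A small wrapper can build that name and limit the output to recent lines. This is only a sketch: stack_service_name and recent_logs are hypothetical helpers, and the default of 100 lines is arbitrary.

```shell
# Build the full swarm service name from a stack name and a short
# service name, following Docker's "<stack>_<service>" convention.
stack_service_name() {
  printf '%s_%s\n' "$1" "$2"
}

# Print the most recent log lines of a stack service, with timestamps.
# Usage: recent_logs <stack> <service> [lines]
recent_logs() {
  docker service logs --timestamps --tail "${3:-100}" \
    "$(stack_service_name "$1" "$2")"
}
```

For example, recent_logs cs-data-viz metabase 50 would show the last 50 lines of the cs-data-viz_metabase service's logs.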

Wrapping up

Once you're done, you may remove the stack and leave the swarm created by docker swarm init:

# removes the cs-data-viz stack from swarm
$ docker stack rm cs-data-viz

# leaves the swarm (--force is required because this node is a manager)
$ docker swarm leave --force

NOTE: All the data created by the stack services will be lost. For more information on swarm commands, refer to Docker's documentation.

Contributing

We are always looking for contributors of all skill levels! If you're looking to ease your way into the project, try out a good first issue.

If you are interested in helping contribute to the project, please take a look at our Contributing Guide. Also, feel free to drop in our community chat and say hi. 👋

Also, thank you to all the people who already contributed to the project!

License

Copyright © 2020-present, CS Data Viz Contributors. This project is ISC licensed.
