Assignment #1:

Find a relatively simple dataset that you can work with and perform some sort of aggregation on, such as movie scores, movie reviews, sporting event stats, web server stats, etc.
Pick a dataset with a lot of different distinct entities (different movies, candies, etc.).
Pick a dataset in which some field, such as a rating or an average, falls within a standard numerical range.
Pick a dataset that is neither too small nor too large (between 5k and 1M rows).
Try to save the dataset to disk so you’re not requesting the data from an API or web site each time you run your script.
Write a script to ingest that data from a file and save it to a database (SQLite, PostgreSQL, or MySQL/MariaDB).
Don’t worry about adding indexes at this point.
Write a script that outputs basic stats about that data from the database, to show that the data is visible and accessible.
Push your code to your personal GitLab repo. (Call it “onboarding” or something similar.)
Set up linting and testing and get your build to be successful/green. (See https://gitlab.s.fpint.net/collections/bmt/blob/master/.gitlab-ci.yml and https://gitlab.s.fpint.net/collections/bmt/blob/master/prova.unit.yml)
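
As a rough illustration of the ingest and stats steps above, here is a minimal sketch using only Python's standard library and SQLite. It assumes a CSV file with a header row and two columns named title and rating; those column names, the ratings table, and the function names ingest and basic_stats are made up for this example and are not part of the assignment. Swap in whatever your dataset and database actually use.

```python
import csv
import sqlite3


def ingest(csv_path, db_path):
    """Load a CSV with hypothetical `title` and `rating` columns into SQLite.

    The table is dropped and recreated on each run, so re-running the
    script does not duplicate rows. Returns the number of rows inserted.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [(r["title"], float(r["rating"])) for r in csv.DictReader(f)]

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DROP TABLE IF EXISTS ratings")
            conn.execute(
                "CREATE TABLE ratings (title TEXT NOT NULL, rating REAL NOT NULL)"
            )
            conn.executemany(
                "INSERT INTO ratings (title, rating) VALUES (?, ?)", rows
            )
    finally:
        conn.close()
    return len(rows)


def basic_stats(db_path, top_n=5):
    """Return simple aggregates computed by the database itself."""
    conn = sqlite3.connect(db_path)
    try:
        total, distinct, lowest, highest, mean = conn.execute(
            "SELECT COUNT(*), COUNT(DISTINCT title), "
            "MIN(rating), MAX(rating), AVG(rating) FROM ratings"
        ).fetchone()
        # Highest-rated entities by mean rating; ties broken by title.
        top = conn.execute(
            "SELECT title, AVG(rating) AS mean_rating, COUNT(*) AS n "
            "FROM ratings GROUP BY title "
            "ORDER BY mean_rating DESC, title LIMIT ?",
            (top_n,),
        ).fetchall()
    finally:
        conn.close()
    return {
        "rows": total,
        "distinct_titles": distinct,
        "min": lowest,
        "max": highest,
        "mean": mean,
        "top": top,
    }
```

Calling ingest("ratings.csv", "onboarding.db") followed by print(basic_stats("onboarding.db")) would load the file and print the aggregates (both file names are placeholders). Keeping the two steps in separate functions also gives the linting and testing step something small to test.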

matthewepler / python_postgres