matthewepler / python_postgres

practice

Assignment #1:

  • Find a relatively simple dataset that you can work with and perform some sort of aggregation on, such as movie scores, movie reviews, sporting event stats, web server stats, etc.
  • Pick a dataset with many distinct entities (different movies, candies, etc.).
  • Pick a dataset that has a numeric field on a standard scale, such as a rating or an average of some field within the data.
  • Pick a dataset that is neither too small nor too large (between 5k and 1m rows).
  • Try to save the dataset to disk so you’re not requesting the data from an API or website each time you run your script.
  • Write a script to ingest that data from a file and save to a database. (SQLite, PostgreSQL, MySQL/MariaDB)
  • Don’t worry about adding indexes at this point.
  • Write a script to output basic stats about that data from the database to prove the visibility and accessibility of the data.
  • Push your code to your personal GitLab repo. (call it “onboarding” or something)
  • Set up linting and testing and get your build to be successful/green. (see https://gitlab.s.fpint.net/collections/bmt/blob/master/.gitlab-ci.yml and https://gitlab.s.fpint.net/collections/bmt/blob/master/prova.unit.yml )
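
The ingest and stats steps above could be sketched roughly as follows. This is only a minimal illustration, not the assignment's required solution. It assumes SQLite (one of the listed options) through Python's built-in sqlite3 module, and a hypothetical CSV layout with `title` and `score` columns. The table name `ratings` and the function names `ingest`, `basic_stats`, and `top_titles` are made up for this example.

```python
import csv
import io
import sqlite3


def ingest(conn, csv_file):
    """Load rows from an open CSV file into a hypothetical `ratings` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ratings ("
        "title TEXT NOT NULL, score REAL NOT NULL)"
    )
    reader = csv.DictReader(csv_file)
    # Assumed column names: `title` and `score`. Adjust to the real dataset.
    rows = [(row["title"], float(row["score"])) for row in reader]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO ratings (title, score) VALUES (?, ?)", rows
        )
    return len(rows)


def basic_stats(conn):
    """Return row count, distinct entity count, and score range/average."""
    total, distinct, low, high, avg = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT title), "
        "MIN(score), MAX(score), AVG(score) FROM ratings"
    ).fetchone()
    return {
        "rows": total,
        "distinct_titles": distinct,
        "min_score": low,
        "max_score": high,
        "avg_score": avg,
    }


def top_titles(conn, limit=3):
    """Return (title, average score, row count) for the best-rated titles."""
    return conn.execute(
        "SELECT title, AVG(score) AS avg_score, COUNT(*) AS n "
        "FROM ratings GROUP BY title "
        "ORDER BY avg_score DESC, title LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    # In-memory stand-in for a dataset file saved to disk.
    sample = io.StringIO("title,score\nAlien,8.5\nAlien,9.0\nHeat,8.0\n")
    connection = sqlite3.connect(":memory:")
    print("rows ingested:", ingest(connection, sample))
    print("stats:", basic_stats(connection))
    print("top titles:", top_titles(connection))
```

For a real run, the in-memory sample would be replaced by a file opened with `open(path, newline="")`, and `":memory:"` by a database file path. Moving to PostgreSQL or MySQL/MariaDB would mean using a different driver and its placeholder style, while the SQL itself should need few changes.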
