jattenberg / monte-karl-malone

MDS in a box

This project serves as an end-to-end example of running the "Modern Data Stack" in a local environment. Development is primarily done on Windows via WSL, so Mac is untested (but should work).

Current progress

Right now, you can get the NBA schedule and Elo ratings from this project and generate the following query. More is coming; see the to-dos at the bottom of this README. The dbt docs are self-hosted on GitHub Pages.

[image] [image]

Getting started - OS-X

  1. Build your project and run your pipeline:
make build
make run
  2. Connect DuckDB to Superset. First, create an admin user:
meltano invoke superset:create-admin
  • Then boot up Superset:
meltano run superset:ui
  • Lastly, connect it to DuckDB. Navigate to localhost:8088, log in, and add DuckDB as a database.

    • SQLAlchemy URL: duckdb:////tmp/mdsbox.db

    • Advanced Settings > Other > Engine Parameters: {"connect_args":{"read_only":true}}

  3. Explore your data inside Superset. Go to SQL Lab > SQL Editor and write a custom query. A good example is SELECT * FROM reg_season_end.
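
The two connection values in the steps above are easy to mistype, so here is a minimal Python sketch of how they are put together. The helper names `duckdb_sqlalchemy_url` and `read_only_engine_params` are hypothetical and not part of this project. The sketch assumes the database sits at an absolute POSIX path such as /tmp/mdsbox.db, which is why the URL ends up with four slashes: three from the URL scheme and separator, one from the path itself.

```python
import json
from pathlib import PurePosixPath


def duckdb_sqlalchemy_url(db_path: str) -> str:
    """Build a SQLAlchemy URL for a DuckDB file at an absolute POSIX path."""
    path = PurePosixPath(db_path)
    if not path.is_absolute():
        raise ValueError(f"expected an absolute path, got {db_path!r}")
    # "duckdb://" + empty host + "/" separator, followed by the absolute
    # path, whose own leading "/" supplies the fourth slash.
    return f"duckdb:///{path}"


def read_only_engine_params() -> str:
    """Return the Engine Parameters JSON that opens the database read-only."""
    params = {"connect_args": {"read_only": True}}
    return json.dumps(params, separators=(",", ":"))


if __name__ == "__main__":
    print(duckdb_sqlalchemy_url("/tmp/mdsbox.db"))
    print(read_only_engine_params())
```

Opening the file read-only is likely what lets Superset query the database without holding a write lock on it, since DuckDB allows only one process at a time to open a file for writing.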

Running your pipeline on demand

After you run make run, you can run your pipeline again at any time with the following meltano command:

meltano run tap-spreadsheets-anywhere target-duckdb dbt-duckdb:build
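
If you would rather trigger the same run from a script (for example, from a scheduler), a thin wrapper along these lines may be enough. This is only a sketch: `pipeline_command` and `run_pipeline` are hypothetical names, and it assumes `meltano` is on your PATH and that you run it from the project directory.

```python
import shlex
import subprocess

# The three blocks from the command above, in the order they are passed to meltano.
PIPELINE_BLOCKS = ["tap-spreadsheets-anywhere", "target-duckdb", "dbt-duckdb:build"]


def pipeline_command(blocks=PIPELINE_BLOCKS):
    """Return the argv list for `meltano run` with the given blocks."""
    return ["meltano", "run", *blocks]


def run_pipeline(dry_run: bool = False) -> int:
    """Run the pipeline; with dry_run=True, only print the command."""
    cmd = pipeline_command()
    print(shlex.join(cmd))
    if dry_run:
        return 0
    # check=True raises CalledProcessError if meltano exits with a nonzero code.
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems, and the dry-run flag lets you confirm the exact command before anything is extracted or loaded.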

Todos

  • write initial steps
  • create a makefile so you 'make pipeline' and it just all happens
  • get data and load to github storage
  • add extraction steps to spreadsheets anywhere
  • build basic data frame w/dbt
  • build the monte carlo sim
  • add meta-stats
    • playoff seeding
    • play-in game stuff
    • playoff schedule
    • series winners
    • playoff wins
  • some basic charts in superset (replicate 538?)
  • add github action to build it
  • add dbt docs as github pages

Optional stuff

  • add dbt tests
  • add model descriptions
  • change elo calculation to a udf
  • make playoff elimination stuff a macro (param: schedule type)
