iampawanpoojary / FooBank

FooBank

  • main.py contains the Python script.
  • Foobank.ipynb contains the same code, with the data visualized.
  • An HTML version of the Jupyter notebook is included.
  • foobank_beam_pipeline.py contains an unfinished Apache Beam pipeline, which I thought would be fun to build but did not have enough time to finish. It is a Dataflow pipeline that performs a left join (CoGroupByKey) on loan and customer data (sample data in code) and stores the result in BigQuery.
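
As a rough illustration of the left join mentioned above, here is a minimal sketch in plain Python rather than the repository's actual pipeline code. `cogroup` is a hypothetical stand-in that mimics the output shape of `beam.CoGroupByKey()` when it is given a dict of tagged inputs, namely `(key, {'loans': [...], 'customers': [...]})`. The field names (`loan_id`, `amount`, `name`) and the sample rows are assumptions made for the example.

```python
from collections import defaultdict


def cogroup(loans, customers):
    """Plain-Python stand-in for beam.CoGroupByKey() on two keyed inputs."""
    grouped = defaultdict(lambda: {"loans": [], "customers": []})
    for key, row in loans:
        grouped[key]["loans"].append(row)
    for key, row in customers:
        grouped[key]["customers"].append(row)
    return list(grouped.items())


def left_join(element):
    """Expand one (key, {'loans': [...], 'customers': [...]}) group into rows."""
    key, groups = element
    # Left join: every loan is kept; an empty customer fills in when none matched.
    customers = list(groups["customers"]) or [{}]
    for loan in groups["loans"]:
        for customer in customers:
            yield {"customer_id": key, **loan, "name": customer.get("name")}


# Hypothetical sample data, keyed by customer id.
loans = [(1, {"loan_id": "L1", "amount": 1000}), (2, {"loan_id": "L2", "amount": 500})]
customers = [(1, {"name": "Alice"}), (3, {"name": "Bob"})]

rows = [row for element in cogroup(loans, customers) for row in left_join(element)]
for row in rows:
    print(row)
```

In an actual Beam pipeline, a function shaped like `left_join` could be applied with `beam.FlatMap` to the output of `beam.CoGroupByKey()`. Customers without loans (id 3 here) produce no rows, which is what makes the join a left join on the loan side.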
Link to the Jupyter notebook

https://nbviewer.jupyter.org/github/iampawanpoojary/FooBank/blob/main/foobank.ipynb

To run the Dataflow pipeline

python -m foobank_beam_pipeline \
  --runner DataflowRunner \
  --project=pawpooja \
  --region=europe-west1 \
  --staging_location=gs://pawpooja/test \
  --temp_location gs://pawpooja/tmp/

[Image: Dataflow execution graph]

[Image: BigQuery output]
