Project: Data Modeling with Postgres

Completed by Ken Jung, as part of the Udacity Data Engineering Nanodegree Program

Introduction

A fictional start-up called Sparkify wants to analyze the data it has been collecting on songs and user activity in its new music streaming app. The analytics team is particularly interested in understanding which songs users are listening to. Currently, there is no easy way to query the data, which resides in two directories: one of JSON logs of user activity in the app, and one of JSON metadata on the songs in the app. The data engineer's task is to create a Postgres database with tables designed to optimize queries for song play analysis.

Project Workspace and Files

Data: original log and song datasets in JSON format
create_tables.py: Schema creation
etl.py: ETL process
sql_queries.py: SQL queries
etl.ipynb: ETL helper notebook
test.ipynb: Postgres SQL notebook
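
To illustrate how the query definitions and the ETL step listed above might fit together, here is a minimal sketch. It is not the project's actual code: the table name, the column names, the sample records, and the assumption that each log file holds one JSON object per line are all hypothetical, since the real schema lives in sql_queries.py.

```python
import json

# Hypothetical table and column names, for illustration only;
# the project's real statements are defined in sql_queries.py.
SONGPLAY_INSERT = (
    "INSERT INTO songplays (user_id, song_title, played_at) "
    "VALUES (%s, %s, %s)"
)


def parse_log_lines(lines):
    """Turn JSON log lines into row tuples for a parameterized INSERT."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        rows.append(
            (record["user_id"], record["song_title"], record["played_at"])
        )
    return rows


# Made-up sample records (assumed format: one JSON object per line).
sample = [
    '{"user_id": 1, "song_title": "Song A", "played_at": 1000}',
    "",
    '{"user_id": 2, "song_title": "Song B", "played_at": 2000}',
]
rows = parse_log_lines(sample)
```

With a Postgres driver such as psycopg2, rows like these could then be passed to a cursor, for example `cur.executemany(SONGPLAY_INSERT, rows)`. Keeping the SQL text in one module and the parsing in another mirrors the split between sql_queries.py and etl.py.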
