Athoillah21 / Project-3-Data-Warehouse-Automating

Project 3 from DigitalSkola: learn how to automate data ingestion into PostgreSQL from Python code.



Data Warehouse Automation

The purpose of this project is to answer question 2a. The question is answered using the Python programming language.

Step 1: Convert the .json file into a standard PostgreSQL DDL command

Define the database columns from the given JSON file. The workflow is: load the given JSON data, convert it into a list, and then into a tuple so that it resembles a typical PostgreSQL DDL command. A minimal sketch of this idea follows below.
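The sketch below assumes the JSON file maps column names to PostgreSQL types; the file name `schema.json` and the table name `my_table` are illustrative, not taken from the project.

```python
import json

# Hypothetical schema file: {"id": "INT", "name": "TEXT", "created_at": "TIMESTAMP"}
with open("schema.json") as f:
    schema = json.load(f)

# Convert the dict into a list of "name TYPE" strings, then join them so the
# result resembles a standard CREATE TABLE column list.
columns = [f"{name} {dtype}" for name, dtype in schema.items()]
ddl = f"CREATE TABLE IF NOT EXISTS my_table ({', '.join(columns)});"
print(ddl)
```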

Step 2: Establish a connection with PostgreSQL

Use the psycopg2 module to connect to the local PostgreSQL instance. Before that, create a new database on localhost with the required name.
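A sketch of what that connection could look like; the host, database name, user, and password are placeholders and must match the database created beforehand.

```python
import psycopg2

# Placeholder connection details for the local PostgreSQL instance.
conn = psycopg2.connect(
    host="localhost",
    dbname="project3_db",   # hypothetical database name
    user="postgres",
    password="postgres",
)

cur = conn.cursor()
# e.g. the DDL command built in Step 1
ddl = "CREATE TABLE IF NOT EXISTS my_table (id INT, name TEXT, created_at TIMESTAMP);"
cur.execute(ddl)
conn.commit()
cur.close()
conn.close()
```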

Step 3: Load the dataset into a pandas DataFrame before inserting it into the database

Use the zipfile module to extract the dataset, then convert it into a pandas DataFrame. Apply the transformations required by the problem, such as filtering the data by date.
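For illustration, a sketch under assumed file and column names (`dataset.zip`, `dataset.csv`, and a `created_at` column are not from the project):

```python
import zipfile
import pandas as pd

# Extract the zipped dataset into a local folder.
with zipfile.ZipFile("dataset.zip") as zf:
    zf.extractall("data")

# Load the extracted file into a DataFrame, parsing the date column.
df = pd.read_csv("data/dataset.csv", parse_dates=["created_at"])

# Example of the kind of filtering mentioned above: keep rows in a date range.
df = df[(df["created_at"] >= "2021-01-01") & (df["created_at"] < "2021-02-01")]
```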

Step 4: Create an engine and insert the DataFrame into the PostgreSQL database

Use the sqlalchemy module to create an engine that connects the Python code to the defined PostgreSQL connection, together with the code to be executed. In this case, the code inserts the data into the database. An advantage of using Python is that the code can be reused for future loads.
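A sketch of the engine creation and insert; the connection URL, credentials, and table name are placeholders, and the stand-in DataFrame takes the place of the filtered DataFrame from Step 3.

```python
import pandas as pd
from sqlalchemy import create_engine

# Stand-in for the filtered DataFrame produced in Step 3.
df = pd.DataFrame({"id": [1, 2], "created_at": ["2021-01-05", "2021-01-09"]})

# Placeholder connection URL for the local PostgreSQL database.
engine = create_engine(
    "postgresql+psycopg2://postgres:postgres@localhost:5432/project3_db"
)

# Insert the DataFrame rows into the target table; because the logic lives in a
# script, the same code can be rerun whenever new data needs to be loaded.
df.to_sql("my_table", engine, if_exists="append", index=False)
```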


