Data Engineers are the data professionals who prepare "big data" infrastructure for analysis. They design, build, and integrate data from various sources, and typically run ETL (Extract, Transform and Load) processes over large datasets.
This repository presents an example of how to consume a web API and develop a web crawler in order to build a Data Lake (i.e., a data repository of blobs or raw files).
Amazon S3 was used as the Data Lake, and the code was developed in Python 3.
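A common convention when landing raw files in an S3-backed Data Lake is to partition object keys by source and date. The sketch below illustrates that idea; the `raw_key` helper and the `raw/<source>/<date>/` layout are illustrative assumptions, not the exact scheme used by `python/s3_handler.py`, and the upload step requires the `boto3` AWS SDK plus valid credentials.

```python
from datetime import datetime


def raw_key(source: str, dt: datetime, filename: str) -> str:
    """Build a date-partitioned object key for the raw zone of the lake.

    Hypothetical layout: raw/<source>/<YYYY>/<MM>/<DD>/<filename>.
    """
    return f"raw/{source}/{dt:%Y/%m/%d}/{filename}"


def upload_raw(local_path: str, bucket: str, key: str) -> None:
    """Upload a local raw file to S3 (needs boto3 and AWS credentials)."""
    import boto3  # third-party AWS SDK

    boto3.client("s3").upload_file(local_path, bucket, key)


if __name__ == "__main__":
    # Keys stay sortable and easy to filter by prefix in S3.
    print(raw_key("noticias", datetime(2019, 5, 20), "page-1.html"))
    # raw/noticias/2019/05/20/page-1.html
```

Date-partitioned prefixes keep raw files organized and make later ETL jobs able to select a day or month of data by listing a single prefix.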
- Example of consuming the API:

```shell
python python/deputados_api.py
```
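Consuming a JSON web API generally comes down to requesting an endpoint and picking the fields of interest out of the response. The sketch below shows the pattern with only the standard library; the endpoint URL and the `dados`/`nome` field names are assumptions based on the Brazilian Chamber of Deputies open-data API, not a copy of `deputados_api.py`.

```python
import json
from urllib.request import urlopen

# Assumed open-data endpoint (check the script for the real one used).
API_URL = "https://dadosabertos.camara.leg.br/api/v2/deputados"


def parse_names(payload: str) -> list:
    """Extract deputy names from a JSON payload shaped like {"dados": [...]}."""
    return [d["nome"] for d in json.loads(payload)["dados"]]


def fetch_deputados() -> list:
    """Download and parse the current list of deputies (network required)."""
    with urlopen(API_URL, timeout=30) as resp:
        return parse_names(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a hand-made sample payload.
    sample = '{"dados": [{"nome": "Fulano"}, {"nome": "Beltrana"}]}'
    print(parse_names(sample))  # ['Fulano', 'Beltrana']
```

Keeping the parsing in a pure function (`parse_names`) separates network I/O from data handling, which makes the consumer easy to test without hitting the API.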
- Example of a web crawler:

```shell
python python/noticias_crawler.py
```
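At its core, a crawler downloads a page, extracts the links it wants to follow, and repeats. The sketch below shows only the link-extraction step using the standard library's `html.parser`; the `LinkExtractor` class and example URLs are illustrative and are not taken from `noticias_crawler.py`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links so the crawler can fetch them.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list:
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<a href="/noticias/1">n1</a> <a href="http://ex.com/2">n2</a>'
    print(extract_links(page, "http://example.com"))
    # ['http://example.com/noticias/1', 'http://ex.com/2']
```

A full crawler would feed the extracted links back into a fetch queue (deduplicating visited URLs) and save each downloaded page as a raw file in the Data Lake.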
- Amazon S3 handler: `python/s3_handler.py`
- Created by Leonardo Mauro ~ leomaurodesenv
- Presented by Itera - GitHub