Ronald Angel's repositories
owl-data-sanitizer
A PySpark library to validate data quality
dataset_deduplication_sparkml
Dataset deduplication using the Spark ML library and Scala
airflow
Apache Airflow
Apache-2.0
charts
Curated applications for Kubernetes
Apache-2.0
cube-summation-approach
Solution to the Cube Summation problem (https://www.hackerrank.com/challenges/cube-summation) using a Python dictionary approach.
Language: Python
datahub
The Metadata Platform for the Modern Data Stack
Apache-2.0
files-partitioner-HDFS-like
File partitioner for random data that simulates HDFS (version 1) data node behaviour but stores the data locally.
poetry
Python dependency management and packaging made easy.
MIT
predictor-flask-hd
Service to classify handwritten digits using TensorFlow + Keras + Flask
Language: Python
research-papers
Research papers - master's degree in computer science (Distributed Systems + AI)