Mario Renau's repositories
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
spark-playground
Code snippets used in demos recorded for the blog.
a-kafka-story
Kafka ecosystem ... but step by step!
acid-file-formats
Code for Apache Hudi, Apache Iceberg and Delta Lake analysis
akka-cassandra-demo
The repository for the demonstration of Akka & Cassandra integration
awesome-data-engineering
A curated list of data engineering tools for software developers
bigdata_stack
Dockerized Hadoop/Minio/Hive/Presto stack
code
Example application code for the Python architecture book
CursoIntroPython
Introductory Python programming course for Launch X by Innovacción Virtual
data-product-streaming
data-product-streaming
datamesh
Material for the DataMesh presentation at GoDataFest 2021
efficient_data_processing_spark_fork
Code for "Efficient Data Processing in Spark" Course
etl-with-airflow
ETL best practices with Airflow, with examples
incubator-pekko-samples_fork
Apache Pekko Sample Projects
kubeflow-spark
Orchestrate Spark jobs from Kubeflow Pipelines and poll for their status.
machine-learning-engineering-for-production-public
Public repo for DeepLearning.AI MLEP Specialization
ml-deployment
Repo for post
nessie-demos
Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.
OpenMetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
presto-workload-analyzer
The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them
spark-daria
Essential Spark extensions and helper methods ✨😲
talos
Lawful circuit breakers for Scala. Akka and Monix circuit breaker implementations with monitoring.
zeppelin_fork
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.