Mario Renau (mrenau)

Location: Madrid / Valencia / Remote

Mario Renau's starred repositories

project-based-learning

Curated list of project-based tutorials

OpenSearch

🔎 Open source distributed and RESTful search engine.

Language: Java | License: Apache-2.0 | Stargazers: 9165 | Issues: 142 | Issues: 5368

seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

Language: Java | License: Apache-2.0 | Stargazers: 7669 | Issues: 172 | Issues: 3064

postgresml

The GPU-powered AI application database. Get your app to market faster using the simplicity of SQL and the latest NLP, ML + LLM models.

Language: Rust | License: MIT | Stargazers: 5793 | Issues: 54 | Issues: 222

data-diff

Compare tables within or across databases

Language: Python | License: MIT | Stargazers: 2924 | Issues: 22 | Issues: 318

jgrapht

Master repository for the JGraphT project

Language: Java | License: EPL-2.0 | Stargazers: 2559 | Issues: 115 | Issues: 518

etl-with-airflow

ETL best practices with Airflow, with examples

xlskubectl

xlskubectl — a spreadsheet to control your Kubernetes cluster

frameless

Expressive types for Spark.

Language: Scala | License: Apache-2.0 | Stargazers: 872 | Issues: 29 | Issues: 157

celeborn

Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.

Language: Java | License: Apache-2.0 | Stargazers: 836 | Issues: 38 | Issues: 493

Daily-Dose-of-Data-Science

A collection of code snippets from the publication Daily Dose of Data Science on Substack: http://www.dailydoseofds.com/

Language: Jupyter Notebook | Stargazers: 759 | Issues: 53 | Issues: 1

testcontainers-scala

Docker containers for testing in Scala

Language: Scala | License: MIT | Stargazers: 624 | Issues: 21 | Issues: 106

automate-dv

A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool and registered trademark of dbt Labs)

ActivitySchema

Repository for the ActivitySchema spec and supporting materials

kafka-delta-ingest

A highly efficient daemon for streaming data from Kafka into Delta Lake

Language: Rust | License: Apache-2.0 | Stargazers: 343 | Issues: 26 | Issues: 63

mack

Delta Lake helper methods in PySpark

Language: Python | License: MIT | Stargazers: 290 | Issues: 15 | Issues: 71

gtfs-validator

Canonical GTFS Validator project for schedule (static) files.

Language: Java | License: Apache-2.0 | Stargazers: 270 | Issues: 23 | Issues: 811

CursoIntroPython

Introductory Python programming course for Launch X by Innovacción Virtual

Language: Jupyter Notebook | Stargazers: 225 | Issues: 7 | Issues: 0

awesome-dataops

A curated list of awesome DataOps tools

Language: Python | Stargazers: 130 | Issues: 6 | Issues: 0

diepvries

The Picnic Data Vault framework.

Language: Python | License: MIT | Stargazers: 126 | Issues: 31 | Issues: 2

spark-sql-flow-plugin

Visualize column-level data lineage in Spark SQL

Language: Scala | License: Apache-2.0 | Stargazers: 83 | Issues: 6 | Issues: 4

analytical_dp_with_sql

Code for my "Efficient Data Processing in SQL" book.

Scala-Category-Theory

Bartosz Milewski's great book on category theory, implemented in Scala with property tests

Language: Scala | License: GPL-3.0 | Stargazers: 30 | Issues: 5 | Issues: 0

scalacrashcourse

Crash course in Scala

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 24 | Issues: 11 | Issues: 0

License: CC0-1.0 | Stargazers: 22 | Issues: 0 | Issues: 0

hitchhikers_guide_to_deltalake_streaming

Don't Panic. This guide will help you when it feels like the end of the world.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 19 | Issues: 1 | Issues: 20

trino-plugins

Simplified custom plugins for Trino

Language: Scala | License: Apache-2.0 | Stargazers: 16 | Issues: 6 | Issues: 0

dbt-data-ai-summit

Code that was used as an example during the Data+AI Summit 2020

License: Apache-2.0 | Stargazers: 15 | Issues: 0 | Issues: 0

df-gtfs

Script to import the "df_gtfs" dataset into PostgreSQL