Oswaldo Fuenmayor (fuenos01)

Company: Nielsen

Location: Oldsmar, FL

Home Page: http://www.nielsen.com/

Oswaldo Fuenmayor's starred repositories

Language: Scala | License: Apache-2.0 | Stargazers: 584 | Issues: 0

MapReduce-Performance_Testing

MapReduce performance testing using teragen and terasort

Language: Shell | Stargazers: 18 | Issues: 0

terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦

Language: HCL | License: Apache-2.0 | Stargazers: 4413 | Issues: 0

terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.

Language: Go | License: MIT | Stargazers: 7987 | Issues: 0

terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.

Language: Go | License: NOASSERTION | Stargazers: 42466 | Issues: 0

pyenv-virtualenv

a pyenv plugin to manage virtualenv (a.k.a. python-virtualenv)

Language: Shell | License: MIT | Stargazers: 6304 | Issues: 0

pyenv

Simple Python version management

Language: Roff | License: MIT | Stargazers: 38881 | Issues: 0

docker-airflow

Docker Apache Airflow

Language: Shell | License: Apache-2.0 | Stargazers: 3768 | Issues: 0

Miscellaneous

Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark, tools for performance testing CPUs, Jupyter notebooks examples for Spark, examples for Oracle and other DB systems.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 422 | Issues: 0

brew

🍺 The missing package manager for macOS (or Linux)

Language: Ruby | License: BSD-2-Clause | Stargazers: 41015 | Issues: 0

gradle

Adaptable, fast automation for all

Language: Groovy | License: Apache-2.0 | Stargazers: 16738 | Issues: 0

Language: Python | License: GPL-3.0 | Stargazers: 39 | Issues: 0

flintrock

A command-line tool for launching Apache Spark clusters.

Language: Python | License: Apache-2.0 | Stargazers: 637 | Issues: 0

spark-ec2

Scripts used to set up a Spark cluster on EC2

Language: Python | License: Apache-2.0 | Stargazers: 392 | Issues: 0

hudi

Upserts, Deletes And Incremental Processing on Big Data.

Language: Java | License: Apache-2.0 | Stargazers: 5348 | Issues: 0

azkaban

Azkaban workflow manager.

Language: Java | License: Apache-2.0 | Stargazers: 4458 | Issues: 0

spark-tpcds-datagen

All the things about TPC-DS in Apache Spark

Language: Scala | License: Apache-2.0 | Stargazers: 104 | Issues: 0

spark-tpc-ds-performance-test

Use the TPC-DS benchmark to test Spark SQL performance

Language: TSQL | License: Apache-2.0 | Stargazers: 175 | Issues: 0

spark-daria

Essential Spark extensions and helper methods ✨😲

Language: Scala | License: MIT | Stargazers: 751 | Issues: 0

spark-style-guide

Spark style guide

Language: Jupyter Notebook | Stargazers: 256 | Issues: 0

awesome-spark

A curated list of awesome Apache Spark packages and resources.

Language: Shell | License: CC0-1.0 | Stargazers: 1701 | Issues: 0

airflow-maintenance-dags

A series of DAGs/Workflows to help maintain the operation of Airflow

Language: Python | License: Apache-2.0 | Stargazers: 1667 | Issues: 0

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Language: Python | License: Apache-2.0 | Stargazers: 36561 | Issues: 0

powermock

PowerMock is a Java framework that allows you to unit test code normally regarded as untestable.

Language: Java | License: Apache-2.0 | Stargazers: 4156 | Issues: 0

jvm-profiler

JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter

Language: Java | License: NOASSERTION | Stargazers: 1781 | Issues: 0

s4cmd

Super S3 command line tool

Language: Python | License: Apache-2.0 | Stargazers: 1367 | Issues: 0

s3cmd

Official s3cmd repo -- Command line tool for managing S3 compatible storage services (including Amazon S3 and CloudFront).

Language: Python | License: GPL-2.0 | Stargazers: 4560 | Issues: 0

delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Language: Scala | License: Apache-2.0 | Stargazers: 7492 | Issues: 0

Spark-The-Definitive-Guide

Spark: The Definitive Guide's Code Repository

Language: Scala | License: NOASSERTION | Stargazers: 2839 | Issues: 0

scopt

command line options parsing for Scala

Language: Scala | License: NOASSERTION | Stargazers: 1433 | Issues: 0