There are 0 repositories under the google-dataproc topic.
Google BigQuery data source for Apache Spark
StackExchange data is cleaned with Pig and queried with HiveQL; TF-IDF is then computed to obtain the top 10 words used by the top 10 StackExchange users.
This project orchestrates a data processing workflow using Apache Airflow, Spark, Google Cloud Storage (GCS), and Snowflake. The workflow is designed to handle daily data updates, filter completed orders, and update a Snowflake target table with the latest information. The project leverages Apache Airflow for workflow scheduling and management.
Welcome to the Learning and Experiments Hub—a dynamic repository capturing my journey of exploration and experimentation in the vast world of technology. This space serves as a digital canvas where I document my learning process, experiments, and discoveries.
Welcome to the MiniProjects Playground—an interactive space where learning meets doing! This repository is a collection of hands-on mini-projects that I've crafted after delving into various tech stacks and frameworks. From theory to application, each project is a testament to the practical side of coding.
Challenge from the course "Creating a Fully Managed Hadoop Ecosystem with Google Cloud Dataproc"
Experiments with MapReduce (mrjob) and Google Dataproc
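The TF-IDF step mentioned in the StackExchange entry above can be illustrated with a minimal plain-Python sketch. This is not taken from that repository, which uses Pig and Hive; the function name `top_words_by_tfidf`, the input shape (a dict mapping each user to a list of tokens), and the sample data are all assumptions made for illustration.

```python
import math
from collections import Counter


def top_words_by_tfidf(docs, k=10):
    """Return the k highest-scoring words per user.

    docs is a hypothetical input shape: {user: [token, token, ...]}.
    Each user's combined text is treated as one document.
    """
    n_docs = len(docs)

    # Document frequency: in how many users' texts does each word appear?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    result = {}
    for user, tokens in docs.items():
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty input
        # Classic TF-IDF: relative term frequency times log inverse document frequency.
        scores = {w: (c / total) * math.log(n_docs / df[w]) for w, c in tf.items()}
        # Sort by descending score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[user] = ranked[:k]
    return result


if __name__ == "__main__":
    # Made-up sample data, not drawn from StackExchange.
    sample = {
        "alice": ["spark", "spark", "hive", "the"],
        "bob": ["pig", "the", "the"],
    }
    print(top_words_by_tfidf(sample, k=2))
```

Words that every user shares (such as "the" in the sample) get an IDF of zero and fall to the bottom of each ranking, which is why TF-IDF suits a "characteristic words per user" query better than raw counts.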