janaom's repositories
KodeKloud-Engineer-2.0
Solutions for the KodeKloud Engineer 2.0 tasks.
gcp-data-engineering-etl-with-composer-dataflow
This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering solution for processing, storing, and reporting daily transaction data in the online food delivery industry.
gcp-de-project-streaming-pubsub-beam-dataflow
This project demonstrates an end-to-end solution for processing and analyzing real-time conversation data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.
GCP-DE-project-uber-etl-pipeline
Technologies used: GCS, Compute Engine, Mage, BigQuery, Looker, Python
Meta-Database-Engineer-Professional-Certificate
Labs from the Meta Database Engineer Professional Certificate on Coursera
GCP-BigQuery-Project-Exploring-Londons-Travel-Network
A BigQuery project that uses SQL to analyze a database of Transport for London journeys spanning 12 years.
gcp-professional-data-engineer-exam-prep-guide
This repository contains my personal notes for the Google Cloud Professional Data Engineer certification exam, compiled from official Google Cloud documentation.
terraform-zero-to-hero
Master Terraform in 7 days using this Zero to Hero course.
Apache-Beam-practice
It's time to learn Beam! This repository contains a collection of tasks and exercises focused on Apache Beam.
gcp-de-project-connect-four-with-python-dataflow
Connect Four Data Engineering Project: leveraging GCS for scalable and durable storage, Dataflow for data extraction and transformation, BigQuery as the data repository, Slack Integration for real-time sharing, Looker for insightful reports and visualizations, and Email Scheduler for automated report delivery.
gcp-de-project-weather-forecast-sms-with-airflow
This project was born out of the need to know the weather forecast for Paris/Vilnius while attending the KubeCon Europe conference. It fetches the next-day forecast for Paris and Vilnius using a weather API, securely stores the data in a GCS bucket, and sends personalized SMS updates via Twilio. Powered by GCP and automated with Composer/Airflow.
airflow-vars-conn-secret-manager
This repo highlights the importance of securing Airflow variables and connections using GCP Secret Manager.
Automating-Real-World-Tasks-with-Python
Solutions for "Automating Real-World Tasks with Python", the final course of the Google IT Automation with Python Professional Certificate
certified-kubernetes-administrator-course
Certified Kubernetes Administrator - CKA Course
databricks-learning
Get started with Databricks! This repository provides a beginner-friendly introduction to the platform, covering fundamental concepts and practical exercises from DataCamp to help you build your skills in data processing and analysis.
datacamp-professional-data-engineer-in-python
Dive deep into the advanced skills and state-of-the-art tools revolutionizing data engineering roles today with the DataCamp Professional Data Engineer track.
devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
etl-pipeline-terraform-airflow-gcp
This repo contains the solution for an ETL pipeline on GCP, using Terraform for infrastructure and Airflow for orchestration.
example-voting-app
Example distributed app composed of multiple containers for Docker, Compose, Swarm, and Kubernetes
gcp-associate-data-practitioner-exam-prep-guide
This repository contains my personal notes from the "Introduction to Data Engineering on Google Cloud" course, part of the Associate Data Practitioner Learning Path.
gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
introduction-to-pyspark
This repository serves as a comprehensive guide to PySpark, featuring theory and exercises sourced from DataCamp. It is designed for beginners looking to understand the fundamentals of PySpark and its applications in big data processing.
it-cert-automation-practice
Google IT Automation with Python Professional Certificate - Practice files
Kubernetes-and-Cloud-Native-Associate-KCNA
Useful notes for the KCNA - Kubernetes and Cloud Native Associate
Python-Pandas-Data-Science-Tutorial
Python Pandas Data Science Tutorial (Read CSV/Excel, add/delete columns, Filter, Groupby, Slice)