Joshua Omolewa's repositories

Stock_streaming_pipeline_project

Built a real-time streaming pipeline that extracts stock data using Apache NiFi, Debezium, Kafka, and Spark Streaming. The transformed data is loaded into an AWS Glue database, and real-time dashboards are built in Power BI and Tableau on top of Athena. The pipeline is orchestrated with Airflow.

Language: Python · Stargazers: 17 · Issues: 4 · Issues: 0

Retailstore_ETL_pipeline_project

Built a data pipeline for a retail store using AWS services. It collects data from the store's transactional (OLTP) database in Snowflake, transforms the raw data with Apache Spark to meet business requirements (the ETL process), and enables data analysts to create data visualizations using Superset. Airflow orchestrates the pipeline.

Language: Python · Stargazers: 5 · Issues: 3 · Issues: 0

edmonton_weather_aws_serverless_project

This is a serverless AWS data engineering project that tracks Edmonton weather in near real time using services like Kinesis Data Firehose, S3, AWS Lambda, AWS Glue, Athena, IAM,

Language: Python · Stargazers: 2 · Issues: 0 · Issues: 0

Job_API_ETL_datapipeline_project

An ETL pipeline built on AWS services that extracts data from a job API, transforms it to meet business requirements, and loads it into an S3 bucket.

Language: Python · Stargazers: 1 · Issues: 0 · Issues: 0

30-Days-Of-Python

The 30 Days of Python programming challenge is a step-by-step guide to learning the Python programming language in 30 days. The challenge may take more than 100 days, so follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw

Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0

awesome-interview-questions

:octocat: A curated awesome list of lists of interview questions. Feel free to contribute! :mortar_board:

Stargazers: 0 · Issues: 0 · Issues: 0

ci-cd-project-1

Practicing CI/CD using GitHub Actions

Language: Dockerfile · Stargazers: 0 · Issues: 0 · Issues: 0

container-images

Docker images for Debezium. Please log issues in our JIRA at https://issues.redhat.com/projects/DBZ/summary

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

Covid-19-analysis

COVID-19 Canada data analysis

Language: HTML · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

data-engineer-handbook

This is a repo with links to everything you'd ever want to learn about data engineering

Stargazers: 0 · Issues: 0 · Issues: 0

Data-Engineering-learning

My data engineering practice

Language: Shell · Stargazers: 0 · Issues: 0 · Issues: 0

data-engineering-practice

Data Engineering Practice Problems

Stargazers: 0 · Issues: 0 · Issues: 0

debezium-examples

Examples for running Debezium (Configuration, Docker Compose files etc.)

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

devops-exercises

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

devops-resources

DevOps resources - Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP

Stargazers: 0 · Issues: 0 · Issues: 0

docker_ETL_pipeline_project

ETL project that uses a Docker container running a Python script to extract CSV data, transform it by combining the files into a single file, and load the result into an output folder. The output CSV file remains available even if the container is shut down.

Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0

flink

Apache Flink

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

Git_practice

I use this repo to practice my Git skills

Stargazers: 0 · Issues: 0 · Issues: 0

markdown-here

Google Chrome, Firefox, and Thunderbird extension that lets you write email in Markdown and render it before sending.

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

Miscellaneous

Includes notes on Apache Spark, Spark for Physics, Jupyter notebook examples for Spark, Oracle and other DB systems.

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

spark-syntax

This is a repo documenting the best practices in PySpark.

Stargazers: 0 · Issues: 0 · Issues: 0

sqlfluff

A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

system-design-notebook

Learn System Design step by step

License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

tech-interview-handbook

💯 Curated coding interview preparation materials for busy software engineers

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

Toronto_Climate_API_ETL_project

Built an ETL pipeline that extracts climate data from an API, transforms it by combining all extracted data into a single file, and loads that file into an output folder.

Language: Shell · Stargazers: 0 · Issues: 0 · Issues: 0

trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0