Wazir Rohiman's repositories
hadoop_mapreduce_setup
Launch a single-node Hadoop cluster using Docker and run MapReduce jobs
Apache_Spark_Basics
A series exploring the basics of Apache Spark through practical exercises with Spark, PySpark, and Spark SQL
Apache_Spark_ETL_AND_SparkML
Final project for the IBM Data Engineer course on Apache Spark and Machine Learning, covering ETL processes with Elyra and Apache Spark and running SparkML jobs
claimed-component-library
The goal of CLAIMED is to enable low-code/no-code rapid prototyping style programming to seamlessly CI/CD into production.
ETL-using-Linux-Shell
An ETL process using Linux shell commands and PostgreSQL
etl_pipeline_using_airflow
Building an ETL Pipeline using Airflow
getting-started-with-mongodb
A guide to getting started with MongoDB: connecting to the MongoDB server from the CLI and using Python to access the server and run operations
highway_streaming_data_pipeline_using_kafka
A project that aims to decongest the national highways by analyzing road traffic data from different toll plazas. As a vehicle passes a toll plaza, the vehicle’s data (vehicle_id, vehicle_type, toll_plaza_id, and timestamp) is streamed to Kafka.
ibm_data-engineering_capstone_project
The final project for the IBM Data Engineering Professional Certificate
kafka_workflow
A workflow to stream data with Kafka
MySQL_backup_restore_command_codes
Steps for creating and restoring a logical backup of a MySQL database
populating_data_warehouse
Example of how to populate a data warehouse
Python-Project-DE---IBM_SN_Final_Assignment
The final assignment for the IBM Python Project for Data Engineering course
WAZIMAP-CSV-ETL-SCRIPT
A Python ETL script to retrieve CSVs from Wazimap. The final CSVs were loaded into Power BI for visualisation
IBM_Cloudant_Lab
Describes the steps and commands used to create a document database on IBM Cloudant and run JSON queries on the documents stored in the database
using_cassandra_query_language_shell
A guide to connecting to the Cassandra client and running keyspace, table, and CRUD operations using the CQL shell