3amory99

Omar Mahmoud's repositories

Gravity-Books-Sales-End-to-End-Project

This project encompasses the complete data lifecycle, from data extraction and transformation to in-depth analysis and compelling visualizations. The process is divided into three main phases.

Language:TSQL8 10

Amazon-Product-Scrapping-with-Selenium

Gather essential product data from Amazon with ease using this Python web scraper and Selenium. Extract product descriptions, prices, ratings, and more for insightful market research and analysis.

Language:Jupyter Notebook6 10

Building-Sales-Data-Mart-Using-ETL-SSIS

Using the AdventureWorks2022 dataset, I built a Sales Data Mart with SQL Server Integration Services (SSIS) and the modeling capabilities of SQL Server. The data mart offers several benefits, making it a valuable component for the main purpose of data management and analytics wi...

Language:TSQL6 10

Telecom-ETL-Using-SSIS

Data, resources, and tools to help your business thrive. GSMA Services are here to enable companies and organizations within the telecoms ecosystem to do business together more effectively, as well as to maintain the standards and reputation of the industry. Our services are underpinned by unique, accurate, and comprehensive data. Information, on...

Language:TSQL5 10

Singularity-Data-Analysis

In this repository, I will take you on a comprehensive tour of the Data Analysis track.

Language:Jupyter Notebook3 10

Houses-price-in-cairo-2023-data-analysis

Language:Jupyter Notebook200

Mastering-Python-Hackerrank-Workshop

This repo hosts the solutions for a comprehensive workshop on mastering Python programming through HackerRank challenges. Whether you're a beginner looking to learn Python from scratch or an experienced programmer aiming to enhance your Python skills, this workshop provides a structured learning path that covers a wide range of topics.

Language:Jupyter Notebook2 10

RedHat-CentOS-Linux-Commands-Cheat-sheet

Explore the power of Linux (CentOS) commands for efficient data engineering. This repository provides a comprehensive collection of essential Linux commands tailored for data engineers, from file manipulation to data transformation and pipeline management. It mainly serves as a reference for me as a Data Engineer.

2 10

---------

1 10

3amory99

1 10

Analyze-A-B-Test-Results

A project to analyze A/B test results using Python.

Language:Jupyter Notebook1 10

Building-Northwind-DWH-Using-Talend

Unlock the power of data with our comprehensive Talend project aimed at constructing a robust data warehouse (DWH) from the renowned Northwind dataset. Divided into two pivotal phases, this project seamlessly integrates data from the Northwind Access Database and the Transactional Database in SQL Server.

Language:Java1 10

Converting-nested-JSON-structures-to-Pandas-DataFrames

Converting Nested JSON Structures to Pandas DataFrames

Language:Jupyter Notebook1 10

Customer-Churn-Data-Analytics-Data-Pipeline

Customer Churn Data Analytics Data Pipeline using Apache Airflow, Glue, S3, Redshift, and Power BI

Language:Python1 10

Data-Cleaning-with-Python-and-Pandas

Language:Jupyter Notebook1 10

Data-Cleaning-with-Python-Using-Pandas

It's a process of preparing raw, unstructured, or messy data for analysis by using the Python programming language and Pandas library. This involves tasks such as handling missing values, removing duplicates, correcting data types, and transforming data into a more usable format. Data cleaning is a crucial step in the data preprocessing pipeline.

Language:Jupyter Notebook1 10

Data-Modeling-With-Apache-Cassandra-Using-Python

The Sparkify Music Streaming Analysis project focuses on creating a NoSQL database and an ETL pipeline for Sparkify, a music streaming startup. Sparkify aims to analyze the data collected from its new music streaming app, covering songs and user activities.

Language:Jupyter Notebook1 10

EDA-IBM-HR-Analytics-Employee-Attrition

This challenge is designed to explore and analyze factors contributing to employee attrition in a simulated HR setting using a dataset from IBM.

Language:Jupyter Notebook1 10

EDA-on-Netflix-dataset

When it comes to streaming media, Netflix is the king. The company that was founded 20 years ago as a mail-order DVD rental service has since transformed its business model completely to match the ever-changing tech landscape. As a result of that, the company now boasts more than 200 million subscribers worldwide and secures a spot as one of the bi

Language:Jupyter Notebook100

EDA-on-Netflix-Movies-and-TV-Shows-

Netflix is a leading player in streaming media with over 200 million global subscribers. Its transformation from DVD rental service to media publisher through its Netflix Originals program has made it a dominant player in the industry.

Language:Jupyter Notebook1 10

HealthCare-DWH-Integration-and-Analysis

Use Case: Data Warehouse Design and ETL Process for Healthcare Data, with insights generated using SSAS

1 10

Loading_and_Saving_JSON_Files

Working with JSON (JavaScript Object Notation) in Python is quite straightforward and commonly used, especially when dealing with data interchange between different systems or when storing configuration data. Python provides built-in libraries for working with JSON data. Here's how you can work with JSON in Python

Language:Jupyter Notebook1 10
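
The description above refers to Python's built-in JSON support. As a minimal sketch (not taken from the repository itself), the standard-library `json` module can round-trip a Python object through a file. The helper names `save_json` and `load_json` and the sample `config` dictionary are hypothetical and used only for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_json(data, path):
    """Serialize a Python object to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        # indent=2 keeps the file human-readable; ensure_ascii=False
        # writes non-ASCII characters as-is instead of \u escapes.
        json.dump(data, f, indent=2, ensure_ascii=False)


def load_json(path):
    """Read a JSON file back into Python objects (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Illustrative configuration data, not from the repository.
    config = {"name": "demo", "retries": 3, "tags": ["a", "b"]}

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        save_json(config, path)
        restored = load_json(path)
        print(restored == config)

    # json.dumps / json.loads do the same thing with strings instead of files.
    text = json.dumps(config)
    print(json.loads(text)["retries"])
```

`json.dump` and `json.load` work with file objects, while `json.dumps` and `json.loads` work with strings, which is the usual choice when exchanging data between systems.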

OLAP-Cubes-in-SSAS

Build an OLAP Cube in SSAS from SQL Server Analysis Services Data

Language:MDX1 10

Olympic-Data-Analytics-End-to-End-Project

1 10

Podcust-Summary-Data-Pipeline-Using-Airflow

Creating a data pipeline using Airflow. The pipeline will download podcast episodes and automatically transcribe them using speech recognition. We'll store our results in a SQLite database that we can easily query.

Language:Python1 10
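
The storage step described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the `episodes` table, its columns, and the helper functions are hypothetical rather than the repository's actual schema, and the Airflow tasks, download step, and speech recognition are not shown.

```python
import sqlite3


def create_table(conn):
    """Create a hypothetical table for podcast episodes."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS episodes (
            link TEXT PRIMARY KEY,
            title TEXT,
            published TEXT,
            transcript TEXT
        )
        """
    )
    conn.commit()


def store_episode(conn, link, title, published):
    """Insert an episode; rows whose link already exists are skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO episodes (link, title, published) VALUES (?, ?, ?)",
        (link, title, published),
    )
    conn.commit()


def set_transcript(conn, link, transcript):
    """Attach a transcript produced by a separate transcription step."""
    conn.execute(
        "UPDATE episodes SET transcript = ? WHERE link = ?",
        (transcript, link),
    )
    conn.commit()


def untranscribed_links(conn):
    """Return links of episodes that still have no transcript."""
    rows = conn.execute(
        "SELECT link FROM episodes WHERE transcript IS NULL ORDER BY link"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    create_table(conn)
    store_episode(conn, "https://example.com/ep1", "Episode 1", "2023-01-01")
    store_episode(conn, "https://example.com/ep2", "Episode 2", "2023-01-08")
    set_transcript(conn, "https://example.com/ep1", "placeholder transcript")
    print(untranscribed_links(conn))
```

Using the episode link as the primary key together with `INSERT OR IGNORE` makes the load step safe to re-run, which suits a scheduled pipeline that may fetch the same feed more than once.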

Sparkify-App-Data-Lake-Using-Apache-Spark-and-S3

In this project, my objective is to assist Sparkify, a music streaming startup, in migrating its data warehouse to a data lake. To achieve this, I have developed an ETL (Extract, Transform, Load) pipeline. This pipeline is designed to extract data from S3, process it using Apache Spark, and subsequently load the processed data into a new S3 storage lo...

Language:Jupyter Notebook1 10

Sparks-Foundation-Intern

Language:Jupyter Notebook1 10

Wuzzuf-Job-Postings-Web-Scraping

1 10

Movies-Production-Insights-Pipeline

Unveiling Cinematic Brilliance: Illuminating the Future of Movie Production through Data-Driven Insights

Language:Java01 1

Spark-Practical-Sessions

This repository contains practical exercises and examples for learning Apache Spark. Whether you are a beginner or have some experience with Spark, these practical sessions will help you sharpen your skills and understanding of big data processing with Spark.

Language:Jupyter Notebook010