Omar Mohamed's repositories
Data-Preprocessing-Techniques
When building a Machine Learning pipeline, data preprocessing is the first step. Real-world data is typically incomplete, inconsistent, and inaccurate (it contains errors or outliers), and it often lacks specific attribute values or trends. This is where data preprocessing comes in: it cleans, formats, and organizes the raw data, making it ready for Machine Learning models. This repository explores the various steps of data preprocessing in machine learning, starting with the concept of noisy data.
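A minimal sketch of typical preprocessing steps, using a hypothetical toy table (the column names and values are illustrative, not from the repository):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data with a missing value in each column and one outlier
df = pd.DataFrame({
    "age": [25, np.nan, 47, 31, 200],      # 200 is an implausible outlier
    "income": [40000, 52000, np.nan, 61000, 58000],
})

# 1. Handle the outlier: clip values outside a plausible range
df["age"] = df["age"].clip(upper=100)

# 2. Impute missing values with the column mean
imputer = SimpleImputer(strategy="mean")
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

# 3. Scale each feature to zero mean and unit variance
scaled = StandardScaler().fit_transform(imputed)
print(scaled.mean(axis=0))  # each column mean is ~0 after scaling
```

The same three steps (outlier handling, imputation, scaling) generalize to any tabular dataset; only the plausible-range and imputation strategies change per column.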
binet
This is the complementary code repository for the BINet papers.
Data-Understanding-and-Visualization
Data on the number of births every month for approximately 135 countries. The columns are largely self-explanatory; the important ones are described below. Country or area: the name of the country. Year: the year for which the record is stored. Month: the name of the month. Number of births: the total number of births that happened in the month. The code explores and interprets the data for Sweden, and you can generalize it by trying other areas.
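The filter-then-aggregate pattern described above can be sketched as follows; the tiny DataFrame below only mimics the described columns and contains made-up numbers:

```python
import pandas as pd

# Hypothetical subset mirroring the described columns (values are invented)
births = pd.DataFrame({
    "Country or Area": ["Sweden"] * 4 + ["Norway"] * 2,
    "Year": [2019, 2019, 2020, 2020, 2019, 2020],
    "Month": ["January", "February", "January", "February", "January", "January"],
    "Number of births": [9000, 8500, 9100, 8600, 4800, 4900],
})

# Filter to one country, then total the births per year
sweden = births[births["Country or Area"] == "Sweden"]
per_year = sweden.groupby("Year")["Number of births"].sum()
print(per_year)
```

Swapping `"Sweden"` for another value of `Country or Area` generalizes the analysis to other areas, as the description suggests.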
DataCamp-ParlAI-tutor
In this tutorial we will: chat with a neural network model; show how to use common commands in ParlAI, like inspecting data and model outputs; see where to find information about the many available options; and show how to fine-tune a pretrained model on a specific task. We also touch on other things you can do with ParlAI, a large-scale project that aims to advance NLP research.
Gradient-Descent-Types-implementation
Different types of gradient descent implemented from scratch. The project covers Stochastic, Mini-batch, and Batch gradient descent, along with the AdaGrad, RMSProp, Adam, NAG, and momentum-based variants.
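As one example of the variants listed, here is a from-scratch sketch of momentum-based gradient descent on a simple quadratic (the function, learning rate, and momentum coefficient are illustrative choices, not the repository's own):

```python
import numpy as np

def momentum_gd(grad, w0, lr=0.1, beta=0.9, steps=300):
    """Momentum-based gradient descent: the velocity term accumulates
    an exponentially decaying sum of past gradients."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + lr * grad(w)  # update velocity
        w = w - v                    # step along the velocity
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2(w - 3); minimum at w = 3
w_star = momentum_gd(lambda w: 2 * (w - 3), w0=[0.0])
print(w_star)
```

Batch, mini-batch, and stochastic gradient descent differ only in how `grad` is computed (full dataset, a subset, or one sample); the update rule above stays the same.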
Intermediate_Python_DataCamp
DataCamp Intermediate Python course certification
ivy
The Unified Machine Learning Framework
pdsnd_github
GitHub project (Project 3) repository for PDSND
Project_Time-Series-Analysis-of-NAICS
This project is a Time Series Analysis of the NAICS dataset, which provides employment data; the project analyzes that data and creates visualizations and insights from it.
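Two common first steps in a time-series analysis like this are resampling and smoothing; a minimal sketch on a hypothetical monthly employment series (the values are synthetic, not NAICS data):

```python
import pandas as pd

# Hypothetical monthly employment series standing in for the NAICS data
idx = pd.date_range("2000-01-01", periods=24, freq="MS")
emp = pd.Series(range(100, 124), index=idx, name="employment")

# Resample the monthly values to yearly totals
yearly = emp.resample("YS").sum()

# Smooth short-term noise with a 3-month moving average
smooth = emp.rolling(window=3).mean()
print(yearly)
```

Plotting `emp` against `smooth` is a quick way to surface the trend that the project's visualizations aim to highlight.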
Titanic-Survivals-Data-Analysis-and-Modeling
The Titanic Problem. The objective of the Titanic problem is defined on the Kaggle website as follows: "The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew. While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others. In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender, socio-economic class, etc.)."
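A baseline for this kind of predictive model can be sketched as below; the tiny synthetic sample only imitates the kinds of features Kaggle provides (Sex, Pclass, Age) and is not the actual competition data:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Tiny synthetic sample imitating the Kaggle passenger features
data = pd.DataFrame({
    "Sex": ["female", "male", "female", "male", "female", "male"],
    "Pclass": [1, 3, 2, 3, 1, 2],
    "Age": [29, 22, 35, 40, 19, 54],
    "Survived": [1, 0, 1, 0, 1, 0],
})

# One-hot encode the categorical feature, then fit a simple baseline classifier
X = pd.get_dummies(data[["Sex", "Pclass", "Age"]], columns=["Sex"], drop_first=True)
y = data["Survived"]
model = LogisticRegression().fit(X, y)
print(model.score(X, y))  # accuracy on this toy training sample
```

On the real dataset the same pipeline applies, with more feature engineering (e.g. imputing missing ages) and a proper train/test split replacing the in-sample score shown here.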