Baligh Mnassri's repositories
multiple-linear-regression
Estimating the rent in Paris using multiple linear regression with dummy variables.
Titanic-logistic-regression-with-python
This kernel was inspired in part by SarahG's analysis, and I thank her very much for its quality. This work goes deeper by varying several parameters while using only a logistic regression estimator.
customer-churn-prediction-with-python
This repository applies machine learning classifiers to data from the Kaggle website. It presents 18 classifiers, which are compared using the GridSearchCV method. It also focuses on data visualization using interactive plots.
time-series-forecasting-models-for-wind-speed
Time series forecasting models for weather features
categorical-data-python
A simple demo repository showing how to handle categorical data in Python.
pyspark-examples
This tutorial presents some examples in order to give a quick overview of the Spark APIs.
installing-spark-standalone-and-hadoop-yarn-on-cluster
This repository describes all the steps required to install Spark in Standalone and Hadoop YARN modes on a multi-node cluster.
jupyter-putty-aws-ec2
Steps for Windows users to access Jupyter notebooks located on an Amazon EC2 instance from any browser, using the PuTTY console.
nltk-python-examples
In this tutorial we try to define what NLP is and what the benefits of learning it are.
bankmarketing-sparkml-databricks
This tutorial analyses a binary classification example built with Spark ML in Python and run on a Databricks cloud Community Edition cluster.
churn-prediction-with-pyspark
This tutorial was created by Baligh Mnassri and is inspired by one by Ben Sadeghi. It improves and extends that work and adapts it to run on the Databricks cloud. It presents six classifiers, which are compared in the cross-validation part, and explains how to compute the different evaluation metrics for the binary classification case.
mushrooms-classification-using-R
Investigating the mushroom dataset in order to build models able to identify poisonous and edible mushrooms.
covid-fr-dashboard
This application lets users visualize some KPIs for the evolution of the Covid-19 pandemic in France at the national and department levels.
installing-hadoop-cluster
This repository describes all the steps needed to install Hadoop on a single-node cluster as well as on a multi-node cluster of virtual machines running the Debian 9 operating system.
prediction-of-house-prices
With 80 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, USA, the aim is to predict the final price of each home.
text-summarizer-app
Building a text summarizer based on a machine learning model and embedding it in a web app with Flask.