Data Science Portfolio

Resume | Blog | LinkedIn

Applied statistical methods to identify significant differences in how respiration modulates the brain's cardiovascular pulse between controls and Alzheimer’s cases, helping neurological researchers better understand Alzheimer’s disease. The differences found were novel and statistically significant (P<0.01).
Preprocessed 0.25 TB of complex brain imaging data and extracted features from it using 3D multiresolution optical flow in Python.
Publication: Youssef Hosni, Ahmed Elabasy et al., Respiration modulates cardiovascular brain impulse pathology in Alzheimer’s disease. Submitted to Journal of Cerebral Blood Flow and Metabolism.
Publication: Ahmed Elabasy, Youssef Hosni et al., Optical Flow Analysis of Propagating Respiratory Brain Pulsations. Submitted to IEEE Transactions on Medical Imaging.

Automobile price prediction: Used Python to implement an end-to-end data science pipeline that predicts the price of used automobiles from the given features.

Sensor Activity Recognition: Classified the output of eight sensors into five activities and studied the effect of changing window sizes and axis combinations.
Alzheimer’s CV-BOLD Classification: Used Python to develop supervised machine learning models for classifying imbalanced Alzheimer’s CV-BOLD data, improving classification performance by 10%.
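
The window-size and axis-combination study mentioned for the sensor project can be illustrated with a minimal sketch. It assumes fixed-size sliding windows and a per-axis mean as the only feature; the function names, the sample readings, and the choice of feature are hypothetical and are not taken from the project itself.

```python
from typing import List, Sequence


def sliding_windows(samples: Sequence[Sequence[float]],
                    window_size: int, step: int) -> List[list]:
    """Split time-ordered sensor readings into fixed-size windows."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    windows = []
    # Stop once a full window no longer fits in the remaining samples.
    for start in range(0, len(samples) - window_size + 1, step):
        windows.append(list(samples[start:start + window_size]))
    return windows


def window_means(window: Sequence[Sequence[float]],
                 axes: Sequence[int]) -> List[float]:
    """Mean of each selected axis (column index) over one window."""
    return [sum(row[a] for row in window) / len(window) for a in axes]


# Hypothetical readings: each row is (x, y, z) from a single sensor.
readings = [
    (0.0, 1.0, 9.8),
    (0.2, 1.1, 9.7),
    (0.4, 0.9, 9.9),
    (0.6, 1.0, 9.8),
    (0.8, 1.2, 9.6),
]

# One window size and one axis combination (x and z only).
windows = sliding_windows(readings, window_size=4, step=1)
features = [window_means(w, axes=(0, 2)) for w in windows]
```

Varying `window_size` and `axes` and retraining the classifier on the resulting features is one way to compare configurations.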

Find the best location to open a new gym: Used Python to implement unsupervised techniques that help a business owner increase revenue by finding the best neighborhood in which to open a new gym.
Customer identification for mail order products: Used Python to implement unsupervised techniques that help the business owner increase revenue.
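
The descriptions above do not name the unsupervised technique used. As an illustration only, the sketch below assumes k-means clustering on two made-up, already-scaled features per neighborhood; the feature values, the starting centroids, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)),
            key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           iterations: int = 10):
    """Plain Lloyd's algorithm starting from the given centroids."""
    centroids = list(centroids)
    labels = assign(points, centroids)
    for _ in range(iterations):
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Hypothetical neighborhoods: (scaled gym density, scaled foot traffic).
neighborhoods = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),
]
labels, centers = kmeans(
    neighborhoods, centroids=[neighborhoods[0], neighborhoods[3]]
)
```

In practice a library implementation with several random restarts would be the usual choice; the hand-written loop is only meant to show the assign-and-update cycle.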

Melanoma Classification: Classifying malignant melanoma from skin lesion images using CNN-based classifiers.

Pose Estimation and Squat Counter: Used Python to develop a real-time pose estimation and squat counter using MoveNet Lightning.
Real-Time Sign Language Interpretation App: Developed a real-time sign language interpretation application using React.js and TensorFlow.js.
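
A squat counter built on pose keypoints can be sketched as a small state machine. The version below assumes the pose model already provides (x, y) coordinates for hip, knee, and ankle, and it counts a repetition when the knee angle drops below one threshold and then rises above another. The thresholds and class name are hypothetical, and the project's actual counting logic may differ.

```python
import math
from typing import Tuple

Keypoint = Tuple[float, float]


def joint_angle(a: Keypoint, b: Keypoint, c: Keypoint) -> float:
    """Angle at point b, in degrees, between segments b->a and b->c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    return 360.0 - angle if angle > 180.0 else angle


class SquatCounter:
    """Counts one squat per down-then-up cycle of the knee angle."""

    def __init__(self, down_angle: float = 100.0, up_angle: float = 160.0):
        # Two separate thresholds (assumed values) avoid double counting
        # when the angle jitters around a single cut-off.
        self.down_angle = down_angle
        self.up_angle = up_angle
        self.is_down = False
        self.count = 0

    def update(self, knee_angle: float) -> int:
        if not self.is_down and knee_angle < self.down_angle:
            self.is_down = True
        elif self.is_down and knee_angle > self.up_angle:
            self.is_down = False
            self.count += 1
        return self.count


# Hypothetical per-frame knee angles covering two full squats.
counter = SquatCounter()
for knee_angle in [170, 150, 95, 80, 120, 165, 170, 90, 165]:
    counter.update(knee_angle)
```

In a live pipeline, `joint_angle(hip, knee, ankle)` would be evaluated on each frame's keypoints and passed to `update`.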

Sentiment Analysis web app: Web application for classifying reviews, using a deep learning model implemented in PyTorch and deployed on Amazon SageMaker.
Plagiarism Detector web app: Created a plagiarism detector trained on LCS (longest common subsequence) and containment features and deployed on AWS SageMaker.
Data Science Resume Selector: Selects the resumes that are eligible for data scientist positions. The dataset contains 125 resumes, stored in the resumetext column, that were queried from Indeed.
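
The two plagiarism features named above can be sketched in a few lines. This version assumes whitespace tokenization, lowercasing, and normalization by the length of the answer text; those preprocessing choices and the function names are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def ngrams(words: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def containment(answer: str, source: str, n: int = 1) -> float:
    """Share of the answer's n-grams that also occur in the source."""
    answer_counts = Counter(ngrams(answer.lower().split(), n))
    source_counts = Counter(ngrams(source.lower().split(), n))
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((answer_counts & source_counts).values())
    return shared / total


def lcs_length(a_words: Sequence[str], s_words: Sequence[str]) -> int:
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(s_words) + 1)
    for a in a_words:
        curr = [0]
        for j, s in enumerate(s_words, start=1):
            if a == s:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(answer: str, source: str) -> float:
    """LCS length divided by the number of words in the answer."""
    a_words = answer.lower().split()
    s_words = source.lower().split()
    return lcs_length(a_words, s_words) / len(a_words) if a_words else 0.0


answer_text = "the quick brown fox"
source_text = "the quick red fox jumps"
unigram_containment = containment(answer_text, source_text, n=1)
lcs_score = normalized_lcs(answer_text, source_text)
```

Both scores fall between 0 and 1, so they can be fed to a classifier as features without further scaling.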

Power consumption prediction

Immigrants to Canada data visualization: Visualizing data on immigrants to Canada using different visualization libraries in Python.
Geospatial visualization of San Francisco Police Department Incidents: Visualizing the geospatial data of San Francisco Police Department incidents for the year 2016.

San Diego Rainforest Fire Prediction: Predicting the occurrence of rainforest fires in San Diego using weather data collected by the San Diego weather center.
Cluster Analysis of the San Diego Weather Data: Used PySpark to implement an unsupervised learning model that clusters the San Diego weather data to better understand the occurrence of rainforest fires.

Songs App User Activity Data Modeling: Modeled user activity data for a music streaming app called Sparkify to optimize queries about which songs users are listening to. Created a Postgres relational database and an ETL pipeline that builds fact and dimension tables and inserts data into them.
Songs App data modeling using Apache Cassandra: Created an Apache Cassandra database that supports queries on song play data to answer analysis questions.
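
The fact-and-dimension layout described for the Postgres project can be sketched as follows. The example uses Python's built-in sqlite3 module in place of Postgres so that it runs without a database server. The table names, column names, and sample events are hypothetical and are not taken from the project.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """One fact table (songplays) referencing two dimension tables."""
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY, first_name TEXT, level TEXT
        );
        CREATE TABLE songs (
            song_id TEXT PRIMARY KEY, title TEXT, artist_name TEXT
        );
        CREATE TABLE songplays (
            songplay_id INTEGER PRIMARY KEY AUTOINCREMENT,
            start_time TEXT NOT NULL,
            user_id INTEGER NOT NULL REFERENCES users(user_id),
            song_id TEXT REFERENCES songs(song_id)
        );
    """)


def load_events(conn: sqlite3.Connection, events: list) -> None:
    """Minimal ETL step: fill dimensions first, then the fact table."""
    for e in events:
        # INSERT OR IGNORE is SQLite syntax; Postgres would use
        # INSERT ... ON CONFLICT DO NOTHING instead.
        conn.execute("INSERT OR IGNORE INTO users VALUES (?, ?, ?)",
                     (e["user_id"], e["first_name"], e["level"]))
        conn.execute("INSERT OR IGNORE INTO songs VALUES (?, ?, ?)",
                     (e["song_id"], e["title"], e["artist"]))
        conn.execute(
            "INSERT INTO songplays (start_time, user_id, song_id) "
            "VALUES (?, ?, ?)",
            (e["ts"], e["user_id"], e["song_id"]))
    conn.commit()


def top_songs(conn: sqlite3.Connection) -> list:
    """Play count per song title, most played first."""
    return conn.execute("""
        SELECT s.title, COUNT(*) AS plays
        FROM songplays sp JOIN songs s ON s.song_id = sp.song_id
        GROUP BY s.title
        ORDER BY plays DESC, s.title
    """).fetchall()


# Hypothetical raw log events.
events = [
    {"ts": "2018-11-01T10:00:00", "user_id": 1, "first_name": "Ada",
     "level": "free", "song_id": "S1", "title": "Song A", "artist": "X"},
    {"ts": "2018-11-01T10:05:00", "user_id": 2, "first_name": "Bob",
     "level": "paid", "song_id": "S1", "title": "Song A", "artist": "X"},
    {"ts": "2018-11-01T10:09:00", "user_id": 1, "first_name": "Ada",
     "level": "free", "song_id": "S2", "title": "Song B", "artist": "Y"},
]

conn = sqlite3.connect(":memory:")
create_schema(conn)
load_events(conn, events)
plays = top_songs(conn)
```

Keeping descriptive attributes in the dimension tables means the fact table stores only keys and timestamps, which keeps analytical joins such as `top_songs` simple.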
