datasciencegeek1's repositories
Coronary-heart-disease-prediction-using-various-ML-algorithms
Analysed the performance of different machine learning algorithms on a coronary heart disease dataset acquired from Kaggle. Performed EDA, data cleansing, data pre-processing, feature correlation analysis, and feature selection. Implemented logistic regression with 10-fold cross-validation, logistic regression with GridSearchCV, Random Forest, RNN, and MLP.
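As a rough illustration of the 10-fold cross-validation mentioned above, the sketch below splits sample indices into ten folds using only the Python standard library. It is not taken from the repository: the function name `k_fold_indices`, the sample count of 103, and the seed are hypothetical choices made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; not the repository's code.
    """
    indices = list(range(n_samples))
    # Shuffle once so that each fold is a random subset of the data.
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first (n_samples % k)
    # folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Assumed dataset size, chosen only to show uneven fold sizes.
folds = list(k_fold_indices(n_samples=103, k=10))
for train_idx, test_idx in folds:
    # A model would be fitted on train_idx and scored on test_idx here.
    pass
```

Every sample lands in exactly one test fold, so averaging the ten per-fold scores uses each row for evaluation once.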
Azure-Machine-Learning-Pipeline-Big-Data-Cloud-Computing
An end-to-end machine learning pipeline on Microsoft Azure to predict flight delays using historical flight and weather data. The project integrated Azure Blob Storage, Data Factory, and Databricks for data ingestion, processing, and analysis, employed various ML algorithms for model training and evaluation, and visualised the results in Power BI.
ECMWF-Query-Prediction
Implemented an RNN with LSTM to predict queries for retrieving weather data from the ECMWF. The study investigated data pre-processing, hyperparameter tuning, model architecture, and training. The model achieved 91% accuracy. The work will significantly reduce the latency of fetching large volumes of data from tape systems, which are slow compared with disk drives.
Cancer-cell-analysis
This repository contains the workflow and data for a machine learning project aimed at developing a cancer diagnosis model in KNIME. We evaluate the performance of two popular machine learning models, Naive Bayes and Support Vector Machine (SVM), using precision and recall metrics.
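The repository itself is a KNIME workflow, so the following is only a plain-Python sketch of how the two metrics named above are defined. The function name `precision_recall` and the labels (1 for malignant, 0 for benign) are made up for illustration and are not data or results from the project.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class.

    Hypothetical helper for illustration; not part of the KNIME workflow.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: of everything predicted positive, how much was correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels: 1 = malignant, 0 = benign (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```

In a diagnosis setting, recall is often watched more closely, since a false negative corresponds to a missed malignant case.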
LA-Metro-Bike-Data
This work is a demonstration of EDA and BI in a Python Jupyter Notebook on the Los Angeles Metro Bike Share system's bicycle trip data. The project involved data cleaning, visualization, and statistical analysis to identify key trends, patterns, and actionable insights that could help improve bike share utilization and operational efficiency.