There are 3 repositories under the data-cleaning-and-preprocessing topic.
Welcome to my data science repository! Here you will find a collection of resources and examples for exploring, analyzing, and manipulating data using Python. The repository includes code templates, case studies, and exercises to help you learn and practice data science concepts and techniques. The topics covered include data exploration, data visu
Leveraging advanced data cleaning techniques and feature engineering, a robust food delivery prediction model was developed using regression algorithms.
A repository where I keep all of my data cleaning samples/portfolio items.
A project analyzing the Indian startup ecosystem between 2018 and 2021.
"Predicting a Greener Future 🌾📊 Delve into the world of agriculture and data science with our Yield Prediction project. We harness machine learning and weather data to forecast crop yields accurately. Join us in cultivating smarter farming practices for a sustainable tomorrow."
EDA and Prediction of F1 Race Winners
A Kaggle competition entry predicting the best-selling bike products from their features
Comprehensive object detection using YOLOv5, trained from scratch. Includes data preparation, YOLOv5 training on 20 labels, and testing on images/videos. Utilizes Google Colab's V100 GPU for robust detection.
In this project, a real-world dataset from Zomato, one of the most widely used food ordering platforms, was worked on.
This project involves analyzing real-world medical appointment data through Time Series Analysis. The tasks include dataset cleaning, comprehensive analysis, and extracting insights using Python and MySQL.
Designed and implemented machine learning and deep learning models to diagnose gearbox faults. Preprocessed sensor data, engineered features, and trained models using techniques like SVM, random forests, LSTM, and naive Bayes. Evaluated model performance and optimized hyperparameters to achieve high diagnostic accuracy.
This project delves into comprehensive insights extracted from the Stack Overflow Developer Survey 2022. The dataset provides a rich source of information about developers' demographics, coding experience, compensation, and various aspects of their professional lives.
Welcome to the FIFA Dataset Data Cleaning and Transformation project! This initiative focuses on refining and enhancing the FIFA dataset to ensure it is well-prepared for in-depth analysis. The project involves a comprehensive data cleaning process and transformation of key features to improve data quality and usability.
Explore my solo Customer Segmentation Project, diving into data analysis, clustering, and visualization. Uncover distinct customer segments for tailored marketing strategies and enhanced engagement. Discover the power of data-driven insights in this independent project.
This is my BrainStation capstone project on music genre classification. I use supervised and unsupervised learning to classify songs into genres based on specific attributes from two Spotify datasets.
This is a Capstone Project which provides an in-depth analysis of the Shakila DVD Rental Store, utilizing Excel, SQL, and Power BI to deliver actionable insights and dynamic dashboard visualization for enhancing business operations and customer experiences in the movie rental industry.
Explore NYC Green Taxi data, predicting fares and optimizing pickup locations using machine learning. Regression models uncover travel patterns and enhance taxi services for an efficient urban transport experience.
This repository is a compilation of my academic and personal projects accomplished using Python. The most common libraries used in these projects include NumPy, pandas, SciPy, and Matplotlib. Also contained in this repository are the certificates I gained by upskilling in pursuit of my passion and eagerness for Data Science.
Analyze order data to identify the most and least popular menu items and types of cuisine
Power Outage Data Analysis in the USA
Developed interactive visualizations and a Shiny dashboard using R from a complicated, incomplete time-series dataset.
Project aims to forecast potato prices in India using LSTM, KNN, and Random Forest Regression, integrating historical data on prices, regional stats, and rainfall patterns. Targeting agricultural stakeholders for informed decision-making.
This project analyzes the 2022 T20 World Cup data to determine the top 11 players based on their performance. I used ParseHub to collect data from the ESPNcricinfo website, then cleaned and transformed it with the Python libraries NumPy and pandas, and created dashboards in Power BI.
Data Analysis of potential factors affecting water pipe breakage
Data analytics projects completed at Trainity throughout my 8-week training
A Python library and its CLI for converting GrabCraft builds to schema (more specifically, Litematica schematic) files
The objective of this project is to analyze the customers of a bank, categorize them with K-Means and Hierarchical Clustering, and evaluate their distinct characteristics
This project analyzes HR data using Tableau to uncover insights that optimize Human Resource operations. By visualizing key metrics such as staffing, salary distribution, gender balance, and performance trends, the dashboard supports data-driven decisions in areas like employee retention, salary management, and workforce diversity.
The main objective of the Zometo Data Analysis project was to explore and analyze the Zometo dataset to gain meaningful insights into user behavior, product performance, and other relevant factors. By leveraging data analytics techniques, we aimed to uncover patterns, trends, and correlations within the data, ultimately providing valuable information
The coffee restaurant will test a new menu in Denver and Chicago using TV ads to see if it boosts profits by at least 18%, justifying the marketing costs, and needs an analysis to decide on a wider rollout.
Explore Shark Tank investments with SQL. Uncover insights, success rates, and industry preferences.
Data cleaning, analysis, and visualization of data from 4 different sectors
Restaurant analysis based on location.
E-commerce use case: This project conducts a comprehensive data cleaning exercise on the eCommerce data.