Harpreet Singh Dhoot's repositories


Developed a web app using Flask and built a data pipeline for ASCEND, processing data from 15,000+ users to provide insights that help immigrants transition into the Canadian job market. Conducted data wrangling and transformation, translated 10,000+ French responses into English, and created Power BI dashboards showcasing KPIs.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0


The task is to develop a solution that builds on the Crow's Nest concept, using technologies such as intelligent sensors, computer vision, and machine learning to monitor, detect, and report real-time insights into the condition of public spaces.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0


Participated in a data hackathon to understand the causes of wildfires in Canada and ways to prevent them.



Brand monitoring system using sentiment analysis with Python, MySQL, and the Twitter API.



Building a model on real-world data to recommend real policies to individuals for better political campaigns.



The comprehensive methodology, rooted in data integration, feature engineering, and advanced analytics, has yielded a powerful model capable of identifying deceptive patterns and potential fraud instances.

Language: Jupyter Notebook · Stargazers: 1 · Issues: 0


Using NLP to build a model that interprets customer reviews, based on Amazon reviews.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0
Language: Jupyter Notebook · Stargazers: 0 · Issues: 0


Practicing Spark for Big Data

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0


Tableau dashboard built on MySQL for pizza franchise sales (45k+ rows), covering different KPIs and the problem statement.



The dataset contains 119,390 observations for a City Hotel and a Resort Hotel. Each observation represents a hotel booking between the 1st of July 2015 and the 31st of August 2017, including bookings that effectively arrived and bookings that were canceled.

Language: Jupyter Notebook · Stargazers: 1 · Issues: 0


Human resources data analysis using SQL and Power BI.



Trying sqlite3 to connect SQL with Python, using pandas for analysis and seaborn for visualization.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0
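
As a rough illustration of the sqlite3 workflow this repository's description mentions, the sketch below loads a few rows into an in-memory SQLite database and aggregates them with SQL from Python. The `sales` table, its columns, and the `average_amount_by_region` helper are hypothetical and not taken from the repository. The pandas and seaborn steps are left out to keep the example dependency-free; the same query could instead be read into a DataFrame with `pandas.read_sql_query`.

```python
import sqlite3


def average_amount_by_region(rows):
    """Load (region, amount) rows into an in-memory table and average them per region."""
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing is written to disk
    try:
        # Hypothetical schema, used only for this illustration.
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        # Parameterized inserts avoid building SQL strings by hand.
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, AVG(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample rows, not data from the repository.
sample = [("east", 10.0), ("east", 20.0), ("west", 5.0)]
result = average_amount_by_region(sample)
print(result)
```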


In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process.

Language: Jupyter Notebook · Stargazers: 1 · Issues: 0


A real-time interactive web app based on data pipelines using streaming Twitter data, automated sentiment analysis, and MySQL and PostgreSQL databases (deployed on Heroku).

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Issues: 0


Car Service Management System



Data@ANZ is about mining and linking datasets to develop stories that matter and challenge the status quo, to deliver on ANZ’s purpose “to shape a world where people and communities thrive”. Our data people love to explore opportunities, innovate, be challenged and transform their ideas, and have created this experience to give you a taste of some of the challenging problems they love to tackle.



This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its normalized losses in use as compared to other cars. The second rating corresponds to the degree to which the auto is more risky than its price indicates. Cars are initially assigned a risk factor symbol associated with their price. Then, if a car is more risky (or less), this symbol is adjusted by moving it up (or down) the scale. Actuaries call this process "symboling". A value of +3 indicates that the auto is risky, -3 that it is probably pretty safe. The third factor is the relative average loss payment per insured vehicle year. This value is normalized for all autos within a particular size classification (two-door small, station wagons, sports/speciality, etc.), and represents the average loss per car per year. Note: Several of the attributes in the database could be used as a "class" attribute.

Language: Jupyter Notebook · Stargazers: 1 · Issues: 0


The market research team at AdRight is assigned the task of identifying the profile of the typical customer for each treadmill product offered by CardioGood Fitness. The market research team decides to investigate whether there are differences across the product lines with respect to customer characteristics. The team decides to collect data on individuals who purchased a treadmill at a CardioGood Fitness retail store during the prior three months. The data are stored in the CardioGoodFitness.csv file. The team identifies the following customer variables to study: product purchased, TM195, TM498, or TM798; gender; age, in years; education, in years; relationship status, single or partnered; annual household income ($); average number of times the customer plans to use the treadmill each week; average number of miles the customer expects to walk/run each week; and self-rated fitness on a 1-to-5 scale, where 1 is poor shape and 5 is excellent shape. Perform descriptive analytics to create a customer profile for each CardioGood Fitness treadmill product line.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0
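
The per-product customer profile described above amounts to grouping records by product and summarizing each numeric variable. A minimal sketch, assuming records shaped as dictionaries: the field names (`product`, `age`, `miles`), the `profile_by_product` helper, and the sample values are all hypothetical and are not rows from CardioGoodFitness.csv.

```python
from collections import defaultdict
from statistics import mean


def profile_by_product(rows, fields):
    """Return the mean of each numeric field, grouped by the 'product' key."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["product"]].append(row)
    return {
        product: {field: mean(r[field] for r in members) for field in fields}
        for product, members in grouped.items()
    }


# Made-up illustrative records; the real file has more columns and rows.
sample_rows = [
    {"product": "TM195", "age": 20, "miles": 60},
    {"product": "TM195", "age": 30, "miles": 80},
    {"product": "TM798", "age": 28, "miles": 160},
]
profiles = profile_by_product(sample_rows, ["age", "miles"])
print(profiles)
```

Means are only one possible summary; medians or counts per category (gender, relationship status) would slot into the same grouping step.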


This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage. The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on. Can you build a machine learning model to accurately predict whether or not the patients in the dataset have diabetes?

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0


This tutorial playlist covers data structures and algorithms in Python. Every tutorial covers the theory behind a data structure or an algorithm, Big O complexity analysis, and exercises that you can practice on.



Solutions to all InfyTQ assignments, exercises, and quizzes.



A collection of programs covering the most primitive and simple, yet some of the most crucial, elements of image processing using OpenCV. This was created while experimenting with and learning OpenCV, inspired by the "Digital Signal & Image Processing" subject in my final academic year.



Course files for the Complete Python 3 Bootcamp course on Udemy.



Data science project in Python.



For extensive instructor-led learning.



Internship as a web developer, building a website known as frood.
