10tanmay100

Tanmay Chakraborty's repositories

Automatic-Vehicle-System-ADAS-

Language: Python

Industry-Safety-Detection-using-Yolov8

Language: Python, License: MIT

Credit-Score_Classification-End2End---English

Problem Statement: You are working as a data scientist in a global finance company. Over the years, the company has collected basic bank details and gathered a lot of credit-related information. The management wants to build an intelligent system that sorts people into credit score brackets, reducing manual effort.

Language: Jupyter Notebook, License: MIT

SQL-QUERY---8-Weeks-SQL

License: Apache-2.0

MEDICAL-DATA-PROJECT-END2END-WITH-FEW-MLOPS

We are on a mission to transform medical data into actionable insights using the power of machine learning. Whether you are a data scientist, healthcare professional, or an enthusiast in the field, your contributions and ideas are invaluable to us. Join us in making a difference!

Language: Jupyter Notebook, License: MIT

sagemaker_yt_implementation

Language: Jupyter Notebook

10tanmay100

Config files for my GitHub profile.

Population-Youtube-Ml-Case-Study

Language: Jupyter Notebook

data-project

CustomEncoder_Library

The CustomEncoder is a Python class that provides a scikit-learn compatible implementation of a custom encoder for categorical variables. This encoder can be used within scikit-learn pipelines for preprocessing data, particularly when dealing with categorical features that require mapping to numerical values.
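The fit/transform convention described above can be sketched as follows. This is a minimal illustration, not the library's actual code: the class name `MappingEncoder`, the `unknown_value` parameter, and the single-column list input are all assumptions made for the example. It uses only the standard library; a real scikit-learn compatible encoder would typically subclass `BaseEstimator` and `TransformerMixin` and accept 2D input.

```python
class MappingEncoder:
    """Hypothetical categorical encoder following the fit/transform convention."""

    def __init__(self, unknown_value=-1):
        # Code returned for categories that were not seen during fit (assumption).
        self.unknown_value = unknown_value

    def fit(self, X, y=None):
        # Assign one integer code per distinct category, in order of first appearance.
        self.mapping_ = {}
        for value in X:
            if value not in self.mapping_:
                self.mapping_[value] = len(self.mapping_)
        return self

    def transform(self, X):
        # Replace each category with its learned code.
        return [self.mapping_.get(value, self.unknown_value) for value in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


encoder = MappingEncoder()
train_codes = encoder.fit_transform(["low", "high", "low", "medium"])
new_codes = encoder.transform(["medium", "unseen"])
```

Learning the mapping in `fit` and applying it in `transform` keeps the codes consistent between training data and new data, which is what makes such an encoder usable as a preprocessing step.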

License: MIT

yolo-in-sagemaker

Language: Python, License: Apache-2.0

data_for_target_Sales

used for data

8WEEKSQL

TANMAY-CHAKRABORTY-DA-ASSIGNMENT-RAPYDER

Language: Jupyter Notebook

mL-project

Language: Python, License: Apache-2.0

Target_Sales_Prediction

Target Store Sales Prediction: Objective & Deliverables

Content: You are provided with historical sales data for 45 stores located in different regions; each store contains a number of departments. The company also runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labor Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks.

Problem description: One challenge of modeling retail data is the need to make decisions based on limited history. Holidays and select major events come once a year, and so does the chance to see how strategic decisions impacted the bottom line. In addition, markdowns are known to affect sales; the challenge is to predict which departments will be affected and to what extent.

Recommended Project Steps & Guidelines:

1. Understand the data variables properly. Check the variable description to understand the data properly.
2. Clean the data: Fill the missing values (if any), treat the outliers (or odd values), etc. Ensure each variable's data matches the nature of the variable (e.g. a date field should contain only date values, from which year, month, and day of the week can be extracted, and a numeric column should be formatted as numeric).
3. Conduct EDA (Exploratory Data Analysis) on the cleaned data: Summarize and explore the data, then decide your strategy. Make note of any important assumptions that you make.
4. Uni-variate and Bi-variate Analysis: Check the distribution of the independent variables and compare them with the dependent variable.
5. Feature Engineering: Create new meaningful features from the existing features by applying aggregation functions to them.
6. Hypothesis Testing: Hypothesis testing in statistics is a way to test the results of a survey or experiment to see whether the results are meaningful. Give a brief summary of the data and a summary of the results of your statistical test. In the discussion, state whether your initial hypothesis was supported or refuted.
7. Identify the most important variables (or data parameters) that affect the final decision: Identify the impact of each variable on the final result graphically (correlation / scatter plots, regression plots, etc.). Keep the variables that affect the final outcome.
8. Develop and Validate Samples: Divide the samples into 2 parts: Development Sample (70%) and Validation Sample (30%). Build your analysis model using the Development Sample, validate it on the Validation Sample, and then predict on the test sample.
9. Model Building: Analyze the dependent variable, decide whether regression or classification is the appropriate technique, and build the model accordingly.
10. Improving model accuracy: Machine learning algorithms are driven by parameters, and these parameters strongly influence the outcome of the learning process. Find the optimum value for each parameter to improve the accuracy of the model, and repeat this process with a number of well-performing models.
11. Model Comparison: Compare each model with other similar models and choose the one that gives the highest accuracy. Higher-accuracy models do not always perform better on unseen data points, so to estimate the model's true accuracy, use a cross-validation technique before finalizing the model.
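Two points from the brief above lend themselves to a short sketch: holiday weeks counting five times as much in the evaluation, and the 70/30 development/validation split. The brief does not name the error measure, so the weighted mean absolute error below is an assumption, as are the function names `weighted_mae` and `split_70_30`, the random shuffle before splitting, and the toy numbers.

```python
import random


def weighted_mae(actual, predicted, is_holiday, holiday_weight=5.0):
    """Mean absolute error where holiday weeks count holiday_weight times."""
    weights = [holiday_weight if holiday else 1.0 for holiday in is_holiday]
    total = sum(w * abs(a - p) for w, a, p in zip(weights, actual, predicted))
    return total / sum(weights)


def split_70_30(rows, seed=0):
    """Shuffle rows reproducibly and split them 70% / 30%."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Toy weekly sales: the third week is a holiday week.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 250.0]
is_holiday = [False, False, True]
score = weighted_mae(actual, predicted, is_holiday)

development, validation = split_70_30(range(10))
```

In this toy case the holiday-week error of 50 dominates the score, which is the intended effect of the weighting: a model that misses holiday demand is penalized more heavily than one that misses an ordinary week.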

Language: Jupyter Notebook, License: MIT

shipment_pricing_prediction

The market for supply chain analytics is expected to grow at a CAGR of 17.3 percent from 2019 to 2024, more than doubling in size. This growth suggests that supply chain organizations increasingly recognize the advantages of being able to predict what will happen in the future with a reasonable degree of certainty. Supply chain leaders may use this data to address supply chain difficulties, cut costs, and enhance service levels all at the same time. The main goal is to predict the supply chain shipment pricing based on the available factors in the dataset.
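The "more than doubling" figure can be checked with a short compound-growth calculation. The sketch assumes five full annual compounding periods between 2019 and 2024; the variable names are illustrative.

```python
# Compound annual growth rate quoted in the description.
cagr = 0.173

# Assumption: growth compounds once per year over 2019 to 2024.
years = 2024 - 2019

# Total growth factor after compounding: (1 + rate) ** periods.
growth_factor = (1 + cagr) ** years

print(f"Market size multiplier after {years} years: {growth_factor:.2f}x")
```

Under that assumption the multiplier comes out at roughly 2.2, consistent with the claim.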

Language: HTML, License: MIT

forest-cover-classification-project

Problem Statement: Forest land is essential to ecosystem management. Any changes that occur in the ecosystem should be carefully monitored to avoid further loss. This model helps detect changes caused by heavy floods or other calamities that affect forest land. The goal is to predict seven different cover types in four different wilderness areas of the Roosevelt National Forest of Northern Colorado with the best possible accuracy.

The four wilderness areas are:
1: Rawah
2: Neota
3: Comanche Peak
4: Cache la Poudre

The seven categories, numbered from 1 to 7 in the Cover_Type column, to be classified are:
1: Spruce/Fir
2: Lodgepole Pine
3: Ponderosa Pine
4: Cottonwood/Willow
5: Aspen
6: Douglas-fir
7: Krummholz
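The numeric codes listed above can be kept in lookup tables so that model predictions are reported by name. This is a small sketch; the dictionary and function names are illustrative assumptions, and only the codes and labels come from the problem statement.

```python
# Wilderness area codes as listed in the problem statement.
WILDERNESS_AREAS = {
    1: "Rawah",
    2: "Neota",
    3: "Comanche Peak",
    4: "Cache la Poudre",
}

# Cover_Type codes as listed in the problem statement.
COVER_TYPES = {
    1: "Spruce/Fir",
    2: "Lodgepole Pine",
    3: "Ponderosa Pine",
    4: "Cottonwood/Willow",
    5: "Aspen",
    6: "Douglas-fir",
    7: "Krummholz",
}


def decode_predictions(codes):
    """Translate predicted Cover_Type codes into readable labels."""
    return [COVER_TYPES[code] for code in codes]


labels = decode_predictions([2, 7, 5])
```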

Language: HTML, License: MIT

sensor-fault-detection

The Air Pressure System (APS) is a critical component of a heavy-duty vehicle that uses compressed air to force a piston to provide pressure to the brake pads, slowing the vehicle down. The benefits of using an APS instead of a hydraulic system are the easy availability and long-term sustainability of natural air.

Language: Python, License: MIT

Tanmay Chakraborty's repositories

next_word_predictor

striver_dsa

Automatic-Vehicle-System-ADAS-

Industry-Safety-Detection-using-Yolov8

Insurance_Claim_Prediction

jenkins-cicd

Credit-Score_Classification-End2End---English

SQL-QUERY---8-Weeks-SQL

MEDICAL-DATA-PROJECT-END2END-WITH-FEW-MLOPS

sagemaker_yt_implementation

10tanmay100

Population-Youtube-Ml-Case-Study

data-project

CustomEncoder_Library

yolo-in-sagemaker

data_for_target_Sales

8WEEKSQL

TANMAY-CHAKRABORTY-DA-ASSIGNMENT-RAPYDER

mL-project

Target_Sales_Prediction

shipment_pricing_prediction

forest-cover-classification-project

sensor-fault-detection

Excel-Dashboard-Project

sign-detection-using-yolo

Gender_and_reaction-Classifier

Covid-19-Image-Classifier

DEEPCNNCLASSIFIER

Activity-Recognition-Project

dL_cnn_app_template