business-analytics decision-tree internship machine-learning regression-analysis sparks

The SPARKS Foundation DSBA Repository (September 2020 - October 2020)

This repo stores the code and documents of Sandip Dutta from his internship at The SPARKS Foundation. During the internship, the following tasks were performed:

Data Science Mentor (October - December, 2020)

Answered fellow interns' questions about their tasks
Received a Letter of Recommendation for exceptional work

Data Science Intern (September 2020)

Business Analytics (TASK - 4) - In this final task, we were to analyse a business dataset.
- Data was analysed using pandas.
- EDA was performed using Seaborn and Matplotlib.
- Applied hypothesis tests such as chi2_contingency and kendalltau from scipy.stats.
- Fitted a Ridge regression model (Ridge from sklearn.linear_model).
- The final score of the model came to about 0.995.
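
The steps of this task can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the DataFrame, its column names (segment, region, sales, profit) and the Ridge alpha are all assumptions made up for the example, since the business dataset itself is not described here.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency, kendalltau
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data; the real business dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "segment": rng.choice(["Consumer", "Corporate"], size=n),
    "region": rng.choice(["East", "West", "South"], size=n),
    "sales": rng.uniform(10, 1000, size=n),
})
df["profit"] = 0.2 * df["sales"] + rng.normal(0, 5, size=n)

# Chi-square test of independence between two categorical columns.
table = pd.crosstab(df["segment"], df["region"])
chi2, p_chi2, dof, expected = chi2_contingency(table)

# Kendall's tau rank correlation between two numeric columns.
tau, p_tau = kendalltau(df["sales"], df["profit"])

# Ridge regression; for a regressor, score() returns R^2, not classification accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    df[["sales"]], df["profit"], test_size=0.2, random_state=0
)
model = Ridge(alpha=1.0).fit(X_train, y_train)
r2 = model.score(X_test, y_test)

print(f"chi2 p-value: {p_chi2:.3f}, Kendall tau: {tau:.3f}, Ridge R^2: {r2:.3f}")
```

A small p-value from chi2_contingency suggests the two categorical columns are not independent, while kendalltau measures how consistently two numeric columns move together.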
Decision Tree (TASK - 3) - In this task we were to explore the Decision Tree algorithm using sklearn on the Iris dataset.
- Split the data into train and validation parts.
- Fitted a DecisionTreeClassifier on the dataset.
- For visualising it, we used matplotlib.
- Then we plotted decision surfaces for two features and checked the accuracy of the model.
- The decision tree gave a good f1-score (close to 1.00).
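
A minimal sketch of this workflow is shown below. The split ratio, random seeds and macro averaging for the f1-score are assumptions, not values taken from the notebook. The tree is printed with export_text instead of the matplotlib plot mentioned above, so that the example runs without a display; sklearn.tree.plot_tree is the matplotlib-based alternative.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Hold out 20% of the rows for validation (assumed ratio), keeping class balance.
X_train, X_val, y_train, y_val = train_test_split(
    iris.data, iris.target, test_size=0.2, stratify=iris.target, random_state=42
)

clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Macro-averaged f1 treats the three species equally.
val_f1 = f1_score(y_val, clf.predict(X_val), average="macro")

# Text rendering of the learned splits.
rules = export_text(clf, feature_names=iris.feature_names)

print(f"validation f1 (macro): {val_f1:.3f}")
print(rules)
```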
Iris_Unsupervised (TASK - 2) - This folder is for the Iris data analysis using the KMeans and DBSCAN algorithms.
- First, plots were generated and the features visualised.
- Then DBSCAN was applied, and it gave 3 as the optimum number of clusters.
- We then moved to KMeans (after scaling the data).
- We determined the ideal number of clusters using the elbow method, and it also came out to be 3.
- Lastly, we plotted a confusion matrix to compare the cluster assignments with the actual species.
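
The clustering steps could look something like the sketch below. The range of k, the DBSCAN eps and min_samples values, and the seeds are assumptions chosen for illustration; the number of clusters DBSCAN reports depends heavily on those two parameters, so this sketch will not necessarily reproduce the result above.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Scale features so that no single measurement dominates the distances.
X = StandardScaler().fit_transform(iris.data)

# Elbow method: record inertia (within-cluster sum of squares) for each k
# and look for the k after which the decrease flattens out.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 8)
}

# Fit KMeans with the chosen k and compare clusters against the species labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cm = confusion_matrix(iris.target, kmeans.labels_)

# DBSCAN with assumed parameters; the label -1 marks noise points.
db_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
n_db_clusters = len(set(db_labels) - {-1})

print(inertias)
print(cm)
print("DBSCAN clusters found:", n_db_clusters)
```

Cluster IDs are arbitrary, so the columns of the confusion matrix may be a permutation of the species order; a good clustering shows one dominant entry per row rather than a strong diagonal.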
Student_data (TASK - 1) - This folder contains data for some students.
- The task was to check whether the score increases as the number of study hours increases.
- We performed EDA and fitted a linear regression model to this data.
- The r2_score metric came to about 0.95 (95%).
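
As a rough sketch of the regression in this task, the example below fits LinearRegression on synthetic hours and scores. The data is generated here purely for illustration and is not the students' dataset; the slope, noise level and split ratio are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: score grows roughly linearly with hours studied.
rng = np.random.default_rng(1)
hours = rng.uniform(1, 10, size=100).reshape(-1, 1)
scores = 10 * hours.ravel() + 2 + rng.normal(0, 5, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    hours, scores, test_size=0.2, random_state=1
)

model = LinearRegression().fit(X_train, y_train)

# A positive slope means the predicted score rises with study hours.
slope = model.coef_[0]

# R^2 on held-out data: the share of score variance explained by hours.
r2 = r2_score(y_test, model.predict(X_test))

print(f"slope: {slope:.2f} points per hour, R^2: {r2:.3f}")
```

Note that r2_score is a goodness-of-fit measure for regression rather than an accuracy in the classification sense; a value of 0.95 means the model explains about 95% of the variance in the scores.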

About

This repository stores code related to the internship at The SPARKS Foundation.

Languages

Jupyter Notebook 100.0%