NikhilOO7 / 100_Days_MLDL

Hello Data Enthusiast! I will be updating my 100-day journey here, along with detailed code files, starting from essential libraries and moving on to advanced machine learning and deep learning algorithm theory with implementation. Save for later ⭐ Happy learning :)

100 Days Machine Learning and Deep Learning

How it Started? Day 0 - 18 Sept 2023

Over the past months, I've dived into the world of data science, mastering tools like Pandas, NumPy, Matplotlib, and Seaborn. Now, I'm ready to take my skills to the next level!

This 100-day journey will be all about understanding statistics, machine learning, and deep learning algorithms at their core, along with a lot of hands-on projects. I'm eager to delve deep into the theory behind these powerful algorithms, ensuring I grasp every concept intricately. But there's a twist!

Throughout this challenge, I'll be sharing my newfound insights with our amazing community. Each day, I'll revisit these topics and create articles to teach what I've learned. You can Follow me on Medium for Detailed Articles. My goal is simple: to enhance my own understanding while helping others on their data science journeys.

What Inspired Me?

One source of inspiration is definitely the book “Show Your Work!” by Austin Kleon, and I believe it can motivate you as well. Read more about it here.

Click Here to Find Detailed Articles.


Daily Progress of 100 Days MLDL

DAY 1 (19 Sept 2023):

Topic: Pandas Revision through Handwritten Notes

  1. Data Structures
  2. Data Loading and Data Inspection
  3. Data Selection and Indexing
  4. Data Cleaning
  5. Data Manipulation

Detailed Medium Article: Pandas Demystified: A Comprehensive Handbook for Data Enthusiasts

Detailed Source Code: Day 1 Commit

LinkedIn post: Day 1 Update

LeetCode Problems Solved:

  1. Combine Two Tables
  2. Second Highest Salary

DAY 2 (20 Sept 2023):

Topic: Advanced Pandas Topics Revision

  1. Data Aggregations
  2. Data Visualizations
  3. Time Series Data Handling
  4. Handling Categorical Data
  5. Advanced Topics

Detailed Medium Article: Advanced Pandas: A Comprehensive Handbook for Data Enthusiasts

Detailed Source Code: Day 2 Commit

LinkedIn post: Day 2 Update, Pandas Complete Guide Post

LeetCode Problems Solved:

  1. Rank Scores
  2. Nth Highest Salary
  3. Duplicate Emails

DAY 3 (21 Sept 2023):

Topic: NumPy Revision

  1. NumPy Array Basics
  2. Array Inspection
  3. Array Operations
  4. Working with NumPy Arrays
  5. NumPy for Data Cleaning
  6. NumPy for Statistical Analysis
  7. NumPy for Linear Algebra
  8. Advanced NumPy Techniques
  9. Performance Optimization with NumPy

Detailed Medium Article: Mastering NumPy: A Data Enthusiast’s Essential Companion

Detailed Source Code: Day 3 Commit

LinkedIn post: Day 3 Update

LeetCode Problems Solved:

  1. Median of Two Sorted Arrays
  2. Consecutive Numbers

DAY 4 (22 Sept 2023):

Topic: Matplotlib Fundamentals Revision

  1. Basic Plotting
  2. Plot Types
    • 2.1 Bar Chart
    • 2.2 Histograms
    • 2.3 Scatter plots
    • 2.4 Pie Charts
    • 2.5 Box Plot (Box and Whisker Plot)
    • 2.6 Heatmap, and Displaying Images
    • 2.7 Stack Plot

Detailed Medium Article: Mastering Maplotlib: A Comprehensive Guide to Data Visualization

Detailed Source Code: Day 4 Commit

LinkedIn post: Day 4 Update

LeetCode Problems Solved:

  1. Employees Earning More Than Their Managers

DAY 5 (23 Sept 2023):

Topic: Advanced Matplotlib Topics Revision

  1. Multiple Subplots
    • 1.1 Creating Multiple Plots in a Single Figure
    • 1.2 Combining Different Types of Plots
  2. Advanced Features
    • 2.1 Adding annotations and text
    • 2.2 Fill the Area Between Plots
    • 2.3 Plotting Time Series Data
    • 2.4 Creating 3D Plots
    • 2.5 Live Plot - Incorporating Animations and Interactivity.

Detailed Medium Article: Advanced Maplotlib: A Comprehensive Guide to Data Visualization

Detailed Source Code: Day 5 Commit

LinkedIn post: Day 5 Update

LeetCode Problem Solved:

  1. Customers Who Never Order

DAY 6 (24 Sept 2023):

Topic: Seaborn Fundamentals Revision

  1. Categorical Plots
    • 1.1 Count Plot
    • 1.2 Swarm Plot
    • 1.3 Point Plot
    • 1.4 Cat Plot
    • 1.5 Categorical Box Plot
    • 1.6 Categorical Violin Plot

Detailed Source Code: Day 6 Commit

LinkedIn post: Day 6 Update

LeetCode Problem Solved:

  1. Delete Duplicate Emails

DAY 7 (25 Sept 2023):

Topic: Seaborn Univariate and Bivariate Plots

  1. Univariate Plots
    • 1.1 KDE Plot
    • 1.2 Rug Plot
    • 1.3 Box Plot
    • 1.4 Violin Plot
    • 1.5 Strip Plot
  2. Bivariate Plots
    • 2.1 Regression Plot
    • 2.2 Joint Plot
    • 2.3 Hexbin Plot

Detailed Medium Article: Mastering Seaborn: Demystifying the Complex Plots!

Detailed Source Code: Day 7 Commit

LinkedIn post: Day 7 Update

LeetCode Problem Solved:

  1. Department Highest Salary

DAY 8 (26 Sept 2023):

Topic: Seaborn Multivariate and Matrix Plots

  1. Multivariate Plots
    • 1.1 Using Parameters
    • 1.2 Relational Plot
    • 1.3 Facet Grid
    • 1.4 Pair Plot
    • 1.5 Pair Grid
  2. Matrix Plots
    • 2.1 Heat Map
    • 2.2 Cluster Map

Detailed Medium Article: Advanced Seaborn: Demystifying the Complex Plots!

Detailed Source Code: Day 8 Commit

LinkedIn post: Day 8 Update

LeetCode Problem Solved:

  1. Rising Temperature

DAY 9 (27 Sept 2023):

Topic: Plotly Fundamentals

  1. Using plotly express to create basic plots
  2. Using graph objects module to customize plots

Detailed Source Code: Day 9 Commit

LinkedIn post: Day 9 Update

LeetCode Problem Solved:

  1. Game Play Analysis I

DAY 10 (28 Sept 2023):

Topic: Plotly Advanced plots

  1. Advanced Plots
    • Box plots
    • Violin Plots
    • Density Heatmaps
    • Scatter Matrix
    • 3D Plots
    • Animated Plots

Detailed Source Code: Day 10 Commit

LinkedIn post: Day 10 Update


DAY 11 (29 Sept 2023):

Topic: Data Cleaning on Loan Defaulter Dataset

  1. Data Inspection
  2. Handling Missing Values
  3. Data Imputation

Detailed Source Code: Day 11 Commit

LinkedIn post: Day 11 Update


DAY 12 (30 Sept 2023):

Topic: Data Visualization on Loan Defaulter Dataset

  1. Binning of data for better visualization
  2. Univariate analysis
  3. Bivariate analysis

Detailed Source Code: Day 12 Commit

LinkedIn post: Day 12 Update


DAY 13 (1 Oct 2023):

Topic: Exploratory Data Analysis and Insights on Loan Defaulter Dataset

  1. Finding insights from the visualizations

Detailed Source Code: Day 13 Commit

LinkedIn post: Day 13 Update


DAY 14 (2 Oct 2023):

Topic: Descriptive Statistics

  1. Mean, Median, Mode: These are measures of central tendency.
  2. Variance and Standard Deviation: These quantify data spread or dispersion.
  3. Skewness and Kurtosis: These describe the shape of data distributions.
  4. Quantiles and Percentiles: These help analyze data distribution.
  5. Box Plots for Descriptive Stats: Box plots provide a visual summary of the dataset.
  6. Interquartile Range (IQR): The IQR is the range covered by the middle 50% of the data.
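
The measures of central tendency, spread, and quartiles listed above can be sketched with Python's standard `statistics` module. This is only a minimal illustration, not the code from the Day 14 notebook: the `describe` helper and the `sample` values are hypothetical, and the quartiles use the "inclusive" method, which is one of several common conventions and may differ slightly from what other libraries report by default.

```python
import statistics

def describe(data):
    """Return a few common descriptive statistics for a list of numbers."""
    # Quartiles computed with the "inclusive" convention (an assumption for this sketch).
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "variance": statistics.variance(data),  # sample variance (divides by n - 1)
        "std_dev": statistics.stdev(data),      # sample standard deviation
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,                         # spread of the middle 50% of the data
    }

sample = [2, 4, 4, 5, 7, 9, 11]  # hypothetical data for illustration
print(describe(sample))
```

The IQR is the same quantity a box plot draws as the height of its box, so the `q1` and `q3` values here correspond to the box edges.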

Detailed Source Code: Day 14 Commit

LinkedIn post: Day 14 Update


DAY 15 (3 Oct 2023):

Topic: Probability for Data Science

  1. Probability Basics: Understand the fundamental concepts like events, outcomes, and sample spaces.
  2. Probability Formulas: Master key formulas:
    • Probability of an Event (P(A)): Number of favorable outcomes / Total number of outcomes, assuming all outcomes are equally likely.
    • Conditional Probability (P(A|B)): Probability of A given that B has occurred.
    • Bayes' Theorem: A powerful tool for updating probabilities based on new evidence.
    • Law of Large Numbers: As you increase the sample size, the sample mean converges to the population mean. Crucial for statistical inference.
  3. Probability Distributions: Get acquainted with probability distributions:
    • Normal Distribution: The bell curve is everywhere in data science. It's essential for hypothesis testing and confidence intervals.
    • Bernoulli Distribution: For binary outcomes (like success or failure).
    • Binomial Distribution: When dealing with a fixed number of independent Bernoulli trials.
    • Poisson Distribution: Models the number of events in a fixed interval, like customer arrivals at a store in an hour. Often used for rare events.
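
A few of the formulas and distributions above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the Day 15 notebook code: all function names are hypothetical, the Bayes example assumes a binary test with a known false positive rate, and the Law of Large Numbers is shown with a simulated fair coin and a fixed seed.

```python
import math
import random

def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A|B) via Bayes' theorem, where B is a positive result of a binary test."""
    # Total probability of observing B: P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

def binomial_pmf(k, n, p):
    """P(X = k) for n independent Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def coin_flip_sample_mean(n, seed=0):
    """Fraction of heads in n simulated fair coin flips (Law of Large Numbers demo)."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.5 for _ in range(n)) / n

# Hypothetical numbers: 1% prior, 95% true positive rate, 5% false positive rate.
print(bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05))
print(binomial_pmf(k=3, n=10, p=0.5))
print(poisson_pmf(k=2, lam=3.0))
# The sample mean should drift toward 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, coin_flip_sample_mean(n))
```

The Bayes example shows why the theorem matters in practice: even with an accurate test, a low prior keeps the posterior probability well below the test's accuracy.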

Detailed Source Code: Day 15 Commit

LinkedIn post: Day 15 Update


DAY 16 (4 Oct 2023):

Topic: Inferential Statistics

  1. Central Limit Theorem
  2. Hypothesis Testing
  3. Deriving p-values
  4. Z-Test
  5. T-Test

Detailed Source Code: Day 16 Commit

LinkedIn post: Day 16 Update


DAY 17 (5 Oct 2023):

Topic: Inferential Statistics

  1. Chi-Square Test
  2. F-Test/ANOVA
  3. Covariance
  4. Pearson Correlation
  5. Spearman Rank Correlation

Detailed Source Code: Day 17 Commit

LinkedIn post: Day 17 Update


DAY 18 (6 Oct 2023):

Topic: Introduction to Machine Learning

  1. What is Machine Learning?
  2. Types of Machine Learning
  3. Supervised Machine Learning
  4. Unsupervised Machine Learning
  5. Reinforcement Learning
  6. Semi-supervised Learning

Detailed Source Code: Day 18 Commit

LinkedIn post: Day 18 Update


DAY 19 (7 Oct 2023):

Topic: Steps in Machine Learning Project

  1. Data Collection
  2. Data Cleaning
  3. Exploratory Data Analysis
  4. Data Preprocessing
  5. Data Splitting
  6. Train the model
  7. Evaluation of a Model
  8. Deploy and Retrain

Detailed Source Code: Day 19 Commit

LinkedIn post: Day 19 Update


DAY 20 (8 Oct 2023):

Topic: Exploring Scikit-Learn

  1. sklearn.datasets
  2. sklearn.preprocessing
  3. sklearn.model_selection
  4. sklearn.feature_selection
  5. sklearn.linear_model, and many more...

Detailed Source Code: Day 20 Commit

LinkedIn post: Day 20 Update


DAY 21 (9 Oct 2023):

Topic: Advanced Scikit-Learn Features

  1. sklearn.metrics
  2. sklearn.compose
  3. sklearn.pipeline

Detailed Source Code: Day 21 Commit

LinkedIn post: Day 21 Update


DAY 22 (10 Oct 2023):

Topic: Feature Engineering 1 - Handling Missing Values

1. Handling Missing Values

  • 1.1 Problems of Having Missing Values
  • 1.2 Understanding Types of Missing Values
  • 1.3 Dealing with Missing Values Using SimpleImputer
  • 1.4 Dealing with Missing Values Using KNNImputer

2. Handling Categorical Values

  • 2.1 One Hot Encoding
  • 2.2 Label Encoding
  • 2.3 Ordinal Encoding
  • 2.4 Multi Label Binarizer
  • 2.5 Count/Frequency Encoding
  • 2.6 Target Guided Ordinal Encoding

Detailed Source Code: Day 22 Commit

LinkedIn post: Day 22 Update


DAY 23 (11 Oct 2023):

Topic: Feature Engineering 2 - Feature Scaling

  1. Feature Scaling
    • 1.1 Standardization/Standard Scaler
    • 1.2 Normalization/MinMax Scaler
    • 1.3 Max Abs Scaler
    • 1.4 Robust Scaler

Detailed Source Code: Day 23 Commit

LinkedIn post: Day 23 Update


DAY 24 (12 Oct 2023):

Topic: Feature Engineering 3 - Feature Selection

  1. Why Feature Selection Matters

  2. Types of Feature Selection

  3. Filter Methods

    • Variance Threshold
    • SelectKBest
    • SelectPercentile
    • GenericUnivariateSelect
  4. Wrapper Methods

    • RFE
    • RFECV
    • SelectFromModel
    • SequentialFeatureSelector

Detailed Source Code: Day 24 Commit

LinkedIn post: Day 24 Update


DAY 25 (13 Oct 2023):

Topic: Feature Engineering 4 - Feature Transformation and Pipelines

  1. Feature Transformation

    • Understanding Q-Q Plot and P-P Plot
    • Logarithmic transformation
    • Reciprocal transformation
    • Square root transformation
    • Exponential transformation
    • Box-Cox transformation
  2. Using Pipelines to automate the FE

    • What are Pipelines
    • Accessing individual steps in pipeline
    • Accessing Parameters in Pipeline
    • Performing Grid Search with Pipeline
    • Combining Transformers and Pipeline
    • Visualizing the Pipeline

Detailed Source Code: Day 25 Commit

LinkedIn post: Day 25 Update


DAY 26 (14 Oct 2023):

Topic: Understanding Linear Regression

  1. Fundamentals of Linear Regression
  2. Exploring the Assumptions of Linear Regression
  3. Gradient Descent and Loss Function
  4. Evaluation Metrics for Linear Regression
  5. Applications of Linear Regression

Detailed Notes: Day 26 Commit

LinkedIn post: Day 26 Update


DAY 27 (15 Oct 2023):

Topic: Understanding Multicollinearity, and Regularization Techniques

  1. Multiple Linear Regression
  2. Multicollinearity
  3. Regularization Techniques
  4. Ridge, Lasso and Elastic Net
  5. Polynomial Regression

Detailed Notes: Day 27 Commit

LinkedIn post: Day 27 Update


DAY 28 (16 Oct 2023):

Topic: Understanding the Logistic Regression

  1. How does Logistic Regression work
  2. What is a sigmoid curve
  3. Assumptions of Logistic Regression
  4. Cost Function of Logistic Regression

Detailed Notes: Day 28 Commit

LinkedIn post: Day 28 Update


DAY 29 (17 Oct 2023):

Topic: Understanding Decision Trees

  1. Why do we need Decision Trees
  2. How do Decision Trees work
  3. How do we select a root node
  4. Understanding Entropy, Information Gain
  5. Solving an Example on Entropy
  6. Understanding Gini Impurity
  7. Solving an Example on Gini Impurity
  8. Decision Trees for Regression
  9. Why Decision Trees are a Greedy Approach
  10. Understanding Pruning

Detailed Notes: Day 29 Commit

LinkedIn post: Day 29 Update


DAY 30 (18 Oct 2023):

Topic: Understanding Ensemble Techniques

  1. What are Ensemble Techniques
  2. Understanding Bagging
  3. Understanding Boosting
  4. Understanding Stacking

Detailed Notes: Day 30 Commit

LinkedIn post: Day 30 Update


DAY 31 (19 Oct 2023):

Topic: Understanding Random Forests

  1. Decision Trees Aggregation
  2. Bagging and Variance Reduction
  3. Feature Subspace Sampling
  4. Handling Overfitting
  5. Out of bag error

Detailed Notes: Day 31 Commit

LinkedIn post: Day 31 Update


DAY 32 (20 Oct 2023):

Topic: Understanding Boosting Algorithms

  1. Concept of Boosting
  2. Understanding AdaBoost
  3. Solving an Example on AdaBoost
  4. Understanding Gradient Boosting
  5. Solving an Example on Gradient Boosting
  6. AdaBoost vs Gradient Boosting

Detailed Notes: Day 32 Commit

LinkedIn post: Day 32 Update


DAY 33 (21 Oct 2023):

Topic: Understanding the XGBoost Algorithm

  1. Concept of XGBoost Algorithm
  2. Boosting Mechanism
  3. Feature Importance Interpretation
  4. Regularization Techniques
  5. Flexibility and Scalability

Detailed Notes: Day 33 Commit

LinkedIn post: Day 33 Update


DAY 34 (22 Oct 2023):

Topic: Understanding K Nearest Neighbours

  1. How does K-Nearest Neighbours work
  2. How is Distance Calculated
    • Euclidean Distance
    • Hamming Distance
    • Manhattan Distance
  3. Why is KNN a Lazy Learner
  4. Effects of Choosing the value of K
  5. Different ways to perform KNN
  6. Understanding KD-Tree
  7. Solving an Example of KD Tree
  8. Understanding Ball Tree

Detailed Notes: Day 34 Commit

LinkedIn post: Day 34 Update


DAY 35 (23 Oct 2023):

Topic: Understanding Support Vector Machines

  1. Understanding Concept of SVC
  2. What are Support Vectors
  3. What is Margin
  4. Hard Margin and Soft Margin
  5. Kernelized SVC
  6. Types of Kernels
  7. Understanding SVR

Detailed Notes: Day 35 Commit

LinkedIn post: Day 35 Update


DAY 36 (24 Oct 2023):

Topic: Understanding Naive Bayes Classifiers

  1. Why do we need Naive Bayes
  2. Concept of how it works
  3. Mathematical Intuition of Naive Bayes
  4. Solving an Example on Naive Bayes
  5. Other Bayes Classifiers
    • Gaussian Naive Bayes Classifier
    • Multinomial Naive Bayes Classifier
    • Bernoulli Naive Bayes Classifier

Detailed Notes: Day 36 Commit

LinkedIn post: Day 36 Update


DAY 37 (25 Oct 2023):

Topic: Understanding Clustering Techniques

  1. How clustering is different from classification
  2. Applications of Clustering
  3. What are density based methods
  4. What are Hierarchical based methods
  5. What are partitioning methods
  6. What are Grid Based methods
  7. Main Requirements for Clustering Algorithms

Detailed Notes: Day 37 Commit

LinkedIn post: Day 37 Update


DAY 38 (26 Oct 2023):

Topic: Understanding K-Means Clustering

  1. Concept of K-Means Clustering
  2. Math Intuition Behind K-Means
  3. Cluster Building Process
  4. Edge Case Scenarios of K-Means
  5. Challenges and Improvements in K-Means

Detailed Notes: Day 38 Commit

LinkedIn post: Day 38 Update


DAY 39 (27 Oct 2023):

Topic: Understanding Hierarchical Clustering

  1. Concept of Hierarchical Clustering
  2. Understanding Algorithm
  3. Understanding Linkage Methods

Detailed Notes: Day 39 Commit

LinkedIn post: Day 39 Update


DAY 40 (28 Oct 2023):

Topic: Understanding DBSCAN Clustering

  1. Concept of DBSCAN
  2. Key terms in understanding DBSCAN
  3. Algorithm of DBSCAN

Detailed Notes: Day 40 Commit

LinkedIn post: Day 40 Update


DAY 41 (29 Oct 2023):

Topic: Evaluation of Clustering Models

  1. Understanding External Measures
    • Rand Index
    • Jaccard Coefficient
  2. Understanding Internal Measures
    • Cohesion
    • Separation

Detailed Notes: Day 41 Commit

LinkedIn post: Day 41 Update


DAY 42 (30 Oct 2023):

Topic: Understanding Curse of Dimensionality

  1. Computational Complexity
  2. Data Visualization Challenges

Detailed Notes: Day 42 Commit

LinkedIn post: Day 42 Update


DAY 43 (31 Oct 2023):

Topic: Understanding Principal Component Analysis

  1. Idea Behind PCA
  2. What are Principal Components
  3. Eigen Decomposition Approach
  4. Singular Value Decomposition Approach
  5. Why do we maximize Variance
  6. What is Explained Variance Ratio
  7. How to select the optimal number of Principal Components
  8. Understanding Scree plot
  9. Issues with PCA
  10. Understanding Kernel PCA

Detailed Notes: Day 43 Commit

LinkedIn post: Day 43 Update

