Over the past months, I've dived into the world of data science, mastering tools like Pandas, NumPy, Matplotlib, and Seaborn. Now, I'm ready to take my skills to the next level!
This 100-day journey will be all about understanding statistics, machine learning, and deep learning algorithms at their core, along with a lot of hands-on projects. I'm eager to dig into the theory behind these algorithms and make sure I understand each concept thoroughly. But there's a twist!
Throughout this challenge, I'll be sharing my newfound insights with our amazing community. Each day, I'll revisit these topics and create articles to teach what I've learned. You can follow me on Medium for detailed articles. My goal is simple: to enhance my own understanding while helping others on their data science journeys.
One of the things that inspired this challenge is definitely the “Show Your Work” book by Austin Kleon, and I believe it can motivate you as well. Read more about it here.
Click Here to Find Detailed Articles.
- Data Structures
- Data Loading and Data Inspection
- Data Selection and Indexing
- Data Cleaning
- Data Manipulation
Detailed Medium Article: Pandas Demystified: A Comprehensive Handbook for Data Enthusiasts
Detailed Source Code: Day 1 Commit
LinkedIn post: Day 1 Update
LeetCode Problems Solved:
- Data Aggregations
- Data Visualizations
- Time Series Data Handling
- Handling Categorical Data
- Advanced Topics
Detailed Medium Article: Advanced Pandas: A Comprehensive Handbook for Data Enthusiasts
Detailed Source Code: Day 2 Commit
LinkedIn post: Day 2 Update, Pandas Complete Guide Post
LeetCode Problems Solved:
- Numpy Array Basics
- Array Inspection
- Array Operations
- Working with Numpy Arrays
- NumPy for Data Cleaning
- NumPy for Statistical Analysis
- NumPy for Linear Algebra
- Advanced NumPy Techniques
- Performance Optimization with NumPy
Detailed Medium Article: Mastering NumPy: A Data Enthusiast’s Essential Companion
Detailed Source Code: Day 3 Commit
LinkedIn post: Day 3 Update
LeetCode Problems Solved:
- Basic Plotting
- Plot Types
- 2.1 Bar Chart
- 2.2 Histograms
- 2.3 Scatter plots
- 2.4 Pie Charts
- 2.5 Box Plot (Box and Whisker Plot)
- 2.6 Heatmap, and Displaying Images
- 2.7 Stack Plot
Detailed Medium Article: Mastering Matplotlib: A Comprehensive Guide to Data Visualization
Detailed Source Code: Day 4 Commit
LinkedIn post: Day 4 Update
LeetCode Problems Solved:
- Multiple Subplots
- 1.1 Creating Multiple Plots in a Single Figure
- 1.2 Combining Different Types of Plots
- Advanced Features
- 2.1 Adding annotations and text
- 2.2 Fill the Area Between Plots
- 2.3 Plotting Time Series Data
- 2.4 Creating 3D Plots
- 2.5 Live Plot - Incorporating Animations and Interactivity.
Detailed Medium Article: Advanced Matplotlib: A Comprehensive Guide to Data Visualization
Detailed Source Code: Day 5 Commit
LinkedIn post: Day 5 Update
LeetCode Problem Solved:
- Categorical Plots
- 1.1 Count Plot
- 1.2 Swarm Plot
- 1.3 Point Plot
- 1.4 Cat Plot
- 1.5 Categorical Box Plot
- 1.6 Categorical Violin Plot
Detailed Source Code: Day 6 Commit
LinkedIn post: Day 6 Update
LeetCode Problem Solved:
- Univariate Plots
- 1.1 KDE Plot
- 1.2 Rug Plot
- 1.3 Box Plot
- 1.4 Violin Plot
- 1.5 Strip Plot
- Bivariate Plots
- 2.1 Regression Plot
- 2.2 Joint Plot
- 2.3 Hexbin Plot
Detailed Medium Article: Mastering Seaborn: Demystifying the Complex Plots!
Detailed Source Code: Day 7 Commit
LinkedIn post: Day 7 Update
LeetCode Problem Solved:
- Multivariate Plots
- 1.1 Using Parameters
- 1.2 Relational Plot
- 1.3 Facet Grid
- 1.4 Pair Plot
- 1.5 Pair Grid
- Matrix Plots
- 2.1 Heat Map
- 2.2 Cluster Map
Detailed Medium Article: Advanced Seaborn: Demystifying the Complex Plots!
Detailed Source Code: Day 8 Commit
LinkedIn post: Day 8 Update
LeetCode Problem Solved:
- Using plotly express to create basic plots
- Using graph objects module to customize plots
Detailed Source Code: Day 9 Commit
LinkedIn post: Day 9 Update
LeetCode Problem Solved:
- Advanced Plots
- Box plots
- Violin Plots
- Density Heatmaps
- Scatter Matrix
- 3D Plots
- Animated Plots
Detailed Medium Article:
Detailed Source Code: Day 10 Commit
LinkedIn post: Day 10 Update
- Data Inspection
- Handling Missing Values
- Data Imputation
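The missing-value and imputation steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code from the Day 11 commit; the `ages` column and its values are hypothetical, and median imputation is just one common choice.

```python
from statistics import median

# Hypothetical column with missing entries marked as None.
ages = [22, None, 35, 41, None, 29]

# Inspection: count how many values are missing.
missing_count = sum(value is None for value in ages)

# Imputation: fill each gap with the median of the observed values.
observed = [value for value in ages if value is not None]
fill_value = median(observed)
imputed = [fill_value if value is None else value for value in ages]

print(missing_count, imputed)
```

With pandas, the same idea is usually expressed with `isna()` to count missing entries and `fillna()` to impute them.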
Detailed Source Code: Day 11 Commit
LinkedIn post: Day 11 Update
- Binning of data for better visualization
- Univariate analysis
- Bivariate analysis
Detailed Source Code: Day 12 Commit
LinkedIn post: Day 12 Update
- Finding insights from the visualizations
Detailed Source Code: Day 13 Commit
LinkedIn post: Day 13 Update
- Mean, Median, Mode: These are measures of central tendency.
- Variance and Standard Deviation: These quantify data spread or dispersion.
- Skewness and Kurtosis: These describe the shape of data distributions.
- Quantiles and Percentiles: These help analyze data distribution.
- Box Plots for Descriptive Stats: Box plots provide a visual summary of the dataset.
- Interquartile Range (IQR): The IQR is the range covered by the middle 50% of the data, i.e. the third quartile minus the first quartile (Q3 - Q1).
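The measures listed above can be computed with Python's built-in `statistics` module. The following is a small sketch on a made-up sample (the `data` values are an assumption for illustration); skewness and kurtosis are computed from the moment definitions, which is one of several conventions in use.

```python
import statistics

# Hypothetical sample; the values are illustrative only.
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Spread: sample variance and standard deviation (n - 1 denominator).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Quartiles and the interquartile range (Q3 - Q1).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Shape: moment-based skewness and excess kurtosis (population moments).
n = len(data)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3

print(mean, median, mode, variance, std_dev, iqr, skewness, excess_kurtosis)
```

A box plot draws exactly these quartiles: the box spans Q1 to Q3, and the line inside it marks the median.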
Detailed Source Code: Day 14 Commit
LinkedIn post: Day 14 Update
- Probability Basics: Understand the fundamental concepts like events, outcomes, and sample spaces.
- Probability Formulas: Master key formulas:
- Probability of an Event (P(A)): Number of favorable outcomes / Total number of outcomes.
- Conditional Probability (P(A|B)): Probability of A given that B has occurred.
- Bayes' Theorem: A powerful tool for updating probabilities based on new evidence.
- Law of Large Numbers: As you increase the sample size, the sample mean converges to the population mean. Crucial for statistical inference.
- Probability Distributions: Get acquainted with probability distributions:
- Normal Distribution: The bell curve is everywhere in data science. It's essential for hypothesis testing and confidence intervals.
- Bernoulli Distribution: For binary outcomes (like success or failure).
- Binomial Distribution: When dealing with a fixed number of independent Bernoulli trials.
- Poisson Distribution: Models the number of events in a fixed interval of time or space, like customer arrivals at a store per hour. Often used for rare events.
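To make the formulas above concrete, here is a short sketch using only the standard library. The function names and the numbers in the example (a 1% prior, 95% true positive rate, 5% false positive rate) are assumptions chosen for illustration.

```python
import math
import random


def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    # P(B) from the law of total probability.
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence


def binomial_pmf(k, n, p):
    """Probability of k successes in n independent Bernoulli(p) trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of k events when the average rate per interval is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Hypothetical test: 1% prior, 95% true positive rate, 5% false positive rate.
posterior = bayes_posterior(0.01, 0.95, 0.05)

# A Bernoulli trial is the binomial special case n = 1.
bernoulli_success = binomial_pmf(1, 1, 0.3)

# Law of Large Numbers: the mean of many fair die rolls approaches 3.5.
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(posterior, bernoulli_success, sample_mean)
```

Note how small the posterior stays even with an accurate test: when the prior is low, most positive results still come from the much larger group of negatives.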
Detailed Source Code: Day 15 Commit
LinkedIn post: Day 15 Update
- Central Limit Theorem
- Hypothesis Testing
- Deriving p-values
- Z-Test
- T-Test
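As a rough illustration of how a test statistic turns into a p-value, here is a sketch of a two-sided one-sample Z-test with the standard library. The sample, the hypothesized mean of 50, and the known population standard deviation of 6 are all made-up values, and `one_sample_z_test` is a helper written for this example.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_std):
    """Two-sided Z-test; assumes the population standard deviation is known."""
    n = len(sample)
    standard_error = population_std / math.sqrt(n)
    z = (mean(sample) - population_mean) / standard_error
    # p-value: probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical measurements; H0 says the true mean is 50.
sample = [52, 55, 49, 58, 54, 51, 56, 53, 57]
z, p = one_sample_z_test(sample, population_mean=50, population_std=6)

print(f"z = {z:.3f}, p = {p:.4f}")
```

A T-test follows the same pattern, but it estimates the standard deviation from the sample and reads the p-value from Student's t distribution instead of the normal distribution (for example via `scipy.stats.ttest_1samp`).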
Detailed Source Code: Day 16 Commit
LinkedIn post: Day 16 Update
- Chi-Square Test
- F-Test/ANOVA
- Covariance
- Pearson Correlation
- Spearman Rank Correlation
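The three association measures at the end of this list can be written from their definitions in plain Python. This is a sketch for intuition only (the Chi-Square and F-tests are not covered here); the helper names and the `x`, `y` data are assumptions for the example.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    mean_x, mean_y = mean(xs), mean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(covariance(x, y), pearson(x, y), spearman(x, y))
```

On this data Spearman reports a perfect monotonic relationship, while Pearson is lower because the relationship is not a straight line.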
Detailed Source Code: Day 17 Commit
LinkedIn post: Day 17 Update
- What is Machine Learning?
- Types of Machine Learning
- Supervised Machine Learning
- Unsupervised Machine Learning
- Reinforcement Learning
- Semi-supervised Learning
Detailed Source Code: Day 18 Commit
LinkedIn post: Day 18 Update
- Data Collection
- Data Cleaning
- Exploratory Data Analysis
- Data Preprocessing
- Data Splitting
- Train the model
- Evaluation of a Model
- Deploy and Retrain
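The workflow above can be walked through end to end on a toy problem. The sketch below uses synthetic data and a hand-written least-squares line so it runs without extra libraries; the data-generating rule `y = 3x + 2 + noise`, the 80/20 split, and the `fit_line` helper are all assumptions made for illustration, and deployment is left out.

```python
import random
from statistics import mean

rng = random.Random(42)

# Data collection (synthetic here): y = 3x + 2 plus Gaussian noise.
rows = []
for _ in range(200):
    x = rng.uniform(0, 10)
    rows.append((x, 3 * x + 2 + rng.gauss(0, 1)))

# Data cleaning: drop rows with missing values (none in this toy data).
rows = [(x, y) for x, y in rows if x is not None and y is not None]

# Data splitting: shuffle, then hold out 20% for evaluation.
rng.shuffle(rows)
split_index = int(0.8 * len(rows))
train, test = rows[:split_index], rows[split_index:]


def fit_line(points):
    """Ordinary least squares for a single feature."""
    mean_x = mean([x for x, _ in points])
    mean_y = mean([y for _, y in points])
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    slope = numerator / denominator
    return slope, mean_y - slope * mean_x


# Train the model on the training split only.
slope, intercept = fit_line(train)

# Evaluation: mean squared error on the held-out test split.
mse = mean([(y - (slope * x + intercept)) ** 2 for x, y in test])

print(f"slope={slope:.2f}, intercept={intercept:.2f}, test MSE={mse:.2f}")
```

In practice, libraries such as scikit-learn provide the splitting, training, and evaluation steps (for example `train_test_split` and `LinearRegression`), but the order of the steps stays the same.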
Detailed Source Code: Day 19 Commit
LinkedIn post: Day 19 Update