mrperfectpandit / DS-Curriculum

Brief Introduction

A complete guide to learning data science for beginners.

This learning path is intended for anyone who wants to learn data science and build a career in the data field, especially as a data analyst or data scientist. Each section of this guide includes a corresponding link to help you learn the topic.

Table of Contents

  1. Programming
  2. Mathematics & Statistics
  3. Machine Learning
  4. Evaluation Metrics
  5. Deep Learning
  6. Advanced Natural Language Processing (NLP)
  7. Capstone Projects

Programming

  1. Python
    • Variables and Data Types
    • Operators and Expressions
    • Control Flow (if, elif, else, for, while)
    • Functions
    • Data Structures (Lists, Tuples, and Dictionaries)
    • File Handling
    • Exception Handling
    • Modules and Packages
    • Lambda Functions
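
As a rough illustration of how several of the Python topics above fit together (control flow, functions, data structures, exception handling, and lambda functions), here is a minimal sketch. The `records` data and the helper names `parse_score` and `grade` are invented for this example and are not part of the curriculum.

```python
# Hypothetical example data: (name, raw score) pairs invented for illustration.
records = [("asha", "82"), ("ben", "n/a"), ("chen", "67")]


def parse_score(raw):
    """Convert a raw string to an int, or return None if it is not numeric."""
    try:
        return int(raw)
    except ValueError:
        # Exception handling: a non-numeric string is treated as missing.
        return None


def grade(score):
    """Map a numeric score to a label using if / elif / else."""
    if score >= 80:
        return "high"
    elif score >= 60:
        return "medium"
    else:
        return "low"


# Build a dictionary with a for loop; records with invalid scores are skipped.
grades = {}
valid = []
for name, raw in records:
    score = parse_score(raw)
    if score is None:
        continue
    grades[name] = grade(score)
    valid.append((name, score))  # list of tuples

# Lambda function used as a sort key: highest score first.
ranked = sorted(valid, key=lambda pair: pair[1], reverse=True)

print(grades)
print(ranked)
```

The thresholds 80 and 60 are arbitrary choices for the sketch; the point is only to show the language features working together in a few lines.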

Data Processing Modules

  1. NumPy

    • Array Creation and Manipulation
    • Mathematical Functions
    • Linear Algebra Operations
    • Statistical Functions
    • Broadcasting
    • Indexing and Slicing
    • File I/O
  2. Pandas

    • Data Structures (Series, DataFrame)
    • Data Cleaning and Preprocessing
    • Indexing and Selecting Data
    • Grouping and Aggregating Data
    • Merging and Joining DataFrames
    • Reshaping and Pivoting Data
    • Handling Missing Data
  3. Matplotlib

    • Basic Plotting (Line Plot, Scatter Plot, Bar Plot, Histogram)
    • Customizing Plots (Labels, Titles, Legends, Colors)
    • Subplots and Layouts
    • Plot Annotations and Text
    • 3D Plotting
  4. Seaborn

    • Statistical Data Visualization
    • Distribution Plots (Histograms, Kernel Density Estimation)
    • Categorical Plots (Bar Plots, Box Plots, Violin Plots)
    • Scatter and Line Plots with Regression Analysis
    • Pair Plots and Heatmaps

🠥🠥 Back to Table of Contents 🠥🠥

Mathematics & Statistics

  1. Descriptive Statistics
  2. Data Distributions
  3. Statistical Testing
  4. Exploratory Data Analysis
  5. TOOLBOX: Pandas
  6. TOOLBOX: NumPy
  7. TOOLBOX: Matplotlib
  8. TOOLBOX: Seaborn

🠥🠥 Back to Table of Contents 🠥🠥

Machine Learning

  • Supervised Learning

  1. Linear Regression
  2. Logistic Regression
  3. Decision Tree
  4. K-NN (K-Nearest Neighbors)
  5. Naive Bayes
  6. Support Vector Machine
  7. Random Forest
  8. XGBoost
  9. TOOLBOX: scikit-learn
  10. CASE STUDY 1:
  11. CASE STUDY 2:

🠥🠥 Back to Table of Contents 🠥🠥

  • Unsupervised Learning

  1. K-Means Clustering
  2. Hierarchical Clustering
  3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  4. Gaussian Mixture Models (GMM)
  5. Principal Component Analysis (PCA)

🠥🠥 Back to Table of Contents 🠥🠥

Evaluation Metrics

  • Supervised Learning

  1. Confusion Matrix
  2. Accuracy
  3. Precision
  4. Recall
  5. F Score
  6. ROC (Receiver Operating Characteristic)
  7. ROC AUC (Area Under Curve)
  8. MAE (Mean Absolute Error)
  9. MSE (Mean Squared Error)
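
To make the definitions behind these metrics concrete, the sketch below computes them by hand in plain Python for a binary classification task and a small regression task. The function names and the sample labels are assumptions made for this example; in practice a library such as scikit-learn would normally be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |truth - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (truth - prediction) squared."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions, invented for illustration.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(labels, preds))
print(classification_metrics(labels, preds))
print(mae([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))
print(mse([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))
```

Returning 0.0 when a denominator is zero is one common convention, not the only one; some tools raise a warning or return an undefined value instead.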

🠥🠥 Back to Table of Contents 🠥🠥

  • Unsupervised Learning

  1. Elbow Method
  2. Silhouette Score

🠥🠥 Back to Table of Contents 🠥🠥

Deep Learning

  1. Activation Functions
  2. Linear Layer
  3. CNN (Convolutional Neural Networks)
  4. Optimization
  5. Loss Functions / Objective Functions
  6. Dropout
  7. Batch Normalization
  8. Learning Rate Scheduler
  9. TOOLBOX: TensorFlow
  10. TOOLBOX: Keras
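
As a small illustration of the first topic in this list, the following sketch implements three widely used activation functions in plain Python. It is meant only to show the formulas; the function names are chosen for this example, and a framework such as TensorFlow or Keras would provide its own optimized versions.

```python
import math


def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Turn a list of scores into probabilities that sum to 1.

    Subtracting the maximum first keeps exp() from overflowing on large inputs
    and does not change the result.
    """
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), relu(3.0))
print(sigmoid(0.0))
print(softmax([1.0, 2.0, 3.0]))
```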

Advanced Natural Language Processing (NLP)

  1. Tokenization
  2. Stopwords Removal
  3. Stemming and Lemmatization
  4. Bag of Words (BoW)
  5. Term Frequency-Inverse Document Frequency (TF-IDF)
  6. Word Embeddings (Word2Vec, GloVe)
  7. Recurrent Neural Networks (RNNs)
  8. Long Short-Term Memory (LSTM)
  9. Gated Recurrent Units (GRU)
  10. Attention Mechanism
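
The first few steps in this list (tokenization, stopword removal, Bag of Words, and TF-IDF) can be sketched in plain Python as follows. The stopword list, the sample sentences, and the function names are all invented for this example. The TF-IDF weighting shown is one textbook variant, term frequency times log(N / document frequency); libraries often apply smoothed or normalized variants, so their numbers will differ.

```python
import math
import re
from collections import Counter

# Tiny hypothetical stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "on", "and"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(tokens):
    """Count how often each token occurs; word order is discarded."""
    return Counter(tokens)


def tf_idf(docs_tokens):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs_tokens:
        if not tokens:
            weights.append({})  # empty document: nothing to weight
            continue
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical corpus, invented for illustration.
docs = ["The cat sat on the mat", "The dog sat on the log", "Cats and dogs"]
cleaned = [remove_stopwords(tokenize(d)) for d in docs]
weights = tf_idf(cleaned)

print(cleaned)
print(bag_of_words(cleaned[0]))
print(weights[0])
```

In this corpus "sat" appears in two of the three documents, so it receives a lower weight than "cat", which appears in only one. This sketch does no stemming or lemmatization, which is why "cat" and "cats" are counted as different terms.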

🠥🠥 Back to Table of Contents 🠥🠥

Capstone Mini Project

Capstone Major Project
