johnwslee / Data_Science_Libraries

Personal Collections of Data Science Libraries

This repository is a collection of Jupyter notebooks demonstrating Python libraries that I have found useful for data science. Most of the code was collected from articles on Medium, an online publishing platform, for personal study and practice. In some cases, I modified the code to make sure it actually runs.

List of Libraries

1. Python & Linux

  - General Tips, PRegEx, Pathlib, PyCircular, Decorators, OpenCV, make & Makefile, Watchdog

2. Machine Learning

  2.1. Data Preparation

   - Pandas, NumPy, FiftyOne, PySpark, Upgini, Synthetic Dataset

  2.2. Data Visualization

   - Sweetviz, Matplotlib/Plotly/Seaborn, PyGWalker

  2.3. Models and Algorithms

   - scikit-learn, Mahalanobis Distance, Open3D, PyMLPipe, Reinforcement Learning, Predictive Maintenance

  2.4. Web Apps

   - Dash, Streamlit, Gradio, Modelbit, PyScript

3. Deep Learning

  3.1. PyTorch

   - Basics

   - CNN: Binary Classification, 1D/2D Comparison, Transfer Learning, Multi-Class Classification, Multi-Label Classification

   - Vision: Image Captioning, Image Segmentation, Object Detection

   - Generative AI: DPDM

   - Advanced Topics: Temporal Fusion Transformer, Physics-Informed NN, Graph Neural Network, Transformer

  3.2. TensorFlow

   - Basics, TensorBoard, Autoencoder

  3.3. HuggingFace

   - How to Use HuggingFace

4. Models and Tools for Time Series

  - Darts, tslearn

5. Survival Analysis

  - lifelines

6. Natural Language Processing

  - NLTK

7. Hyperparameter Optimization

  - Optuna

8. Explainable AI

  - SHAP, Grad-CAM

9. Low-Code Machine Learning

  - PyCaret

10. Statistics

  10.1. General

   - Statistical Testing Flowchart, Distributions and Collinearity, Categorical Correlation, A/B Testing with Resampling/Bootstrapping, Power Analysis

  10.2. Bayesian Statistics

   - PyMC, PyStan, BNLearn

  10.3. Markov Chain

   - Markov Chain, hmmlearn

11. Web Scraping

  - BeautifulSoup/Selenium/WordCloud

12. Large Language Models

  - Jupyter AI/PandasAI/LangChain

Languages

Jupyter Notebook 100.0%, Python 0.0%, HTML 0.0%, CSS 0.0%, Makefile 0.0%