This repository provides basic learning and training materials for anyone who wants to learn data science and machine learning engineering and operations (MLOps).
Please note: the training materials in this repo provide interesting content to help people become familiar with carrying out data science and MLOps projects. They are not a comprehensive user manual or guide.
To get all the contents of this repo, clone it to your local machine with Git:
git clone https://github.com/chen115y/MLOpsLearning.git
For git download and installation, please refer to its official website. For learning how to use git and GitHub, please refer to this simple introduction.
- Local Environment Setup on Windows and Linux Ubuntu GitHub
- Google Colaboratory (Colab) (if you don't want to set up an environment locally)
- IDEs – PyCharm, Jupyter Notebook and/or Jupyter Lab, Visual Studio Code
- Python Libraries included in Anaconda
- Khan Academy - Math
- Hypothesis Tests
- Think Stats: Probability and Statistics for Programmers
- Mathematics for Machine Learning
- Math Cheat Sheets for Data Science
- Data Science Life Cycle Introduction - Slides Deck
- End-to-End Machine Learning Project
- Data Science and MLOps Life Cycle - Principles, Standards and Best Practices
- A Jupyter Notebook Template for Data Science Project
- AutoML and Auto-Keras: Getting Started Guide
- Rules of Machine Learning: Best Practices for ML Engineering
- Extra Reading - Machine Learning Ops
- Time Series Made Easy in Python - Darts
- Categories of Machine Learning Algorithms
- Association Rules
- Classification
- Regression
- Decision Tree
- Ensemble Learning
- Unsupervised Learning
- Dimensionality Reduction
- Cheat Sheets
- Neural Nets
- Activation Functions
- Training Deep Neural Network
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN)
- Autoencoder and Generative Adversarial Network (GAN)
- Cheat Sheets
- Word Embedding - Word2Vec
- Advanced NLP with SpaCy
- Natural Language Processing with RNNs and Attention
- Transformer Introduction
- Transformer model for language understanding
- How do Transformers Work in NLP? A Guide to the Latest State-of-the-Art Models
- The Illustrated Transformer
- A Visual Guide to Using BERT for the First Time
- Applying massive language models in the real world with Cohere
- Intuition Builder: How to Wrap Your Mind Around Transformer’s Attention Mechanism
- How to Build OpenAI's GPT-2: "The AI That Was Too Dangerous to Release"
- The Illustrated GPT-2 (Visualizing Transformer Language Models)
- How GPT3 Works - Visualizations and Animations
- The Illustrated Stable Diffusion
- LMFlow - An extensible, convenient, and efficient toolbox for finetuning large machine learning models
- LLaMA-Adapter: Efficient Fine-tuning of LLaMA
- List of Open Sourced Fine-Tuned Large Language Models (LLM)
- Google Cloud - Generative AI learning path
- Lil’Log - Lilian is leading a team on AI Safety at OpenAI.
- OpenAI Cookbook
- Prompt Engineering Guide
- Awesome Prompt Engineering
- (Almost) Everything I know about LLMs
- I-JEPA: The first AI model based on Yann LeCun’s vision for more human-like AI
- Data-centric AI - MIT
- Interfaces for Explaining Transformer Language Models
- Explainable AI Cheat Sheet
- Unveiling the Black Box model using Explainable AI(Lime, Shap) Industry use case
- What Are the Data-Centric AI Concepts behind GPT Models?
- Reinforcement Learning - An Introduction
- Illustrating Reinforcement Learning from Human Feedback (RLHF)
- StackLLaMA: A hands-on guide to train LLaMA with RLHF
- ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline
- Mastering RLHF with AWS: A Hands-on Workshop on Reinforcement Learning from Human Feedback
- Combine Amazon SageMaker and DeepSpeed to fine-tune FLAN-T5 XXL
- High accuracy generative AI applications using Amazon Kendra
- LLM Powered Autonomous Agents
- Python Crash Course, Eric Matthes, No Starch Press, Inc., 2016 (recommended for beginners to understand Python)
- Introduction to Machine Learning with Python, Andreas Muller and Sarah Guido, O’Reilly, 2016 (recommended for beginners)
- Python Data Science Handbook, Jake VanderPlas, O’Reilly, 2017 (recommended for beginners)
- R for Data Science, Garrett Grolemund and Hadley Wickham, O’Reilly, 2017 (optional since Python is dominant in data science)
- Machine Learning Yearning, Andrew Ng, deeplearning.ai, 2018 (recommended for beginners)
- Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow, Aurelien Geron, O’Reilly, 2019 (recommended for advanced learners)
- Deep Learning with Python, Francois Chollet, Manning Publications Co., 2018 (recommended for beginners)
- Data Science on AWS, Chris Fregly & Antje Barth, O'Reilly, 2021 (recommended for AWS professionals)
- Mathematics for Machine Learning, GitHub, Marc Peter Deisenroth, A. Aldo Faisal and Cheng Soon Ong, Cambridge University Press, 2020 (recommended for all levels of learners)
- Amazon Machine Learning University
- TensorFlow - ML Zero to Hero: 25 YouTube videos.
- Deep Learning with PyTorch - Full Course: an almost 10-hour video course.
- Gradient Descent - YouTube
- Backpropagation Calculus - YouTube
- Illustrated Guide to Recurrent Neural Networks: Understanding the Intuition
- Illustrated Guide to LSTM's and GRU's: A step by step explanation
- Transformer Neural Networks - EXPLAINED! (Attention is all you need)
- Illustrated Guide to Transformers Neural Network: A step by step explanation
- LSTM is dead. Long Live Transformers!
- BERT Neural Network - EXPLAINED!
- GPT-3: Language Models are Few-Shot Learners (Paper Explained)
- Reinforcement Learning - OpenAI Gym, AWS SageMaker Examples
- Scaling Laws for Neural Language Models.
- Google Python Class, Online Python Tutorials, Interactive Python Class and Algorithms and Data Structures using Python
- Python Data Science Handbook website and GitHub
- Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow 1st Edition and GitHub
- AWS Amazon SageMaker Examples GitHub
- Three Popular AI/ML/DS Frameworks: TensorFlow, PyTorch, MXNet
- An Introduction to Math Behind Neural Network
- Machine Learning & Deep Learning Fundamentals - YouTube
- Spark Data Processing Course
- Cornell University CS4780/CS5780 Course: Machine Learning for Intelligent Systems
- Applied Machine Learning Course at BYU - GitHub
- Microsoft Machine Learning Algorithms available on SQL Server Analysis Services (SSAS)
- "Fake" Data Scientists
- You can master Computer Vision, Deep Learning, and OpenCV - Adrian Rosebrock, PhD
- Data Scientist Interview Questions
- Stanford CS324 - understanding and developing large language models
- aman.ai - exploring the art of artificial intelligence one concept at a time
- Computer Science Courses with Video Lectures