These Jupyter notebooks are my notes as I work through this stuff. I make no guarantee that these are correct. Please visit bdhammel.com for more up-to-date write-ups. If you find an error or something that's unclear, please let me know.
- Linear Regression
- Logistic Regression
- Naive Bayes
- Decision Trees
- K-means
- Gaussian Mixture Models
- Principal Component Analysis
- Neural Networks
For each ML model:
Top-level
- What is the high-level idea? Explain it in layman's terms.
- What scenarios should you use it in (classification vs. regression, noisy data vs. clean data)?
- How does the model deal with outliers? With skewed data?
- What assumptions does the model make about the data (linearity, etc.)?
- What types of features does the model use (continuous vs. categorical)?
- When does the model break/fail (advantages & disadvantages)?
- What are the use cases/alternatives when it breaks?
A bit more detail:
- How do you normalize the data for the model, if you need to?
- How do you initialize parameters at the beginning (e.g., where to place the centroids in K-means; see the sketch after this list)?
- What's the loss function used (if there are several options, what are the tradeoffs of each)?
- What optimizers can you use here?
- What's the complexity (runtime, number of parameters)?
- How does it scale with the number of features or the amount of input data?
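
As an example of the initialization question, the K-means centroid choice usually comes down to random seeding vs. a k-means++-style seeding that spreads the starting centroids out. Here's a minimal numpy sketch of both (the function name and toy data are my own, for illustration only):

```python
import numpy as np

def init_centroids(X, k, method="kmeans++", seed=None):
    """Pick k starting centroids from the rows of X.

    method="random":   sample k distinct points uniformly at random.
    method="kmeans++": choose each new centroid with probability
                       proportional to its squared distance from the
                       nearest centroid already chosen, so the initial
                       centroids end up spread out.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if method == "random":
        return X[rng.choice(n, size=k, replace=False)]
    centroids = [X[rng.integers(n)]]
    for _ in range(k - 1):
        # squared distance from each point to its nearest chosen centroid
        d2 = ((X[:, None, :] - np.asarray(centroids)[None, :, :]) ** 2).sum(-1).min(axis=1)
        centroids.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.asarray(centroids)

# toy data: two well-separated blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(50, 2)), rng.normal(size=(50, 2)) + 5])
print(init_centroids(X, k=2))
```

Random seeding is cheaper but can put two centroids in the same cluster; the k-means++-style seeding costs an extra pass over the data per centroid but typically converges faster and to better solutions.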
In-depth
- Probabilistic interpretation
- Derive the math (only for a couple of models)
- Code up an implementation (only for a couple of models; a minimal sketch follows this list)
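
As an example of what "code up an implementation" might look like, here's a from-scratch linear regression trained with batch gradient descent (a minimal sketch on a toy setup of my own, not the notebooks' code):

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.1, n_iters=1000):
    """Fit y ~ X @ w + b by minimizing mean squared error.

    Gradients of L = mean((X @ w + b - y)**2):
        dL/dw = (2 / n) * X.T @ err
        dL/db = 2 * mean(err)
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iters):
        err = X @ w + b - y           # residuals under current parameters
        w -= lr * (2 / n) * (X.T @ err)
        b -= lr * 2 * err.mean()
    return w, b

# sanity check on synthetic data with known weights
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.3 + rng.normal(scale=0.1, size=200)
print(fit_linear_regression(X, y))    # should recover roughly [1.5, -2.0, 0.5] and 0.3
```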
More on training the model (not model-specific; this applies to most of the models):
- How to deal with imbalanced data?
- How well does it generalize to unseen data (overfitting vs. underfitting)?
- What if you have MANY more features than sample points? Vice versa? (A variation of the over-/under-fitting question above.)
- How do you regularize the model? What are the tradeoffs?
- How can you validate the model? (A cross-validation sketch follows this list.)
- Does the model emphasize Type I or Type II errors?
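
For the regularization/validation questions, a common pattern is to sweep a penalty strength and score each setting with k-fold cross-validation. A minimal scikit-learn sketch (the data and the alpha grid are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# synthetic data: only the first two features actually matter
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=100)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=cv, scoring="r2")
    print(f"alpha={alpha:>6}: mean R^2 = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Too small an alpha under-regularizes and fits the noise on the 18 irrelevant features; too large an alpha shrinks the real signal away. That tension is the regularization tradeoff the bullet above is asking about.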
I borrow heavily from the resources below; this includes code snippets, figures, derivations, and explanations.