We introduce basic principles and techniques in the fields of data mining and machine learning. These are some of the key tools behind the emerging field of data science and the popularity of the `big data' buzzword. These techniques now run behind the scenes to discover patterns and make predictions in many applications in our daily lives. We'll focus on many of the core data mining and machine learning technologies, with motivating applications from a variety of disciplines.
- Linear algebra review
- Probability review
- Calculus review
- Algorithms and data structures review
- Data exploration
- Decision trees
- Training and testing
- Naive Bayes
- K-nearest neighbours
- Random forests
- Clustering
- Finding similar items
- Matrix notation and minimizing quadratics
- Robust regression with gradient descent
- Linear regression and non-linear bases
- Convex functions
- Logistic regression with regularization (L2, L1, L0, L0.5)
- Softmax classification
- One-vs-all logistic regression
- Kernel logistic regression
- Hyperparameter searching
- MAP estimation
- Principal component analysis
- Robust PCA
- Data visualization (PCA, MDS, ISOMAP, t-SNE)
- Stochastic gradient descent for a neural network
- Hyperparameter tuning for a neural network