MTH594 Advanced data mining: theory and applications
Materials for the course MTH 594 Advanced data mining: theory and applications, taught by Dmitry Efimov at the American University of Sharjah, UAE, in the Spring 2016 semester.
The course program can be downloaded from the syllabus folder.
These lectures draw mainly on ideas from three sources:
Stanford lectures by Andrew Ng on YouTube: https://www.youtube.com/watch?v=UzxYlbK2c7E&list=PLA89DCFA6ADACE599
The book "The Elements of Statistical Learning" by T. Hastie, R. Tibshirani and J. Friedman: http://statweb.stanford.edu/~tibs/ElemStatLearn
Lectures by Andrew Ng on Coursera: https://www.coursera.org/learn/machine-learning
All uploaded PDF lectures have been adapted to help students understand the material.
The supplementary files in the ipython folder are intended to teach students how to use built-in methods to train models in Python 2.7.
If you find any mistakes or typos, please email me at diefimov@gmail.com. This course is new for me, so there are probably some :)
The content of the lectures:
Linear and logistic regressions, perceptrons
Analytical minimization: normal equations
Statistical interpretation
Bayesian interpretation and regularization
Examples of gradient descent
Stochastic gradient descent
Generalized linear models (GLM)
Generalized Linear Models (GLM)
Generative learning algorithms
General idea of generative algorithms
Gaussian discriminant analysis
Generative vs discriminative comparison
Gaussian discriminant analysis
Support vector machines: intuition
Primal/dual optimization problem and KKT
Locally weighted regression
Generalized additive models (GAM)
Locally weighted regression
Regression decision trees
Classification decision trees
Empirical risk minimization (ERM)
Union bound / Hoeffding inequality
Advice for applying ML algorithms
Mixture of Gaussians and EM algorithm
EM algorithm for the mixture of Gaussians
EM algorithm for the mixture of Naive Bayes
EM algorithm for mixture of Gaussians
Marginal and conditionals for Gaussians
EM steps for factor analysis
Principal component analysis
Latent semantic indexing (LSI)
Independent component analysis (ICA)