jlin-vt / SML

An R package for statistical machine learning

Description

This repository explores the theory and application of statistical machine learning, focusing on statistical, Bayesian, and graphical models. It follows the structure of the classic machine learning textbook Machine Learning: A Probabilistic Perspective by Kevin Murphy.

Topics covered in the repository

  1. Mixture models and the EM algorithm
  2. Sparse linear models
  3. Kernels
  4. Gaussian processes
  5. Adaptive basis function models
  6. Graphical models
  7. Variational inference
  8. Monte Carlo inference
  9. Markov chain Monte Carlo (MCMC) inference
  10. Clustering
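
To give a flavour of the first topic, here is a minimal base-R sketch of the EM algorithm for a two-component univariate Gaussian mixture. It is an illustration only, not code taken from this repository: the function name `em_gmm2`, the quartile-based initialisation, the stopping tolerance, and the simulated data are all assumptions made for the example.

```r
# EM for a two-component univariate Gaussian mixture (illustrative sketch).
# `em_gmm2` is a hypothetical name, not a function from this repository.
em_gmm2 <- function(x, max_iter = 200, tol = 1e-8) {
  # Crude initialisation: means at the quartiles, shared standard deviation.
  mu <- unname(quantile(x, c(0.25, 0.75)))
  sigma <- rep(sd(x), 2)
  pi_k <- c(0.5, 0.5)
  loglik_old <- -Inf

  for (iter in seq_len(max_iter)) {
    # E-step: weighted component densities and responsibilities.
    d1 <- pi_k[1] * dnorm(x, mu[1], sigma[1])
    d2 <- pi_k[2] * dnorm(x, mu[2], sigma[2])
    total <- d1 + d2
    r1 <- d1 / total
    r2 <- 1 - r1

    # Log-likelihood at the parameters used in this E-step.
    loglik <- sum(log(total))
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik

    # M-step: update weights, means, and standard deviations.
    n1 <- sum(r1)
    n2 <- sum(r2)
    pi_k <- c(n1, n2) / length(x)
    mu <- c(sum(r1 * x) / n1, sum(r2 * x) / n2)
    sigma <- sqrt(c(sum(r1 * (x - mu[1])^2) / n1,
                    sum(r2 * (x - mu[2])^2) / n2))
  }

  list(weights = pi_k, means = mu, sds = sigma,
       loglik = loglik, iterations = iter)
}

# Simulated data: 300 draws from N(-2, 1) and 200 draws from N(3, 0.5^2).
set.seed(42)
x <- c(rnorm(300, mean = -2, sd = 1), rnorm(200, mean = 3, sd = 0.5))
fit <- em_gmm2(x)
print(fit)
```

With well-separated components like these, the fitted means and weights should land close to the values used to simulate the data. Poorly separated components or a bad starting point can lead EM to a local optimum, so in practice several restarts are advisable.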

Software

This toolkit contains demos of many methods applied to a variety of data sets. The demos are listed here. The vast majority of the code is written in R. In the future, I will provide wrappers to implementations written in Julia for speed. Both languages are built for numerical computing and have their own strengths: Julia (like Matlab) is well suited to designing algorithms (e.g. matrix operations), while R is great for data analysis and statistics.

Dependencies

If you choose R, you can download it from CRAN. RStudio is an excellent graphical interface. You should also install a few packages:

  • mvtnorm: computes multivariate normal and t probabilities, quantiles, random deviates, and densities.
  • ggplot2: a system for 'declaratively' creating graphics, based on "The Grammar of Graphics".
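
One possible way to check for and install these two packages from an R session is sketched below. The helper `missing_packages` is a hypothetical name introduced for this example, and the installation step only runs in an interactive session so that the snippet is safe to source non-interactively. The final part uses `mvtnorm::rmvnorm` and `mvtnorm::dmvnorm` with an arbitrary example covariance matrix, and is skipped if mvtnorm is not available.

```r
# Packages named in this README.
required <- c("mvtnorm", "ggplot2")

# Return the subset of `pkgs` that is not installed (hypothetical helper).
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

# Install whatever is missing, but only when run interactively.
to_install <- missing_packages(required)
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}

# Quick check of mvtnorm: draw from a bivariate normal and evaluate its density.
if (requireNamespace("mvtnorm", quietly = TRUE)) {
  set.seed(1)
  sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)  # example covariance (assumed)
  draws <- mvtnorm::rmvnorm(5, mean = c(0, 0), sigma = sigma)
  dens <- mvtnorm::dmvnorm(draws, mean = c(0, 0), sigma = sigma)
  print(cbind(draws, density = dens))
}
```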

For Julia users, I recommend installing the latest versions of Julia and Juno. Alternatively, you can run Julia in Atom, which is another powerful editor. Some demos may depend on the packages below:

  • Distributions: package for probability distributions and associated functions.

Resources

Print Textbooks and Online References

Programming Languages

Bug Reports / Change Requests

If you encounter a bug or would like to make a change request, please file it as an issue here.

License

The package is available under the terms of the GNU General Public License v3.0.


Languages

Language:R 100.0%