CMU 10-605 Machine Learning with Large Data Sets
Assignment code for 10-605 Machine Learning with Large Data Sets
The assignment code is from the Spring 2014 offering of 10-605. Most of it runs against the benchmark and achieves high performance.
Several tricks are used to improve performance. Please see the report in each folder for details.
- Homework1: Streaming Naive Bayes
- Homework2: Small Memory Streaming Naive Bayes
- Homework3: Streaming Phrase Finding
- Homework4(a): Hadoop Streaming Naive Bayes
- Homework4(b): Naive Bayes with Hadoop API
- Homework4(c): Hadoop Phrase Finding
- Homework5: Logistic Regression using Stochastic Gradient Descent
- Homework6: Approximate PageRank
- Homework7: Naive Bayes with Pig