yifeiacc / 10-605

Assignment code for 10-605 Machine Learning with Big Datasets

CMU 10-605 Machine Learning with Large Data Sets

Assignment code for 10-605 Machine Learning with Big Datasets

The assignment code is for the Spring 2014 offering of 10-605. Most of it runs against the benchmarks and achieves high performance.

Several tricks are used to improve performance. See the report in each folder for details.

  1. Homework1: Streaming Naive Bayes
  2. Homework2: Small Memory Streaming Naive Bayes
  3. Homework3: Streaming Phrase Finding
  4. Homework4(a): Hadoop Streaming Naive Bayes
  5. Homework4(b): Naive Bayes with Hadoop API
  6. Homework4(c): Hadoop Phrase Finding
  7. Homework5: Logistic Regression using Stochastic Gradient Descent
  8. Homework6: Approximate PageRank
  9. Homework7: Naive Bayes with Pig

Languages

Java 94.8%, PigLatin 5.2%