bluepine / spark-examples

While exploring the internet on the hot topic of machine learning, I found tutorials created by Microsoft about their SQL Server Data Mining Extensions. Microsoft provides the AdventureWorksDW2012 sample database under the Microsoft Public License (http://msftdbprodsamples.codeplex.com/license) and uses it in its data mining tutorials. Some time ago this inspired me to build and test a more general and presumably more scalable solution using Apache Spark (of course, the results of the computations should be comparable to some degree regardless of the solution used). After that I decided to add further evaluations of the MLlib algorithms available in Spark.

Languages

Scala 40.9%, Java 26.4%, Shell 19.9%, Batchfile 7.1%, HTML 4.5%, XSLT 1.3%, Makefile 0.1%