saurfang / spark-tsne

Distributed t-SNE via Apache Spark

Home Page: https://saurfang.github.io/spark-tsne-demo/tsne-pixi.html

spark-tsne

Join the chat at https://gitter.im/saurfang/spark-tsne

Distributed t-SNE with Apache Spark. WIP...

t-SNE is a dimensionality reduction technique that is particularly good at visualizing high-dimensional data. This project is an attempt to implement the algorithm on Spark to take advantage of distributed computing power.
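To make the technique a little more concrete, here is a minimal single-machine sketch of one building block of t-SNE: the pairwise similarities between points in the low-dimensional embedding, computed with a Student-t kernel and normalized to sum to 1. It is plain Scala with no Spark dependency and is not taken from this repository; the names `TsneSketch`, `squaredDistance`, and `lowDimAffinities` are hypothetical and exist only for illustration.

```scala
// Illustrative sketch only; not code from spark-tsne.
object TsneSketch {

  /** Squared Euclidean distance between two points of equal dimension. */
  def squaredDistance(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "points must have the same dimension")
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      val d = a(i) - b(i)
      sum += d * d
      i += 1
    }
    sum
  }

  /** Student-t (one degree of freedom) affinities q_ij between embedded points.
    * The diagonal is zero and all entries together sum to 1.
    */
  def lowDimAffinities(y: Array[Array[Double]]): Array[Array[Double]] = {
    val n = y.length
    require(n >= 2, "need at least two points")
    // Unnormalized kernel values: 1 / (1 + ||y_i - y_j||^2), with q_ii = 0.
    val unnormalized = Array.tabulate(n, n) { (i, j) =>
      if (i == j) 0.0 else 1.0 / (1.0 + squaredDistance(y(i), y(j)))
    }
    // Normalize over all pairs so the matrix is a probability distribution.
    val total = unnormalized.map(_.sum).sum
    unnormalized.map(_.map(_ / total))
  }
}
```

Calling `TsneSketch.lowDimAffinities` on a small array of 2-D points returns a symmetric matrix in which nearby points receive larger values than distant ones. This version materializes the full n-by-n matrix in memory, so it is only suitable for small inputs; a distributed implementation would need to partition this work.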

The project is still in the process of replicating the reference implementations from the original papers. Spark-specific optimizations will be the next goal once correctness is verified.

Currently I'm showcasing this using the standard MNIST handwritten digit dataset. I have created a WebGL player (built with pixi.js) to visualize the inner workings as well as the final results of t-SNE. If WebGL is unavailable to you, you can check out the d3.js player instead.

Credits

About

License: Apache License 2.0


Languages

Scala 83.2%, HTML 13.1%, R 3.7%