snowwolph's repositories
spark-sframe
This project contains the code to translate between Apache Spark and SFrame.
python-libffm
A Python wrapper for the libffm library.
spark-mail
Tutorial on parsing Enron email to Avro and then exploring the email set using Spark.
spark-ts-examples
Spark TS Examples
nose-progressive
A nosetests plugin with a progress bar and an emphasis on showing what's important
spark-training-vm
Tetra Concepts LLC Spark Training Environment
xkcd_survey
Analysis of the xkcd survey data
zeppelin
Zeppelin is a data analytics environment
spark-1
Apache Spark Extensions
AnomalyDetection
Anomaly Detection with R
flights
Automated download and loading of the entire flights data, plus functions for generating NYC flights, delay, and weather delay data for 2014.
courses
Course materials for the Data Science Specialization: https://www.coursera.org/specialization/jhudatascience/1
tachyon
A Reliable Memory Centric Distributed Storage System
DataScienceSpecialization.github.io
http://DataScienceSpecialization.github.io
spark-csv
CSV reader for Spark
hadleyverse
Presentation on R Packages by Hadley Wickham et al
SparkR-pkg
R frontend for Spark
spark-pr-dashboard
Dashboard to aid in Spark pull request reviews
spark-knowledgebase
Spark Knowledge Base
smappR
R tools for analysis of Twitter data
logs-storm
The same MVP Logs analyzer application in Storm.
spark-avro
Integration utilities for using Spark with Apache Avro data
ISpark
An Apache Spark-shell backend for IPython
spark-indexedrdd
An efficient updatable key-value store for Apache Spark
git
Git Source Code Mirror
aws-sdk-java
Official mirror of the AWS SDK for Java. For more information on the AWS SDK for Java, see our web site:
spark-perf
Performance tests for Spark