Arpit T's repositories
GettingStartedWithD3
Dive into D3 with Mike Dewar's intro book
aas
Code to accompany Advanced Analytics with Spark from O'Reilly Media
benchmark
Application to benchmark inserts, reads, and queries of NoSQL data stores
Constraint_Distance
Checks whether the constraints are satisfied in protein or RNA structure files after simulations. This is somewhat raw, but I wrote it on a hunch for a project I am working on.
CSVCleaner
Python tool for cleaning survey CSVs
generator
Synthetic data generators for simulating real-time data and workloads
GSODGetStation
Still incomplete; I will update as I find time.
GSODMaxTemp
Based on Tom White's book Hadoop: The Definitive Guide. Calculates the maximum temperature recorded for each year from the Global Summary of the Day (GSOD) database, which is publicly available from NCDC. Initial commit; improvements will follow.
GSODMaxTempByStation
Reproduction of the maximum temperature by station example from Tom White's Hadoop book. Testing this code on GSOD data publicly available from NCDC.
hue
Let’s Big Data. Hue is an open source Web interface for analyzing data with Hadoop and Spark.
incubator-zeppelin
Mirror of Apache Zeppelin (Incubating)
Protein-principal-axis-angles
Can be used to calculate the angles between two protein domains found in any protein complex.
scripts
Cloudwick Deployment Scripts
Spark-Collection
Spark Applications for the MapReduce cwt
spark-perf
Performance tests for Spark
SparkExamples
MapReduce use cases written using Spark (Scala API)