rakesh-92's repositories
fast-data-dev
Kafka Docker for development. Kafka, Zookeeper, Schema Registry, Kafka Connect, Landoop Tools, 20+ connectors
flume-ng-audit-db
Apache Flume JDBC source, an interceptor that drops duplicate events, a utility to infer an Avro schema from a table, and much more!
google-cloud-node
Google Cloud Client Library for Node.js
hadoop-tutorials-2016
Hadoop Tutorials
kafka-connect-hdfs
Kafka Connect HDFS connector
maps
GBIF mapping service built on HBase and Solr, supporting Mapbox Vector Tiles and PNGs
nutch-plugins
Apache Nutch extensions
pyspark-tutorial
PySpark-Tutorial provides basic algorithms using PySpark
pyspark-tutorials
Code snippets and tutorials for working with social science data in PySpark
python-docs-samples
Code samples used on cloud.google.com
scala-best-practices
A collection of Scala best practices
spark
Mirror of Apache Spark
spark-hive-udf
Example project showing how to use Hive UDFs in Apache Spark
Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.
training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
yarn-beginners-examples
YARN example source code accompanying wikibooks "Beginning Hadoop Programming" by Jaehwa Jung