Madhu's starred repositories

fastText

Library for fast text representation and classification.

Language: HTML | License: MIT | Stargazers: 25874 | Issues: 0

aeron

Efficient reliable UDP unicast, UDP multicast, and IPC message transport

Language: Java | License: Apache-2.0 | Stargazers: 7349 | Issues: 0

interview

Interview questions

Language: Java | License: Apache-2.0 | Stargazers: 11080 | Issues: 0

Beetest

A super simple utility for testing Apache Hive scripts locally for non-Java developers.

Language: Java | Stargazers: 72 | Issues: 0

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Language: Python | License: Apache-2.0 | Stargazers: 36571 | Issues: 0

CtCI-6th-Edition

Cracking the Coding Interview 6th Ed. Solutions

Language: Java | Stargazers: 11306 | Issues: 0

hacker-scripts

Based on a true story

Language: JavaScript | Stargazers: 47487 | Issues: 0

tensorflow

An Open Source Machine Learning Framework for Everyone

Language: C++ | License: Apache-2.0 | Stargazers: 185871 | Issues: 0

pipeline

PipelineAI

Language: Jsonnet | License: Apache-2.0 | Stargazers: 4164 | Issues: 0

gobblin

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

Language: Java | License: Apache-2.0 | Stargazers: 2217 | Issues: 0

nifi

Apache NiFi

Language: Java | License: Apache-2.0 | Stargazers: 4802 | Issues: 0

finagle

A fault tolerant, protocol-agnostic RPC system

Language: Scala | License: Apache-2.0 | Stargazers: 8784 | Issues: 0

scikit-learn

scikit-learn: machine learning in Python

Language: Python | License: BSD-3-Clause | Stargazers: 59668 | Issues: 0

learning-spark

Practical examples of using Apache Spark in several different use cases

Language: JavaScript | Stargazers: 104 | Issues: 0

mapreducepatterns

Repository for MapReduce Design Patterns (O'Reilly 2012) example source code

Language: Java | Stargazers: 236 | Issues: 0

stat-learning

Notes and exercise attempts for "An Introduction to Statistical Learning"

Language: HTML | Stargazers: 2125 | Issues: 0

india-election-data

To map publicly available datasets related to General Assembly (Lok Sabha) elections in India.

Language: Jupyter Notebook | Stargazers: 152 | Issues: 0

elephant-bird

Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.

Language: Java | License: Apache-2.0 | Stargazers: 1139 | Issues: 0

HiveKa

Kafka as Hive Storage

Language: Java | License: Apache-2.0 | Stargazers: 67 | Issues: 0

kafka-storm-starter

[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.

Language: Scala | License: NOASSERTION | Stargazers: 725 | Issues: 0

kafka-spout

Kafka consumer emitting messages as storm tuples

Language: Java | License: Apache-2.0 | Stargazers: 103 | Issues: 0

kafka-elasticsearch-standalone-consumer

Kafka Standalone Consumer [Indexer] will read messages from Kafka, in batches, process and bulk-index them into ElasticSearch.

Language: Java | License: Apache-2.0 | Stargazers: 167 | Issues: 0

kylin

Apache Kylin

Language: Java | License: Apache-2.0 | Stargazers: 3635 | Issues: 0

killrweather

KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apache Spark Streaming, Apache Cassandra, Apache Kafka and Akka for fast, streaming computations on time series data in asynchronous event-driven environments.

Language: Scala | License: Apache-2.0 | Stargazers: 1182 | Issues: 0

cascading

All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on various cluster computing platforms.

Language: Java | License: NOASSERTION | Stargazers: 331 | Issues: 0

scalding

A Scala API for Cascading

Language: Scala | License: Apache-2.0 | Stargazers: 3497 | Issues: 0

zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Language: Java | License: Apache-2.0 | Stargazers: 6388 | Issues: 0

storm-contrib

A collection of spouts, bolts, serializers, DSLs, and other goodies to use with Storm

Language: Java | License: EPL-1.0 | Stargazers: 580 | Issues: 0

couchbasekafka

Couchbase Kafka Adapter

Language: Java | License: Apache-2.0 | Stargazers: 24 | Issues: 0

spark-cassandra-connector

DataStax Connector for Apache Spark to Apache Cassandra

Language: Scala | License: Apache-2.0 | Stargazers: 1942 | Issues: 0