rogersmarin / pipeline

End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark ML, GraphX, Spark Streaming, Kafka, Cassandra, ElasticSearch, Redis, Tachyon, HDFS, Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau. See https://github.com/fluxcapacitor/pipeline/wiki for Setup Instructions.

Home Page: http://advancedspark.com

Docker-based, End-to-End, Big Data Reference Pipeline!

Real-time, Advanced Analytics, Machine Learning, Graph Processing, Text/NLP Analytics

Please see the wiki for more info.

Screen Shots

Apache Zeppelin Notebooks

Tableau Integration

Beeline Command-line Hive Client

Log Visualization with Kibana & Logstash

Spark, Spark Streaming, and Spark SQL Admin UIs

Ganglia System and JVM Metrics Monitoring UIs

Architecture Overview

Big Data Pipeline Overview

Tools Overview

Apache Spark, Redis, Apache Cassandra, Apache Kafka, ElasticSearch, Logstash, Kibana, Apache Zeppelin, Ganglia, Hadoop HDFS, iPython Notebook, Docker, Tachyon

License: Other


Languages

Shell 38.4%, Scala 27.7%, Vim Script 13.4%, JavaScript 9.0%, HTML 5.0%, ApacheConf 3.2%, CSS 2.4%, XSLT 0.6%, Batchfile 0.3%