murshidazher / ultimate-hands-on-hadoop

🐷 The ultimate hands on hadoop

Hadoop with HDFS, YARN, MapReduce, Pig, Hive, Spark, Flink and more...

  • This repository contains all the data, setup, and execution scripts used during the Udemy course "The Ultimate Hands-On Hadoop: Tame Your Big Data".
  • Requires Hortonworks Sandbox 2.6.5. For a Docker script to run it locally, see this repo.
  • If you need a Terraform script to set up a sandbox environment in AWS, see this repo.

Tech Stack

All the technologies used throughout the course

Core

  • Introduction to the Hadoop Eco-system
  • YARN
  • HDFS
  • MapReduce
  • Pig
  • Spark
  • Hive
  • Tez
  • Mesos - Alternative cluster manager to YARN
  • Zookeeper
  • Oozie

Querying

  • Apache Drill
  • Apache Phoenix
  • Presto

Ingestion

  • Sqoop
  • Kafka
  • Flume

NoSQL Databases

  • HBase
  • Cassandra
  • MongoDB

Streaming

  • Spark Streaming
  • Storm
  • Flink

Notebooks and Visualization

  • Apache Zeppelin
  • Apache Superset

Other

  • Impala
  • Accumulo
  • Redis
  • Ignite
  • Elasticsearch
  • Kinesis
  • Apache NiFi
  • Falcon
  • Apache Slider

License

MIT © Murshid Azher.

Languages

  • PLpgSQL 99.7%
  • Python 0.3%
  • PigLatin 0.0%
  • HiveQL 0.0%