travistheall / UltimateHandsOnHadoop

Code Along with The Ultimate Hands-On Hadoop: Tame your Big Data!

Hadoop tutorial with MapReduce, HDFS, Spark, Flink, Hive, HBase, MongoDB, Cassandra, Kafka + more! Over 25 technologies.

File Structure:

This entire repository serves as the home directory of the VirtualBox environment.

  • data - input data read by the scripts
    • out - output data generated by the scripts
  • hive_ql - Hive query language (HiveQL) scripts
  • notes - general notes taken in class
  • python - modified versions of the provided Python code
    • given - the original scripts provided by Sundog Education
    • selfLearn - additional exercises to better understand the material
  • scripts - scripts that run the Python scripts
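
As a quick sanity check of the layout above, here is a minimal sketch that reports which of the listed directories are missing under a given root. The nesting (out under data, given and selfLearn under python) is read from the indentation of the list, and the helper name `missing_dirs` and the choice of the home directory as root are assumptions made for illustration, not part of the course material. The code avoids Python 3 only syntax so it should also run on the Python 2.7 used here.

```python
import os

# Directory names taken from the list above; nested ones are written as paths.
# The nesting is an assumption based on the indentation of that list.
EXPECTED_DIRS = [
    "data",
    os.path.join("data", "out"),
    "hive_ql",
    "notes",
    "python",
    os.path.join("python", "given"),
    os.path.join("python", "selfLearn"),
    "scripts",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]


if __name__ == "__main__":
    # Assumption: the repository is checked out as the home directory.
    root = os.path.expanduser("~")
    missing = missing_dirs(root)
    if missing:
        print("Missing directories: " + ", ".join(missing))
    else:
        print("All expected directories are present.")
```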

Comments in the code files are deliberately verbose and are there for learning purposes.

The Python and pip versions are outdated on purpose, to make it easier to follow along with the course:

  • Python version = 2.7.5
  • pip version = 8.1.2
  • all other dependencies = see requirements.txt
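
To confirm that a shell is using the interpreter version listed above, a minimal sketch along these lines may help. The helper names `matches_expected` and `describe` are hypothetical and only illustrate the check; the expected version is the one stated in this README. The code is written to run on both Python 2.7 and Python 3.

```python
import sys

# Version stated in this README.
EXPECTED_PYTHON = (2, 7, 5)


def matches_expected(version_info, expected=EXPECTED_PYTHON):
    """Return True if the first three version components equal expected."""
    return tuple(version_info[:3]) == tuple(expected)


def describe(version_info):
    """Format a version tuple such as (2, 7, 5) as the string '2.7.5'."""
    return ".".join(str(part) for part in version_info[:3])


if __name__ == "__main__":
    current = describe(sys.version_info)
    if matches_expected(sys.version_info):
        print("Python " + current + " matches the course environment.")
    else:
        print("Running Python " + current + ", expected "
              + describe(EXPECTED_PYTHON) + ".")
```

The pip version can be checked separately with `pip --version`.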

Languages

  • PLpgSQL 98.4%
  • Java 1.0%
  • Python 0.5%
  • PigLatin 0.1%
  • Shell 0.0%
  • HiveQL 0.0%