https://www.coursera.org/learn/big-data-essentials
Docker Hub: https://hub.docker.com/u/bigdatateam
HDFS architecture
MapReduce basics / Hadoop Streaming (with Python) / MapReduce optimization (Combiner, Partitioner, Comparator)
Spark architecture
Map-Side and Reduce-Side Joins / Job Chaining / Data Salting
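
As a rough illustration of the last two topics, here is a minimal sketch in plain Python (no Hadoop or Spark needed) of data salting combined with job chaining. It is not taken from the course material. The names `NUM_SALTS`, `salt_key`, `unsalt_key`, `reduce_by_key`, and `salted_count`, as well as the `#` separator, are hypothetical choices made for this example. The idea: a skewed ("hot") key gets a random suffix so its records spread over several reducers in a first job, and a second chained job strips the suffix and merges the partial results.

```python
import random
from collections import defaultdict

NUM_SALTS = 4  # hypothetical number of salt buckets per key


def salt_key(key, rng, num_salts=NUM_SALTS):
    """Append a random salt so one hot key is spread over several reducers."""
    return f"{key}#{rng.randrange(num_salts)}"


def unsalt_key(salted_key):
    """Strip the salt suffix added by salt_key."""
    return salted_key.rsplit("#", 1)[0]


def reduce_by_key(pairs):
    """Local stand-in for a reduce phase: sum the values of each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def salted_count(records, seed=0):
    """Two chained 'jobs': aggregate on salted keys, then on original keys."""
    rng = random.Random(seed)
    # Job 1: partial sums per salted key (the hot key is split into buckets).
    stage1 = reduce_by_key((salt_key(k, rng), v) for k, v in records)
    # Job 2: remove the salt and merge the partial sums.
    return reduce_by_key((unsalt_key(k), v) for k, v in stage1.items())


# A skewed input: one key dominates the data set.
records = [("hot", 1)] * 1000 + [("cold", 1)] * 3
result = salted_count(records)
```

In a real cluster the two stages would be separate MapReduce jobs (or Spark stages), with the output of the first feeding the second. The trade-off is an extra pass over the (much smaller) intermediate data in exchange for a more even reducer load.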