Repositories under the hadoop-docker topic:
Apache Hadoop docker image
A Docker-based Hadoop development and testing environment, including Hadoop, Hive, HBase, and Spark
Dockerizing an Apache Spark Standalone Cluster
Toy Hadoop cluster combining various SQL-on-Hadoop variants
Hadoop 3.2 in single-node or cluster mode, with the gotty web terminal, Spark, Jupyter PySpark, Hive, and other ecosystem tools
Run Hadoop Cluster within Docker Containers
We explore data using Big Data analysis and visualization techniques through three main operations: (i) data aggregation from different sources, (ii) Big Data analysis using MapReduce, and (iii) visualization through Tableau. Data analysis is critical for understanding data and deciding what can be done with it. Small datasets are easy to process, but large companies need to track trends in their data to decide what changes to make, so we apply Big Data analysis to this problem. In this lab, we collect close to 20,000 tweets, 500 New York Times articles, and 500 Common Crawl articles about Entertainment, our main topic of discussion. We preprocess this data and feed it to MapReduce jobs that compute Word Count and Word Co-Occurrence, from which we identify trends in the collected data. The data analysis is done in Python.
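The word-count step described above can be sketched as a Hadoop Streaming-style mapper and reducer in Python. This is a hypothetical illustration, not the repository's actual code; the function names and sample lines are made up, and in a real Streaming job the mapper and reducer would be separate scripts reading stdin.

```python
#!/usr/bin/env python3
# Illustrative sketch of a Streaming-style word-count job (not the repo's code).
from itertools import groupby

def mapper(lines):
    # Map phase: emit (word, 1) for every word in every input line.
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1

def reducer(pairs):
    # Reduce phase: pairs arrive grouped by key (Hadoop's shuffle sorts them);
    # sum the counts for each distinct word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Local simulation of the map -> shuffle -> reduce pipeline on toy tweets.
    sample = ["new movie release", "new show release"]
    print(dict(reducer(mapper(sample))))
```

Co-occurrence counting follows the same pattern, with the mapper emitting word pairs instead of single words.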
A Spark/Hadoop-Docker Cluster template for working with Big Data
Hadoop cluster in Docker, created with docker-compose. Create a Hadoop cluster in less than 5 minutes.
Big Data stack with Hadoop + Hive + Spark + Zeppelin + Hue + Superset
Hadoop deployment on docker and Docker Swarm
Run Apache Hadoop 2.7 inside docker container in pseudo-distributed mode
Apache Hadoop Cluster Docker images
EMR 5.25.0 single-node cluster Hadoop Docker image, with Amazon Linux, Hadoop 2.8.5, and Hive 2.3.5
Experiments with Hadoop cluster setups in Docker
Run Apache Hadoop 2.7 inside docker container in Multi-Node Cluster mode
Exercise files for Apache Hadoop Big Data Training
Apache Hadoop Docker Image
Data processing using docker containers, kafka, spark, and hadoop
An automated Hadoop cluster-building tool that uses distributed computing to create the cluster over the network. Implemented in Python 2.7.
Building a recommender system based on a collaborative-filtering algorithm and Hadoop Streaming
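The collaborative-filtering approach mentioned above can be illustrated with a minimal item-based similarity computation in Python. This is a sketch under assumed data shapes (a `user -> item -> rating` dict), not the repository's actual implementation; a Hadoop Streaming version would split these steps into mapper/reducer scripts.

```python
# Minimal item-based collaborative-filtering sketch (illustrative only).
from collections import defaultdict
from math import sqrt

def item_vectors(ratings):
    # Invert user -> item ratings into item -> user rating vectors.
    vecs = defaultdict(dict)
    for user, items in ratings.items():
        for item, r in items.items():
            vecs[item][user] = r
    return vecs

def cosine(u, v):
    # Cosine similarity between two sparse rating vectors.
    shared = set(u) & set(v)
    num = sum(u[k] * v[k] for k in shared)
    den = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def similar_items(ratings, item):
    # Rank all other items by cosine similarity to the given item.
    vecs = item_vectors(ratings)
    return sorted(
        ((other, cosine(vecs[item], vecs[other])) for other in vecs if other != item),
        key=lambda kv: -kv[1],
    )
```

For example, with `ratings = {"u1": {"a": 5, "b": 5}, "u2": {"a": 4, "b": 4, "c": 1}}`, item `b` ranks as the most similar item to `a`.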
Hadoop-Cluster
Build Hadoop with Docker for Ubuntu. See releases for different architectures such as armv7l
Hadoop cluster on Docker (single host)
Compile hadoop in docker container
Apache Pig Latin script to count letters in multiple input text files, using the HortonWorks Hadoop Sandbox or Google Cloud Platform