shreshthajit / Hadoop

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware.


Hadoop

If the package option is not showing, mark that folder as the root folder ("Mark Directory as"); the package option will then appear.

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. We can run Hadoop using Docker.

Set up Hadoop cluster

MapReduce program link: YouTube

Questions: Link

First switch to the root user:

sudo su

Then, from /home/shreshthajit/docker-hadoop, copy the examples jar into the container:

docker cp ../Desktop/hadoop-mapreduce-examples-2.7.1-sources.jar 013d76107704:hadoop-mapreduce-examples-2.7.1-sources.jar

Docker problems: Problem

To stop a running container:
docker kill containerID

To remove a container:
docker rm containerID

To stop and start the docker-compose services:
docker-compose stop
docker-compose start
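
The kill and rm commands above need a container ID. As a minimal sketch, the ID can be looked up by container name instead of being copied by hand. The helper name `container_id` is hypothetical, and `myserver` in the usage comment is only an example name. Docker's `name` filter matches substrings, so the helper may print more than one ID.

```shell
# Hypothetical helper (not part of Docker): print the ID(s) of containers
# whose name contains $1. -a includes stopped containers, -q prints IDs only.
container_id() {
  docker ps -aq --filter "name=$1"
}

# Example usage on a machine where Docker is running:
#   docker kill "$(container_id myserver)"
#   docker rm   "$(container_id myserver)"
```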

First, we will set up Docker on Ubuntu.

Commands:

sudo apt install docker.io

docker --version

sudo systemctl status docker

sudo systemctl enable --now docker

sudo systemctl status docker

sudo docker run hello-world   # pulls the hello-world image if it is not present locally, then runs it

docker images
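
The installation steps above can be followed by a quick scripted check. This is a sketch only: `docker_ready` is a hypothetical helper name, and it assumes the shell it runs in has permission to talk to the Docker daemon (for example, a root shell).

```shell
# Hypothetical helper: succeed only if the docker CLI is installed and the
# daemon answers. `docker info` exits non-zero when it cannot reach the daemon.
docker_ready() {
  command -v docker >/dev/null 2>&1 || { echo "docker CLI not found" >&2; return 1; }
  docker info >/dev/null 2>&1 || { echo "docker daemon not reachable" >&2; return 1; }
  echo "docker is ready"
}

# Example usage:
#   docker_ready && sudo docker run hello-world
```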

Install Hadoop:

docker-compose --version

docker-machine --version

docker run -d -p 80:80 --name myserver nginx

Visit http://localhost to view the homepage of your new server.

Download this: link

Next, from the downloaded folder (the one containing docker-compose.yml), run:

docker-compose up -d

docker ps
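
Instead of reading the `docker ps` table by eye, a short check can confirm that a given container is up. The sketch below assumes the cluster includes a container named `namenode` (the name used with `docker exec` in the testing steps); `is_running` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a container with exactly this name is running.
# --format '{{.Names}}' prints one container name per line.
is_running() {
  docker ps --format '{{.Names}}' | grep -qxF -- "$1"
}

# Example usage after `docker-compose up -d`:
#   is_running namenode && echo "namenode is up"
```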

Go to link to view the current status of the system from the namenode.

Testing the Hadoop cluster:

docker exec -it namenode bash

mkdir input

echo "Hello World" >input/f1.txt

echo "Hello Docker" >input/f2.txt

hadoop fs -mkdir -p input

hdfs dfs -put ./input/* input
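
Before running a job, it can help to confirm that the files really landed in HDFS. The following is a minimal sketch meant to be run inside the namenode container; `show_hdfs_input` is a hypothetical helper name, while `hdfs dfs -ls` and `hdfs dfs -cat` are standard HDFS shell commands.

```shell
# Hypothetical helper: list the HDFS input directory, then print both files.
show_hdfs_input() {
  hdfs dfs -ls input &&
    hdfs dfs -cat input/f1.txt input/f2.txt
}

# Example usage inside the namenode container:
#   show_hdfs_input
```

If the upload worked, the listing should contain f1.txt and f2.txt, and the `-cat` call should print the two lines written earlier.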

docker container ls   # this command has to be run from the docker folder

Word count file link: link

docker cp ../hadoop-mapreduce-examples-2.7.1-sources.jar cb0c13085cd3:hadoop-mapreduce-examples-2.7.1-sources.jar

To run the WordCount job on the input files, use this command:

hadoop jar hadoop-mapreduce-examples-2.7.1-sources.jar org.apache.hadoop.examples.WordCount input output
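
To sanity-check the job result, the same counts can be reproduced locally with standard Unix tools. This is only an approximation of what WordCount computes (split on whitespace, count each distinct word), not the Hadoop implementation; `wordcount_local` is a hypothetical helper name.

```shell
# Local approximation of WordCount (not Hadoop itself): split on whitespace,
# count each distinct word, and print "word<TAB>count" sorted by word.
wordcount_local() {
  cat -- "$@" |
    tr -s '[:space:]' '\n' |
    sed '/^$/d' |
    LC_ALL=C sort |
    uniq -c |
    awk '{ print $2 "\t" $1 }'
}

# Example usage with the two files created earlier:
#   wordcount_local input/f1.txt input/f2.txt
# To print the real job result inside the namenode container (assuming the
# default single reducer, which writes a single part file):
#   hdfs dfs -cat output/part-r-00000
```

For the two sample files, the local helper prints Docker 1, Hello 2 and World 1, one tab-separated pair per line. Hadoop refuses to write into an existing output directory, so remove it with `hdfs dfs -rm -r output` before rerunning the job.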
