Lab 2: Running a Hadoop Application

Cloud Computing Workshop 2013, ITU

2: Running a Hadoop Application

Zubair Nabi

zubair.nabi@itu.edu.pk

April 18, 2013

Running Hadoop

The first order of the day is to format the Hadoop DFS

Jump to the Hadoop directory and execute: bin/hadoop namenode -format

To run Hadoop and HDFS: bin/start-all.sh

To terminate them: bin/stop-all.sh
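
The same steps as a minimal run-through (a sketch assuming a Hadoop 1.x single-node setup, run from the Hadoop installation directory; jps is only an optional sanity check):

# Format HDFS (first run only; reformatting makes any existing HDFS data inaccessible)
bin/hadoop namenode -format
# Start the HDFS and MapReduce daemons
bin/start-all.sh
# Optional check: jps should list NameNode, DataNode, SecondaryNameNode,
# JobTracker and TaskTracker
jps
# Shut everything down when done
bin/stop-all.sh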

Generating a dataset

Create a temporary directory to hold the data: mkdir /tmp/gutenberg

Jump to it: cd /tmp/gutenberg

Download text files:
wget www.gutenberg.org/etext/20417
wget www.gutenberg.org/etext/5000
wget www.gutenberg.org/etext/4300
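
After downloading, it is worth checking what was actually saved (the etext addresses are ebook landing pages, so the downloads may be HTML rather than plain text; this check is a sketch, not part of the original lab):

ls -lh /tmp/gutenberg
# 'file' reports whether each download is plain text or HTML
file /tmp/gutenberg/*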

Copying the dataset to the HDFS

Jump to the Hadoop directory and execute: bin/hadoop dfs -copyFromLocal /tmp/gutenberg /ccw/gutenberg
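
To confirm that the files arrived, list the target directory (Hadoop 1.x HDFS shell):

bin/hadoop dfs -ls /ccw/gutenberg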

Running Wordcount

Execute: bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /ccw/gutenberg /ccw/gutenberg-output
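
When the job completes, the output can be inspected directly in HDFS; a minimal sketch (the part-r-00000 name assumes the default single reducer):

bin/hadoop dfs -ls /ccw/gutenberg-output
bin/hadoop dfs -cat /ccw/gutenberg-output/part-r-00000 | head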

Retrieving results from the HDFS

Copy to the local FS: bin/hadoop dfs -getmerge /ccw/gutenberg-output /tmp/gutenberg-output
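
getmerge concatenates the part files into one local file, so ordinary shell tools can be used on the result; for example, to see the most frequent words (assuming the merged output landed in the file /tmp/gutenberg-output):

sort -nr -k2 /tmp/gutenberg-output | head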

Accessing the web interface

JobTracker: http://localhost:50030

TaskTracker: http://localhost:50060
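
A quick way to check that both daemons are serving their status pages (a curl sketch; a browser works just as well):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50030
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50060
# 200 (or a redirect code) means the web interface is up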

Reference(s)

Running Hadoop on Ubuntu Linux (Single-Node Cluster): http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
