2. Hadoop MapReduce Basic Tutorial

This tutorial extends Part 1, “1. Hadoop MapReduce Basic Tutorial”. The key difference is that this tutorial uses “TextInputFormat” instead of “KeyValueTextInputFormat”.

TextInputFormat reads each line of the Scores.data file as one record. The key is the byte offset at which the line starts in the file (0 for the first line), and the value is the whole line, e.g. “Science, 80, 75, 89, 90”.
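To make the offset key concrete, here is a minimal plain-Java sketch (no Hadoop dependency) that mimics the key/value pairs a mapper would receive. The class name `OffsetDemo`, the helper `toRecords`, and the second data line are made up for illustration, and the sketch assumes lines end with a single "\n".

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class OffsetDemo {

    // Maps the byte offset of each line start to the line content,
    // imitating the (key, value) records produced by TextInputFormat.
    // Assumption: lines are terminated by a single '\n'.
    static Map<Long, String> toRecords(String fileContent) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        for (String line : fileContent.split("\n")) {
            records.put(offset, line);
            // Advance by the line's byte length plus 1 for the newline.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        // The second line is hypothetical sample data, not from Scores.data.
        String data = "Science, 80, 75, 89, 90\nMaths, 70, 95, 60\n";
        toRecords(data).forEach((k, v) -> System.out.println(k + "\t" + v));
        // prints:
        // 0	Science, 80, 75, 89, 90
        // 24	Maths, 70, 95, 60
    }
}
```

The second key is 24 rather than 1 because the first line occupies 23 bytes plus the newline character.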

Mapper input with TextInputFormat.


Hadoop MapReduce Steps

Step 1: The Hadoop-based mapper class “ScoreMapper” can be executed in parallel on multiple nodes. It processes each input line as a key/value pair, e.g. 0/Science, 80, 75, 89, 90. Note that the input key is of type “LongWritable” instead of type “Text”, because TextInputFormat supplies the offset as the key and the whole line as the value.
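The per-record logic of such a mapper can be sketched in plain Java, without the Hadoop dependency. This is only an illustration under assumptions: the class name `ScoreMapperSketch` is hypothetical, and emitting one (subject, score) pair per score is a guess at what the Part 1 mapper does. In a real job this body would sit inside `map(LongWritable key, Text value, Context context)`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ScoreMapperSketch {

    // Stand-in for Mapper.map(): 'offset' plays the role of the LongWritable
    // key and 'line' the role of the Text value. The offset is ignored,
    // because the subject now has to be parsed out of the line itself.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String[] fields = line.split(",");
        if (fields.length < 2) {
            return out; // skip blank or malformed lines
        }
        String subject = fields[0].trim();
        for (int i = 1; i < fields.length; i++) {
            // Assumption: one (subject, score) pair is emitted per score.
            out.add(new SimpleEntry<>(subject, Integer.parseInt(fields[i].trim())));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(map(0L, "Science, 80, 75, 89, 90"));
        // prints: [Science=80, Science=75, Science=89, Science=90]
    }
}
```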

Step 2: The Hadoop-based reducer class “ScoreReducer” can also be executed in parallel on multiple nodes. It processes each key together with the values grouped under it, e.g. Science/80_75_89_90_. The output key/value pairs look like Science/max score is: 90. This is the same as in the Part 1 example.
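The reduce step boils down to taking the maximum of the scores grouped under one subject. A minimal plain-Java sketch follows; the class name `ScoreReducerSketch` is hypothetical, and representing the grouped values as a list of integers is an assumption. In a real job this logic would live in `reduce(Text key, Iterable<...> values, Context context)`.

```java
import java.util.Arrays;
import java.util.List;

public class ScoreReducerSketch {

    // Stand-in for Reducer.reduce(): receives one subject together with all
    // scores grouped under it and returns the output value for that subject.
    static String reduce(String subject, List<Integer> scores) {
        int max = Integer.MIN_VALUE;
        for (int score : scores) {
            max = Math.max(max, score);
        }
        return "max score is: " + max;
    }

    public static void main(String[] args) {
        String value = reduce("Science", Arrays.asList(80, 75, 89, 90));
        System.out.println("Science/" + value);
        // prints: Science/max score is: 90
    }
}
```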

Step 3: Finally, the executable main Java class “MaxScoreMain” ties everything together.

The “job.setInputFormatClass” call in the main class is given “TextInputFormat”. The output results will be exactly the same as in Part 1.
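A job wired up this way might look like the sketch below. It is not the original listing: the bodies of the nested mapper and reducer, the job name, and the use of `args[0]` and `args[1]` as input and output paths are assumptions made for illustration. It needs the Hadoop client libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxScoreMain {

    // Input key is LongWritable (the offset), input value is the whole line.
    public static class ScoreMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",");
            if (fields.length < 2) {
                return; // skip blank or malformed lines
            }
            Text subject = new Text(fields[0].trim());
            for (int i = 1; i < fields.length; i++) {
                context.write(subject, new IntWritable(Integer.parseInt(fields[i].trim())));
            }
        }
    }

    // Receives all scores for one subject and writes the maximum.
    public static class ScoreReducer extends Reducer<Text, IntWritable, Text, Text> {
        @Override
        protected void reduce(Text subject, Iterable<IntWritable> scores, Context context)
                throws IOException, InterruptedException {
            int max = Integer.MIN_VALUE;
            for (IntWritable score : scores) {
                max = Math.max(max, score.get());
            }
            context.write(subject, new Text("max score is: " + max));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "max score"); // job name is arbitrary
        job.setJarByClass(MaxScoreMain.class);
        job.setMapperClass(ScoreMapper.class);
        job.setReducerClass(ScoreReducer.class);

        // The one line that differs from a KeyValueTextInputFormat job.
        job.setInputFormatClass(TextInputFormat.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Assumption: input and output paths are passed on the command line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

TextInputFormat is also the default input format of a Job, so the explicit call mainly documents the intent.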

