01: Apache Hadoop HDFS Tutorial

Step 1: Download the latest version of “Apache Hadoop common” from using wget, curl or a browser. This tutorial uses “”.

Step 2: Set the Hadoop environment variables by appending the following commands to the ~/.bashrc file.

You can run this in a Unix command prompt as

Step 3: You can verify that Hadoop has been set up properly with

Step 4: The Hadoop file in $HADOOP_HOME/etc/hadoop/ holds the JAVA_HOME setting.

02: Java to write from/to Local to HDFS File System

This extends the Hadoop MapReduce Basic Tutorial and the Apache Hadoop HDFS Tutorial. This could have been done on the command line as shown below, after running “” to start the name and data nodes.

The focus of this tutorial is to do the same via Java and Hadoop APIs.

03: Create or append a file to HDFS – Hadoop API tutorial

Step 1: Create a simple Maven project named "simple-hadoop".

Step 2: Import the "simple-hadoop" Maven project into Eclipse or the IDE of your choice.

Step 3: Modify the pom.xml file to include 1) the relevant Hadoop libraries and 2) the shade plugin to create a single jar (i.e.
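The create-or-append decision behind this tutorial can be tried out without a cluster. The following is a minimal sketch against the local file system using java.nio.file, not the Hadoop API; the class name CreateOrAppend and the method createOrAppend are hypothetical names chosen for illustration. With Hadoop, the analogous calls would be FileSystem.exists, FileSystem.create and FileSystem.append on an org.apache.hadoop.fs.Path, assuming the target file system supports appends.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: a local-file-system stand-in for the HDFS create-or-append flow.
class CreateOrAppend {

    /**
     * Appends the text when the file already exists, otherwise creates the file.
     * Returns true when a new file was created and false when an existing one was extended.
     */
    static boolean createOrAppend(Path file, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        if (Files.exists(file)) {
            // Existing file: add the new bytes at the end.
            Files.write(file, bytes, StandardOpenOption.APPEND);
            return false;
        }
        // Missing file: create it, failing if it appears in the meantime.
        Files.write(file, bytes, StandardOpenOption.CREATE_NEW);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("simple-hadoop");
        Path file = dir.resolve("sample.txt");
        System.out.println("created: " + createOrAppend(file, "first line\n"));
        System.out.println("created: " + createOrAppend(file, "second line\n"));
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

The explicit exists check mirrors the two separate entry points (create and append) that the Hadoop FileSystem class exposes; on a local file system the same effect could be had in one call by passing both CREATE and APPEND.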

04: Create new or append to an existing AVRO file tutorial

This extends the "Create or append a file to HDFS – Hadoop API tutorial" to write an AVRO file to HDFS.

Step 1: Include the AVRO library files in the pom.xml file.

Step 2: AVRO files are schema-based.

05: Create or append a Sequence file to HDFS – Hadoop API tutorial

The following tutorial extends "Create or append a file to HDFS – Hadoop API tutorial" and "Create or append an AVRO file to HDFS – Hadoop & AVRO API tutorial". In this tutorial we will write to a Sequence file,

