1. HBase Tutorial: Getting Started

In standalone mode, HBase does not use HDFS; it uses the local filesystem instead. It runs all HBase daemons and a local ZooKeeper in the same JVM. ZooKeeper binds to a well-known port so that clients can talk to HBase.

Step 1: Download HBase from “http://hbase.apache.org/”. This tutorial uses hbase-1.0.3. Extract the download:
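A minimal sketch of the extract step, assuming the binary release was saved as hbase-1.0.3-bin.tar.gz in the current directory. The file name follows the usual Apache naming and is an assumption, not something stated in this tutorial.

```shell
# Assumed file name for the 1.0.3 binary release; adjust if your download differs.
HBASE_TARBALL="hbase-1.0.3-bin.tar.gz"

if [ -f "$HBASE_TARBALL" ]; then
  # Unpack the archive into the current directory.
  tar -xzf "$HBASE_TARBALL"
else
  echo "Download $HBASE_TARBALL first, then re-run this step." >&2
fi
```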

Step 2: Create a new directory to store data.
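For example, the directory could live under your home directory. The name hbase-data matches the path this tutorial uses later (file:///Users/akumaras/hbase-data), but any writable location should work.

```shell
# Location under the home directory; for the user "akumaras" on macOS this
# resolves to /Users/akumaras/hbase-data.
HBASE_DATA_DIR="$HOME/hbase-data"

mkdir -p "$HBASE_DATA_DIR"   # -p: no error if the directory already exists
ls -ld "$HBASE_DATA_DIR"     # confirm it exists and check its permissions
```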

Step 3: Set the HBase environment variables (such as HBASE_HOME) in your ~/.bashrc or ~/.profile.
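A minimal sketch of the lines to add, assuming HBase was extracted to hbase-1.0.3 under your home directory. The install location and the PATH entry are assumptions; adjust them to your setup.

```shell
# Assumed install location: wherever hbase-1.0.3 was extracted in Step 1.
export HBASE_HOME="$HOME/hbase-1.0.3"
# Put the HBase scripts (hbase, start-hbase.sh, stop-hbase.sh) on the PATH.
export PATH="$PATH:$HBASE_HOME/bin"
```

After editing the file, open a new terminal or run "source ~/.bashrc" so that the variables take effect.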

Step 4: In “$HBASE_HOME/conf/hbase-site.xml”, set the HBase root directory (hbase.rootdir) so that it points to the data directory created in Step 2.
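One way to write that file is sketched below. The hbase.rootdir value mirrors the file:/// path mentioned in this tutorial; the hbase.zookeeper.property.dataDir entry and its zookeeper-data location are assumptions added for illustration. Note that the command overwrites any existing hbase-site.xml.

```shell
# Fall back to an assumed install location if HBASE_HOME is not set.
HBASE_HOME="${HBASE_HOME:-$HOME/hbase-1.0.3}"
HBASE_DATA_DIR="$HOME/hbase-data"   # data directory from Step 2
mkdir -p "$HBASE_HOME/conf"

# hbase.rootdir: where HBase stores its data (file:// = local filesystem).
# hbase.zookeeper.property.dataDir: where the built-in ZooKeeper keeps its
# data (hypothetical location, not taken from the tutorial).
cat > "$HBASE_HOME/conf/hbase-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$HBASE_DATA_DIR</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>$HOME/zookeeper-data</value>
  </property>
</configuration>
EOF
```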

ZooKeeper is a high-performance coordination service for distributed applications such as HBase. It exposes common services like naming, configuration management, synchronization, and group services. HBase gives you the option of using its built-in ZooKeeper, which is started whenever you start HBase. That is not advisable on a production cluster, where it is better to run a dedicated ZooKeeper cluster and integrate it with your HBase cluster.

Note: Standalone mode does not require HDFS. It can run on the local filesystem, with a root directory like “file:///Users/akumaras/hbase-data”. To run fully distributed on more than one host, set the property hbase.cluster.distributed to true in hbase-site.xml, and point hbase.rootdir at the appropriate HDFS NameNode and the location in HDFS where you would like HBase to write data. For example, if your NameNode were running at myhdfs.namenode.host on port 9000 and you wanted to home your HBase in HDFS at /hbase, hbase.rootdir would be hdfs://myhdfs.namenode.host:9000/hbase.
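The distributed settings described in this note can be sketched as the fragment below. It is written to a scratch file (a hypothetical name) so that it can be reviewed and then merged into conf/hbase-site.xml on a real cluster, instead of overwriting the standalone configuration used in this tutorial.

```shell
# Scratch file (hypothetical name) so the fragment can be reviewed first.
FRAGMENT="${TMPDIR:-/tmp}/hbase-site-distributed.xml"

# Quoted 'EOF' keeps the text literal: nothing in it is expanded by the shell.
cat > "$FRAGMENT" <<'EOF'
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://myhdfs.namenode.host:9000/hbase</value>
  </property>
</configuration>
EOF

cat "$FRAGMENT"   # print the fragment for review
```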

Fully distributed mode also requires “conf/regionservers” to list every host that runs an HRegionServer, one host per line, similar to the etc/hadoop/slaves file in Hadoop.
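As an illustration of the one-host-per-line layout, the sketch below writes three placeholder host names to a scratch file. The host names and the scratch location are hypothetical; on a real cluster the content would go into conf/regionservers.

```shell
# Scratch copy (hypothetical location); copy it to conf/regionservers on a cluster.
REGIONSERVERS_FILE="${TMPDIR:-/tmp}/regionservers"

# Placeholder host names: replace with the hosts that run HRegionServer.
printf '%s\n' \
  rs1.example.com \
  rs2.example.com \
  rs3.example.com > "$REGIONSERVERS_FILE"

cat "$REGIONSERVERS_FILE"
```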

Step 5: Start HBase with the start-hbase.sh script in “$HBASE_HOME/bin”.

List the running Java processes (for example, with the JDK's jps command) to confirm that HBase started.

Output: The process list should include HMaster, which shows that HBase is up and running.
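Step 5 as a whole can be sketched as follows, assuming the install location from the earlier steps. The helper function hmaster_running is a hypothetical name introduced here for illustration; jps ships with the JDK.

```shell
HBASE_HOME="${HBASE_HOME:-$HOME/hbase-1.0.3}"   # assumed install location
START_SCRIPT="$HBASE_HOME/bin/start-hbase.sh"

# Hypothetical helper: succeeds if jps reports a JVM whose main class is HMaster.
hmaster_running() {
  jps 2>/dev/null | grep -q 'HMaster'
}

if [ -x "$START_SCRIPT" ]; then
  "$START_SCRIPT"
  if hmaster_running; then
    echo "HMaster is running"
  else
    echo "HMaster not found in the jps output; check the logs under $HBASE_HOME/logs" >&2
  fi
else
  echo "Not found: $START_SCRIPT (check HBASE_HOME)" >&2
fi
```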

Step 6: Start the HBase shell with “hbase shell” to run commands that create tables, column families, and so on.

Step 7: Create a table in HBase.

Step 8: List the table.

Step 9: Add some data to “row1”.

Step 10: Display data using “scan” and “get” commands.
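Steps 6 to 10 can be sketched as one script that is passed to the HBase shell. The table name test, the column family cf, the column cf:a and the value value1 are placeholders chosen for illustration, not values taken from this tutorial; only “row1” comes from the steps above. The script file name is hypothetical as well.

```shell
# Scratch file (hypothetical name) holding the HBase shell commands.
HBASE_CMDS="${TMPDIR:-/tmp}/hbase-quickstart.txt"

cat > "$HBASE_CMDS" <<'EOF'
# Step 7: create a table named 'test' with one column family 'cf'
create 'test', 'cf'
# Step 8: list the table
list 'test'
# Step 9: add a cell to row1
put 'test', 'row1', 'cf:a', 'value1'
# Step 10: read the data back, first the whole table, then a single row
scan 'test'
get 'test', 'row1'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
  hbase shell "$HBASE_CMDS"   # runs the commands in the file, then exits
else
  echo "hbase is not on the PATH; commands saved in $HBASE_CMDS" >&2
fi
```

The same commands can be typed one at a time at the interactive prompt after running “hbase shell”. Running the script a second time would fail at the create step, because the table already exists.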

Step 11: Exit the HBase shell with “exit” and stop the HBase server with the stop-hbase.sh script.

Accessing HBase via Java Client API

Step 1: Add the hbase-client dependency (groupId org.apache.hbase, artifactId hbase-client) to the pom.xml file.

Step 2: Write the client code that connects to the server.

Make sure that the “HMaster” is running, and run the client code.
