Blog Archives

01: Spark tutorial – writing a file from a local file system to HDFS

This tutorial assumes that you have set up Cloudera as per the “cloudera quickstart vm tutorial installation” YouTube videos, which you can find by searching Google or YouTube. You can install it on...



01. Setting up Scala & practicing the concepts via REPL the Scala way for Java developers

Scala runs on the JVM, so Java and Scala stacks can be freely mixed. You can call Java libraries from Scala. Having said this, it is very important that you learn to write code the Scala way,




01a: Convert XML file To Sequence File – writing & reading – Local File System

Sequence files are good for saving raw data into HDFS. They are compressible and splittable, and are also useful for combining a number of smaller files into a single...



01A: Spark on Zeppelin – Docker pull from Docker hub

Prerequisite: Docker is installed on your machine, on either Mac OS X (e.g. $ brew cask install docker) or Windows 10. Docker interview Q&As.

What is Apache Zeppelin?




01b: Convert XML file To Sequence File – writing & reading – Hadoop File System (i.e. HDFS)

This extends Convert XML file To Sequence File – writing & reading – Local File System. Step 1: Upload “report.xml” onto HDFS, e.g. using Cloudera HUE, on to path...



01B: Spark on Zeppelin – custom Dockerfile

Prerequisite: Docker is installed on your machine, on either Mac OS X (e.g. $ brew cask install docker) or Windows 10. Docker interview Q&As.

What is Apache Zeppelin?




01B: Spark tutorial – writing to HDFS from Spark using Hadoop API

Step 1: The “pom.xml” that defines the dependencies for Spark & Hadoop APIs. Step 2: The Spark job that writes numbers 1 to 10 to 10 different files on HDFS....



02: Apache Flume with Custom classes for JMS Source & HDFS Sink

This post extends 01: Apache Flume with JMS source (Websphere MQ) and HDFS sink to write Flume customization code. We will be customizing 3 things. 1) Customized JMS Source message...



02: Apache Kafka multi-broker cluster tutorial

This extends the Getting started with Apache Kafka on Mac tutorial. It assumes that the ZooKeeper & Kafka servers are started as per the previous tutorial. List topics. Create a topic...



02: Apache Spark – local mode on Docker tutorial with Java & Maven

This extends 01: Docker tutorial with Java & Maven. This runs Spark in local mode. You build the Spark code as a jar file and run it as a Java...


