Blog Archives
Page 1 of 58
1 2 3 58

00: Apache Spark eco system & anatomy interview Q&As

Q01. Can you summarise the Spark eco system?
A01. Apache Spark is a general purpose cluster computing system. It provides high-level API in Java, Scala, Python, and R. It has 6 components Core, Spark SQL, Spark Streaming, Spark MLlib,

Read more ›



00: Data Lake Vs. Data Warehouse Vs. Delta Lake

Modern data architectures will have both the Data Lakes & Data Warehouses. Q1. What questions do you need to ask for choosing a Data Warehouse over a Data Lake for your BI (i.e. Business Intelligence) reporting? A1. The gap between a data lake & … Read more ›...

Members Only Content
Log In Register Home


00: Q1 – Q6 Hadoop based Big Data architecture & basics interview Q&As

There are a number of technologies to ingest & run analytical queries over Big Data (i.e. large volume of data). Big Data is used in Business Intelligence (i.e. BI) reporting, Data Science, Machine Learning, and Artificial Intelligence (i.e. AI). Processing a large volume of data will be intensive on disk I/O,

Read more ›



01: Getting started with Zookeeper tutorial

Installing Zookeepr on Windows Step 1: Download Zookeeper from http://zookeeper.apache.org/. At the time of writing downloading zookeeper-3.4.11.tar.gz. Step 2: Using 7-zip on windows unpack the gzipped tar file into a folder. E.g. c:\development\zookeeper-3.4.11. you can see “zkServer.cmd” in the bin folder for windows & … Read more ›...

Members Only Content
Log In Register Home


01: Apache Flume with JMS source (Websphere MQ) and HDFS sink

Apache Flume is used in the Hadoop ecosystem for ingesting data. In this example, let’s ingest data from Websphere MQ. Step 1: Apache flume is config driven. Hierarchy driven flume config flumeWebsphereMQQueue.conf file. You need to define the “source“, “ … Read more ›...

Members Only Content
Log In Register Home


Page 1 of 58
1 2 3 58

800+ Java Interview Q&As Menu

Top