Blog Archives

00: 13 Data Warehouse interview Q&As – Fact Vs Dimension, CDC, SCD, etc

Q1. What is dimensional modelling in a Data Warehouse (i.e. DWH)?
A1. A dimensional model is a data structure technique optimised for Data Warehousing tools (i.e. OLAP products). Dimensional modelling is built on Fact and Dimension tables.




00: Apache Spark ecosystem & anatomy interview Q&As

Q01. Can you summarise the Spark ecosystem?
A01. Apache Spark is a general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python, and R. It has 6 components: Core, Spark SQL, Spark Streaming, Spark MLlib,




00: Data Lake Vs. Data Warehouse Vs. Delta Lake

Modern data architectures will have both Data Lakes & Data Warehouses.

Q1. What questions do you need to ask when choosing a Data Warehouse over a Data Lake for your BI (i.e. Business Intelligence) reporting?
A1. The gap between a data lake & …



00: Q1 – Q6 Hadoop based Big Data architecture & basics interview Q&As

There are a number of technologies to ingest & run analytical queries over Big Data (i.e. large volumes of data). Big Data is used in Business Intelligence (i.e. BI) reporting, Data Science, Machine Learning, and Artificial Intelligence (i.e. AI). Processing a large volume of data will be intensive on disk I/O,




01: Getting started with Zookeeper tutorial

Installing Zookeeper on Windows

Step 1: Download Zookeeper from http://zookeeper.apache.org/. At the time of writing, the download is zookeeper-3.4.11.tar.gz.

Step 2: Using 7-Zip on Windows, unpack the gzipped tar file into a folder, e.g. c:\development\zookeeper-3.4.11. You can see “zkServer.cmd” in the bin folder for Windows & …



01: Apache Flume with JMS source (Websphere MQ) and HDFS sink

Apache Flume is used in the Hadoop ecosystem for ingesting data. In this example, let’s ingest data from Websphere MQ.

Step 1: Apache Flume is config driven. The Flume config file flumeWebsphereMQQueue.conf is hierarchy driven. You need to define the “source”, “ …



01: Apache Hadoop HDFS Tutorial

Step 1: Download the latest version of “Apache Hadoop common” from http://apache.claz.org/hadoop using wget, curl or a browser. This tutorial uses “http://apache.claz.org/hadoop/core/hadoop-2.7.1/”.

Step 2: You can set the Hadoop environment variables by appending the following commands to the ~/.bashrc file.
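
The original commands are not shown in this excerpt, so here is a minimal sketch of what such ~/.bashrc entries typically look like. The install location $HOME/hadoop-2.7.1 is an assumption for illustration, not the tutorial's own path; point HADOOP_HOME at wherever you unpacked the download from Step 1.

```shell
# Assumption: Hadoop 2.7.1 was unpacked into $HOME/hadoop-2.7.1 (hypothetical path).
export HADOOP_HOME="$HOME/hadoop-2.7.1"
# Hadoop 2.x keeps its configuration files under etc/hadoop.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
# Put the hadoop/hdfs commands (bin) and the start/stop scripts (sbin) on the PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```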

You can apply this in the current Unix command prompt by sourcing the file, e.g. source ~/.bashrc.

Step 3: You can verify that Hadoop has been set up properly with the hadoop version command.
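
The verification command is not shown in this excerpt. A minimal sketch, assuming the standard hadoop version command is what is meant; the check_hadoop helper is a hypothetical name added only to give a readable message when the command is not yet on the PATH.

```shell
# Prints the installed Hadoop version when the hadoop command is on the PATH.
# Otherwise reports that the Step 2 environment is not in effect and returns 1.
check_hadoop() {
  if command -v hadoop >/dev/null 2>&1; then
    hadoop version
  else
    echo "hadoop not found on PATH; re-check HADOOP_HOME and PATH" >&2
    return 1
  fi
}

# Run the check without aborting the shell session if Hadoop is missing.
check_hadoop || true
```

If the command is not found, the usual cause is that ~/.bashrc has not been reloaded in the current shell.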

Step 4: The Hadoop file $HADOOP_HOME/etc/hadoop/hadoop-env.sh holds the JAVA_HOME setting.



