Blog Archives
1 2 3 4 5 11

00: 13 Data Warehouse interview Q&As – Fact Vs Dimension, CDC, SCD, etc

Q1. What is dimensional modelling in a Data Warehouse (i.e. DWH)?
A1. A dimensional model is a data structure technique optimised for Data Warehousing tools (i.e. OLAP products). The concept of Dimensional Modelling is comprised of Fact and Dimension tables.

Read more ›



00: 15+ Apache Spark best practices, memory mgmt & performance tuning interview FAQs – Part-1

There are so many different ways to solve the big data problems at hand in Spark, but some approaches can impact on performance, and lead to performance and memory issues. Here are some best practices to keep in mind when writing Spark jobs. General Best Practices #1 Favor DataFrames over...

Members Only Content
Log In Register Home


00: 15+ Apache Spark best practices, memory mgmt & performance tuning interview FAQs – Part-2

This extends 15+ Apache Spark best practices, memory mgmt & performance tuning interview FAQs – Part-1, where best practices 1-6 were covered with examples & diagrams.

#7 Use Spark UI: Running Spark jobs without inspecting the Spark UI is a definite NO.

Read more ›



00: Apache Spark eco system & anatomy interview Q&As

Q01. Can you summarise the Spark eco system?
A01. Apache Spark is a general purpose cluster computing system. It provides high-level API in Java, Scala, Python, and R. It has 6 components Core, Spark SQL, Spark Streaming, Spark MLlib,

Read more ›



00: Data Lake Vs. Data Warehouse Vs. Delta Lake

Modern data architectures will have both the Data Lakes & Data Warehouses. Q1. What questions do you need to ask for choosing a Data Warehouse over a Data Lake for your BI (i.e. Business Intelligence) reporting? A1. The gap between a data lake & … Read more ›...

Members Only Content
Log In Register Home


00: Q1 – Q6 Hadoop based Big Data architecture & basics interview Q&As

There are a number of technologies to ingest & run analytical queries over Big Data (i.e. large volume of data). Big Data is used in Business Intelligence (i.e. BI) reporting, Data Science, Machine Learning, and Artificial Intelligence (i.e. AI). Processing a large volume of data will be intensive on disk I/O,

Read more ›



01: Lambda, Kappa & Delta Data Architectures Interview Q&As – Overview

Q1. What is the Lambda Architecture? A1. It is a data-processing architecture designed to handle Big Data by using both real-time streaming (e.g. Spark streaming, Apache Storm) and batch processing (E.g. Hive, Pig, Spark batch). This means you have to build 2 separate pipelines. … Read more ›...

Members Only Content
Log In Register Home


1 2 3 4 5 11

800+ Java Interview Q&As Menu

Learn by categories on the go...
Learn by categories such as FAQs – Core Java, Key Area – Low Latency, Core Java – Java 8, JEE – Microservices, Big Data – NoSQL, Architecture – Distributed, Big Data – Spark, etc. Some posts belong to multiple categories.
Top