Blog Archives
1 2

00: 13 Data Warehouse interview Q&As – Fact Vs Dimension, CDC, SCD, etc – part 1

Q1. What is dimensional modelling in a Data Warehouse (i.e. DWH)?
A1. A dimensional model is a data structure technique optimised for Data Warehousing tools (i.e.

Read more ›



00: Apache Spark eco system & anatomy interview Q&As

Q01. Can you summarise the Spark eco system?
A01. Apache Spark is a general purpose cluster computing system. It provides high-level API in Java,

Read more ›



00: Data Lake Vs. Data Warehouse Vs. Delta Lake

Modern data architectures will have both the Data Lakes & Data Warehouses. The Data Engineers build the data pipelines for the data analysts and scientists to build business reports &

Read more ›



00: Q1 – Q6 Hadoop based Big Data architecture & basics interview Q&As

There are a number of technologies to ingest & run analytical queries over Big Data (i.e. large volume of data). Big Data is used in Business Intelligence (i.e. BI) reporting,

Read more ›



01: Databricks getting started – Spark, Shell, SQL


Step 1:
Signup to Databricks community edition – https://databricks.com/try-databricks. Fill in the details and you can leave your mobile number blank. Select “

Read more ›



01: Python Iterators, Generators & Decorators Tutorial

Assumes that Python3 is installed as described in Getting started with Python.

1. Iterators

Iterators don’t compute the value of each item when instantiated. They only compute it when you ask for it.

Read more ›



01: Scala Functional Programming basics – pure functions, referential transparency & side effects

Q1. What is a pure function?
A1. A pure function is a function where the following conditions are met:

1) The Input solely determines the output.

Read more ›



02: Databricks – Spark schemas, casting & PySpark API

Prerequisite: Extends Databricks getting started – Spark, Shell, SQL.

Q: What is a Dataframe?
A: A DataFrame is a data abstraction or a domain-specific language (DSL) for working with structured and semi-structured data,

Read more ›



03: Databricks – Spark SCD Type 1

Prerequisite: Extends Databricks getting started – Spark, Shell, SQL.

What is SCD Type 1

SCD stands for Slowly Changing Dimension,

Read more ›



10 Distributed storage & computing systems interview Q&As

A distributed system consists of multiple software components that are on multiple computers (aka nodes), but run as a single system. These components can be stateful, stateless, or serverless, and these components can be created in different languages running on hybrid environments and developing open-source technologies,

Read more ›



1 2

300+ Java Developer Interview Q&As

800+ Java Interview Q&As

Top