Blog Archives
Page 1 of 2
1 2

00: Apache Spark eco system & anatomy interview Q&As

Q01. Can you summarise the Spark eco system?
A01. Apache Spark is a general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python, and R. It has 6 …



02: Cleansing & pre-processing data in BigData & machine learning with Spark interview Q&As

Q1. Why are data cleansing & pre-processing important in analytics & machine learning?
A1. Garbage in, garbage out, no matter how good your machine learning algorithm is.
Q2. What …



12 Apache Spark getting started interview Q&As

Q01. Where is Apache Spark used in the Hadoop eco system?
A01. Spark is essentially a data processing framework that is faster & more flexible than MapReduce. Spark itself …



14: Q105 – Q108 Spark “map” vs “flatMap” interview questions & answers

Q105. What is the difference between “map” and “flatMap” operations in Spark?
A105. Both map and flatMap are transformation operations in Spark. The map transformation is applied to each element of an RDD …
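
The difference can be sketched without a cluster. The snippet below is a minimal illustration in plain Python, where an ordinary list stands in for an RDD; it is an assumption made for illustration, not the Spark API and not the code from the full post. The names `lines`, `mapped` and `flat_mapped` are hypothetical.

```python
from itertools import chain

# Hypothetical input: a plain list standing in for an RDD of text lines.
lines = ["hello world", "apache spark"]

# map-style: exactly one output element per input element.
# Each line becomes one list of words, so the result is a list of lists.
mapped = [line.split(" ") for line in lines]

# flatMap-style: each input element may produce zero or more outputs,
# and the per-element results are flattened into a single sequence.
flat_mapped = list(chain.from_iterable(line.split(" ") for line in lines))

print(mapped)
print(flat_mapped)
```

With map the output keeps one entry per input line, whereas the flatMap-style version yields one entry per word.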



15: Q109 – Q113 Spark RDD partitioning and “mapPartitions” interview questions & answers

Q109. What is the difference between “map” and “mapPartitions” transformations in Spark?
A109. The map method converts each element of the source RDD into a single element of the result RDD …
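
As a rough sketch of the distinction, the plain Python below models an RDD as a list of partitions and counts how often the user function is invoked. It is not Spark code; `partitions`, `double`, `double_all` and `calls` are hypothetical names introduced only for this illustration.

```python
# Hypothetical stand-in for an RDD with two partitions (plain lists, no Spark).
partitions = [[1, 2, 3], [4, 5]]
calls = {"per_element": 0, "per_partition": 0}

def double(x):
    # map-style function: receives a single element.
    calls["per_element"] += 1
    return x * 2

def double_all(elements):
    # mapPartitions-style function: receives an iterator over a whole partition.
    calls["per_partition"] += 1
    return [x * 2 for x in elements]

# map-style: the function runs once for every element.
map_result = [[double(x) for x in part] for part in partitions]

# mapPartitions-style: the function runs once for every partition.
map_partitions_result = [list(double_all(iter(part))) for part in partitions]

print(map_result, map_partitions_result, calls)
```

Both approaches give the same values here; what differs is the number of function calls, which is why per-partition processing is commonly preferred when each call carries a one-off setup cost.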



15+ Apache Spark best practices, memory mgmt & performance tuning interview FAQs – Part-1

There are many ways to solve a given big data problem in Spark, but some approaches degrade performance and cause memory issues. Here …



15+ Apache Spark best practices, memory mgmt & performance tuning interview FAQs – Part-2

This extends 15+ Apache Spark best practices, memory mgmt & performance tuning interview FAQs – Part-1. #7 Use Spark UI: Running Spark jobs without inspecting the Spark UI is a definite …



17: Spark interview Q&As with coding examples in pyspark (i.e. python)

Q01. How will you create a Spark context?
A01. …



Debugging Spark applications written in Java locally by connecting to HDFS, Hive and HBase

This extends Remotely debugging Spark submit Jobs in Java.

Running Spark in local mode

When you run Spark in local mode, both the Driver and Executor will be running in the …



Spark interview Q&As with coding examples in Scala – part 1

Some of these basic Apache Spark interview questions can make or break your chance to get an offer.

Q01. Why is “===” used in the below Dataframe join?



Spark interview Q&As with coding examples in Scala – part 2

This extends Spark interview Q&As with coding examples in Scala – part 1 with the key optimisation concepts.

Partition Pruning

Q13. What do you understand by the concept of Partition Pruning? …



Spark interview Q&As with coding examples in Scala – part 3

This extends Spark interview Q&As with coding examples in Scala – part 2 with more coding examples in a Databricks notebook. Prerequisite: Create a free account as per Databricks getting started. …



Spark interview Q&As with coding examples in Scala – part 4

This extends Spark interview Q&As with coding examples in Scala – part 3 with more coding examples in a Databricks notebook. Prerequisite: Create a free account as per Databricks getting started. …


