Blog Archives

06: Learn how to access Hive from Spark via Spark SQL & DataFrames by example

These Hadoop tutorials assume that you have installed Cloudera QuickStart, which bundles the Hadoop ecosystem: HDFS, Spark, Hive, HBase, YARN, etc. This example extends "Learn Hive to write to and read from AVRO & Parquet files by examples" to access the Hive metastore via Spark SQL. You run the…

Read more ...

10 Spark SQL Interview Q&As

Q1. What is Spark SQL?
A1. Apache Spark SQL is a module for structured data processing in Spark. Spark SQL integrates relational processing (i.e. SQL) with Spark’s functional programming API in Scala, Java, etc., so that you can weave SQL queries with DataFrame/Dataset-based transformations. It provides support for various data sources as shown below:

Q2. What libraries does Spark SQL have?

1. Data Source API

This library has built-in support for the various data sources shown above.… Read more ...

17: Spark interview Q&As with coding examples in PySpark (i.e. Python)

Q01. How will you create a Spark context? A01.

Read more ...

Apache Spark SQL join types interview questions and answers

Q1. What are the different Spark SQL join types?
A1. There are different SQL join types like inner join, left/right outer joins, full outer join, left semi-join, left anti-join and self-join.
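To make the differences between these join types concrete, here is a minimal sketch in plain Python rather than the Spark API. The `employees` and `departments` tables and the `join` helper are hypothetical and exist only for illustration; the `how` strings follow the join type names accepted by PySpark's `DataFrame.join`, and NULL join keys are not handled.

```python
# Hypothetical sample tables (not taken from the original post).
employees = [
    {"emp_id": 1, "name": "Ann", "dept_id": 10},
    {"emp_id": 2, "name": "Bob", "dept_id": 20},
    {"emp_id": 3, "name": "Cid", "dept_id": 99},  # no matching department
]
departments = [
    {"dept_id": 10, "dept": "Sales"},
    {"dept_id": 20, "dept": "IT"},
    {"dept_id": 30, "dept": "HR"},  # no matching employee
]


def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`, imitating SQL join semantics."""
    # Index the right table by join key for quick lookups.
    right_by_key = {}
    for r in right:
        right_by_key.setdefault(r[key], []).append(r)
    left_cols = [c for c in (left[0] if left else {}) if c != key]
    right_cols = [c for c in (right[0] if right else {}) if c != key]

    rows, matched_keys = [], set()
    for l in left:
        matches = right_by_key.get(l[key], [])
        if how == "left_semi":
            # Keep left rows that have a match; only left columns.
            if matches:
                rows.append(dict(l))
        elif how == "left_anti":
            # Keep left rows that have no match; only left columns.
            if not matches:
                rows.append(dict(l))
        elif matches:
            matched_keys.add(l[key])
            rows.extend({**l, **r} for r in matches)
        elif how in ("left_outer", "full_outer"):
            # Unmatched left row: pad the right columns with None.
            rows.append({**l, **{c: None for c in right_cols}})

    if how in ("right_outer", "full_outer"):
        # Unmatched right rows: pad the left columns with None.
        for r in right:
            if r[key] not in matched_keys:
                rows.append({**{c: None for c in left_cols}, **r})
    return rows


for how in ("inner", "left_outer", "right_outer",
            "full_outer", "left_semi", "left_anti"):
    print(how, [row.get("name") for row in join(employees, departments, "dept_id", how)])
```

A self-join is not a separate algorithm: it is any of the above joins applied to a table and a copy of itself, for example `join(employees, employees, "dept_id")` to pair employees who share a department.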

Q2. Given the below tables, can you give examples of the above join types?

Read more ...


FAQs are marked with 🔥, as these questions are not only more popular with interviewers but also cover what is required to build robust systems. If you are an interviewer, cover a well-rounded set of topics to judge real experience.

Don't be overwhelmed by the number of questions; the technology stacks are vast. The quality of your answers to the key technical & open-ended questions, along with your soft skills & attitude, will go a long way towards getting job offers.

Note: Some Q&As belong to more than one category.