Blog Archives

06: Learn how to access Hive from Spark via SparkSQL & Dataframes by example

These Hadoop tutorials assume that you have installed Cloudera QuickStart, which includes Hadoop ecosystem components such as HDFS, Spark, Hive, HBase, YARN, etc. This example extends "Learn Hive to write to and read from AVRO & Parquet files by examples" to access the Hive metastore via Spark SQL. …


10 Spark SQL Interview Q&As

Q1. What is Spark SQL? A1. Apache Spark SQL is a module for structured data processing in Spark. Spark SQL integrates relational processing (i.e. SQL) with Spark’s functional programming API in Scala, Java, etc., so that you can weave SQL queries with DataFrame/Dataset-based transformations. It provides support for various data sources as shown below:...


17: Spark interview Q&As with coding examples in pyspark (i.e. python)

Q01. How will you create a Spark context? Q02. How will you create a DataFrame by reading a file from an AWS S3 bucket? Q03. How will you create a DataFrame by reading a table in a database? …


Spark SQL joins & performance tuning interview questions & answers

Q1. What are the different types of Spark SQL joins? A1. There are 3 types of joins. 1) Sort Merge Join – used when both table 1 & table 2 are large. Both tables are shuffled & sorted by the join keys, and the sorted rows are then merged. …
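
The sort-and-merge step of a sort merge join can be illustrated with a minimal, plain-Python sketch. This is not Spark's implementation: the function name `sort_merge_join` and the sample `orders` and `customers` rows are hypothetical, and the sketch assumes a single partition, whereas Spark first shuffles rows so that equal join keys land in the same partition.

```python
from typing import Any, Iterable, List, Tuple

Row = Tuple[Any, Any]  # (join_key, value); hypothetical row shape for this sketch


def sort_merge_join(left: Iterable[Row], right: Iterable[Row]) -> List[Tuple[Any, Any, Any]]:
    """Inner equi-join of two (key, value) collections by sorting, then merging."""
    # Sort phase: order both sides by the join key.
    # (In Spark this happens per partition, after the shuffle.)
    l = sorted(left, key=lambda row: row[0])
    r = sorted(right, key=lambda row: row[0])

    joined: List[Tuple[Any, Any, Any]] = []
    i = j = 0
    # Merge phase: advance two cursors over the sorted inputs.
    while i < len(l) and j < len(r):
        left_key, right_key = l[i][0], r[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right-side rows that share this key.
            j_end = j
            while j_end < len(r) and r[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(l) and l[i][0] == left_key:
                for k in range(j, j_end):
                    joined.append((left_key, l[i][1], r[k][1]))
                i += 1
            j = j_end
    return joined


# Hypothetical sample data: (customer_id, order) and (customer_id, name).
orders = [(2, "o-b"), (1, "o-a"), (2, "o-c"), (4, "o-d")]
customers = [(3, "carol"), (2, "bob"), (1, "alice")]

result = sort_merge_join(orders, customers)
print(result)
```

Because both inputs are sorted, each side is scanned once during the merge, which is why this strategy suits two large tables where neither side is small enough to be held in memory as a lookup table.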

