Spark SQL: reading a Hive table

What is Spark SQL?

Spark SQL is a Spark module for structured data processing. It lets you query structured data with SQL, including operations such as joins.

Step 1: Add the Spark SQL and Spark Hive libraries as dependencies in the pom.xml file.

Step 2: Open a Unix terminal window, and run the following if you are running in local mode.

Step 3: Write a Spark job in Java that reads data from the Hive table parquet_order in the database “learnhadoop”, which was created previously over Parquet data.
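A minimal sketch of such a job follows. It is not the original listing. It assumes Spark 2.x or later (the SparkSession API), the Spark SQL and Spark Hive libraries from Step 1 on the classpath, local mode, and a hive-site.xml on the classpath so that Spark can reach the Hive metastore. The class name ReadHiveTableSketch and the helper selectAllFrom are hypothetical names chosen for illustration. On Spark 1.x a HiveContext would be used instead.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReadHiveTableSketch {

    // Hypothetical helper: builds the query text for a database-qualified table.
    // Kept separate from the Spark calls so it can be checked without a cluster.
    static String selectAllFrom(String database, String table) {
        return "SELECT * FROM " + database + "." + table;
    }

    public static void main(String[] args) {
        // Assumption: local mode; drop .master(...) when submitting to YARN.
        SparkSession spark = SparkSession.builder()
                .appName("ReadHiveTableSketch")
                .master("local[*]")
                .enableHiveSupport() // needs the Spark Hive library on the classpath
                .getOrCreate();

        // Query the Hive table named in the text: learnhadoop.parquet_order.
        Dataset<Row> orders = spark.sql(selectAllFrom("learnhadoop", "parquet_order"));

        orders.printSchema(); // column names and types as seen by Spark
        orders.show(10);      // print up to the first 10 rows

        spark.stop();
    }
}
```

Without a hive-site.xml pointing at the existing metastore, Spark in local mode creates an empty embedded metastore, and the query is likely to fail because the table is not found.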

Output:

