FAQs Data: 02: Getting started with Spark Scala on IntelliJ IDEA with Maven

This extends FAQs Data: 01: Getting started with Spark Scala on IntelliJ IDEA to package the Spark application & run it from the command line. The dependent libraries are configured via the Maven pom.xml. Make sure Maven is installed on your machine & its bin directory is on the PATH.
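One way to confirm the Maven prerequisite is a quick check from a terminal. The following is a minimal sketch for a POSIX shell; the variable name MAVEN_STATUS is only illustrative and not part of the original setup.

```shell
# Check that the mvn executable is reachable through PATH.
if command -v mvn >/dev/null 2>&1; then
  MAVEN_STATUS="found"
  # Print the Maven and Java versions; a failure here often points at JAVA_HOME.
  mvn -version || echo "mvn is on PATH but failed to run; check JAVA_HOME." >&2
else
  MAVEN_STATUS="missing"
  echo "Maven not found; install it and add its bin directory to PATH." >&2
fi
echo "Maven: $MAVEN_STATUS"
```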

Step 1: Extend the previous pom.xml file from

TO:

Step 2: Rename “src/test/java” to “src/test/scala” via the right-click context menu & “Refactor”.

Step 3: Remove the “.master("local[*]")” call from the SparkSession builder, as the master can be supplied on the command line (the spark-submit “--master” option) to run in local mode or cluster mode.

Step 4: Package the project with Maven by running “Maven” –> “Package”.
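The same packaging step can also be run from a terminal instead of the IDE. Below is a minimal sketch, assuming a POSIX shell and that Maven is on the PATH; the helper name package_project is hypothetical and only guards against running Maven outside the project root.

```shell
# Hypothetical helper: build the jar only when called from a Maven project root.
package_project() {
  if [ ! -f pom.xml ]; then
    echo "No pom.xml in $(pwd); change to the project root first." >&2
    return 1
  fi
  # Runs the Maven lifecycle up to the package phase, then lists the jar(s) in target/.
  mvn package && ls target/*.jar
}
```

Call package_project from the directory that contains pom.xml; for this project the jar should appear under “target/”.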

IntelliJ IDEA IDE – Spark, Scala & Maven

Step 5: Right-click “target/spark-on-scala2-1.0-SNAPSHOT.jar” & choose “Open In” –> “Terminal” from the context menu.

Step 6: Spark applications are run from the command line via spark-submit.

Hence we can run our packaged Spark code as:
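A hedged sketch of such an invocation follows. It assumes the terminal was opened in the “target” directory as in Step 5, and that spark-submit from a local Spark installation is on the PATH. The main class name com.example.SimpleSpark is a placeholder, since the real class comes from the previous tutorial; replace it with your own fully qualified object name.

```shell
# General form:
#   spark-submit --class <main-class> --master <master-url> <application-jar> [application-arguments]

MAIN_CLASS="com.example.SimpleSpark"        # hypothetical; use your own main class
APP_JAR="spark-on-scala2-1.0-SNAPSHOT.jar"  # jar produced by the Maven package step
MASTER="local[*]"                           # local mode; use a cluster master URL instead if needed

# Keep a printable copy of the command (local[*] is quoted so the shell does not glob it).
SUBMIT_CMD="spark-submit --class $MAIN_CLASS --master '$MASTER' $APP_JAR"

if command -v spark-submit >/dev/null 2>&1 && [ -f "$APP_JAR" ]; then
  spark-submit --class "$MAIN_CLASS" --master "$MASTER" "$APP_JAR"
else
  # Either Spark is not on the PATH or the jar is not in the current directory.
  echo "Would run: $SUBMIT_CMD"
fi
```

Because the master is passed with “--master” here, the same jar can be pointed at a cluster without recompiling, which is the reason for removing “.master(...)” from the code in Step 3.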

Outputs:

Note: In IntelliJ IDEA, you can also create a Maven project from the “scala-archetype-simple” Maven archetype provided by “net.alchim31.maven”.

IntelliJ IDEA IDE – scala-archetype-simple

Make sure you select archetype version 1.7.

IntelliJ IDEA IDE – scala-maven-plugin

This will regenerate the pom.xml file, & you can then add the Apache Spark dependency to it.
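The archetype can also be used outside the IDE with the Maven archetype plugin. This is a minimal sketch, assuming Maven is on the PATH and the archetype can be downloaded; the helper name generate_scala_project and the example groupId/artifactId values are hypothetical.

```shell
# Hypothetical helper: generate a Scala Maven project from the archetype.
# $1 = groupId and $2 = artifactId of the NEW project (both placeholders).
generate_scala_project() {
  mvn archetype:generate \
    -DarchetypeGroupId=net.alchim31.maven \
    -DarchetypeArtifactId=scala-archetype-simple \
    -DarchetypeVersion=1.7 \
    -DgroupId="$1" \
    -DartifactId="$2" \
    -DinteractiveMode=false
}
```

For example, generate_scala_project com.example spark-demo would create a “spark-demo” directory containing the generated pom.xml, to which the Apache Spark dependency can then be added.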

