Blog Archives

01A: Spark on Zeppelin – Docker pull from Docker hub

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As.

What is Apache Zeppelin?

Zeppelin is a web-based notebook for executing arbitrary code in Scala, SQL, Spark, etc. You can mix languages.




01B: Spark on Zeppelin – custom Dockerfile

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As.

What is Apache Zeppelin?

Zeppelin is a web-based notebook for executing arbitrary code in Scala, SQL, Spark, etc. You can mix languages.




02: Spark on Zeppelin – read a file from local file system

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Step 1: Pull this from the docker hub, …



03: Spark on Zeppelin – DataFrame Operations in Scala

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As.

This tutorial extends “Apache Zeppelin on Docker Tutorial – Docker pull from Docker hub” and “Spark stand-alone to read a file from local file system”.




04: Spark on Zeppelin – DataFrame joins in Scala

This tutorial extends the series: Spark on Apache Zeppelin Tutorials.

1. Create “Orders” DataFrame

2. Create “Customers” DataFrame

You can perform a number of joins between DataFrames. The default is the inner join. Join types include: inner, cross, outer, full, full_outer, left, left_outer, right, right_outer, …
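As a rough illustration of the join types listed above, here is a minimal sketch for a Zeppelin Spark paragraph. The column names (order_id, customer_id, amount, name) and the sample rows are hypothetical and not taken from the tutorial; the “Orders” and “Customers” DataFrames in the full post may look different.

```scala
import org.apache.spark.sql.SparkSession

// In Zeppelin a SparkSession named `spark` normally already exists;
// getOrCreate() reuses it, or starts a local one when run elsewhere.
val spark = SparkSession.builder()
  .appName("join-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data, for illustration only.
val orders = Seq(
  (1, 101, 250.0),
  (2, 102, 80.0),
  (3, 999, 15.0)   // customer 999 has no matching customer row
).toDF("order_id", "customer_id", "amount")

val customers = Seq(
  (101, "Alice"),
  (102, "Bob"),
  (103, "Carol")   // Carol has no orders
).toDF("customer_id", "name")

// No join type given: Spark performs an inner join (2 matching rows here).
val innerJoined = orders.join(customers, Seq("customer_id"))

// Keep every order, with null customer columns where nothing matches (3 rows).
val leftOuterJoined = orders.join(customers, Seq("customer_id"), "left_outer")

// Keep unmatched rows from both sides (4 rows).
val fullOuterJoined = orders.join(customers, Seq("customer_id"), "full_outer")

innerJoined.show()
leftOuterJoined.show()
fullOuterJoined.show()
```

Passing the join column as a Seq("customer_id") keeps a single customer_id column in the result; joining on an expression such as orders("customer_id") === customers("customer_id") would keep both copies.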




05: Spark on Zeppelin – semi-structured log file

This tutorial extends the series: Spark on Apache Zeppelin Tutorials.

Step 1: Pull the apache/zeppelin image from the docker hub, and build the image with the following command. “docker images” will show the image that was created.

Step 2: Run the above image to create a container with the following command. …



06: Spark on Zeppelin – RDD operation zipWithIndex

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Q. Why do we need zipWithIndex?
A. …



07: Spark on Zeppelin – window functions in Scala

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Q. What are the different types of functions in Spark SQL? …



08: Spark on Zeppelin – convert DataFrames to RDD[Row] and RDD[Row] to DataFrame

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Important: It is not a best practice to mutate values or to use RDDs directly instead of DataFrames. …



09: Spark on Zeppelin – convert DataFrames to RDD and RDD to DataFrame

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Important: It is not a best practice to mutate values or to use RDDs directly instead of DataFrames. …



10: Spark on Zeppelin – union, udf and explode

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Step 1: Pull this from the docker hub, …



11: Spark on Zeppelin – DataFrame groupBy, collect_list, explode & window

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Step 1: Pull this from the docker hub, …



12: Spark on Zeppelin – DataFrame pivot

Pre-requisite: Docker is installed on your machine, on Mac OS X (e.g. $ brew cask install docker) or Windows 10. See also: Docker interview Q&As. This extends setting up Apache Zeppelin Notebook.

Step 1: Pull this from the docker hub, …


