Blog Archives

01A: Spark on Zeppelin – Docker pull from Docker hub

Prerequisite: Docker is installed on your machine, on either macOS (e.g. $ brew install --cask docker) or Windows 10.

What is Apache Zeppelin?

Zeppelin is a web-based notebook for executing arbitrary code in Scala, SQL, Spark, etc. You can mix languages within a single notebook.




01B: Spark on Zeppelin – custom Dockerfile

Prerequisite: Docker is installed on your machine, on either macOS (e.g. $ brew install --cask docker) or Windows 10.

What is Apache Zeppelin?

Zeppelin is a web-based notebook for executing arbitrary code in Scala, SQL, Spark, etc. You can mix languages within a single notebook.




02: Spark on Zeppelin – read a file from local file system

Prerequisite: Docker is installed on your machine, on either macOS (e.g. $ brew install --cask docker) or Windows 10. This post extends the Apache Zeppelin notebook setup.

Step 1: Pull this from the docker hub,




03: Spark on Zeppelin – DataFrame Operations in Scala

Prerequisite: Docker is installed on your machine, on either macOS (e.g. $ brew install --cask docker) or Windows 10.

This tutorial extends “Apache Zeppelin on Docker Tutorial – Docker pull from Docker hub” and “Spark stand-alone to read a file from local file system”.




04: Spark on Zeppelin – DataFrame joins in Scala

This tutorial extends the series: Spark on Apache Zeppelin Tutorials.

1. Create “Orders” DataFrame

2. Create “Customers” DataFrame

You can perform a number of joins between DataFrames. The default is the inner join. The join type can be one of: inner, cross, outer, full, full_outer, left, left_outer, right, right_outer, …
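To make the join types concrete, here is a minimal sketch in Scala. It assumes a Spark 2.x or later runtime evaluated REPL-style (a Zeppelin %spark paragraph or spark-shell). The sample rows and the column names order_id, customer_id, amount and name are hypothetical and not taken from the tutorial.

```scala
import org.apache.spark.sql.SparkSession

// Zeppelin and spark-shell already provide a session; getOrCreate() reuses it
// there and starts a local one elsewhere.
val spark = SparkSession.builder()
  .appName("join-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data (assumption, not from the tutorial).
val orders = Seq((1, 101, 250.0), (2, 102, 80.0), (3, 999, 15.0))
  .toDF("order_id", "customer_id", "amount")
val customers = Seq((101, "Alice"), (102, "Bob"), (103, "Carol"))
  .toDF("customer_id", "name")

// No join type given, so Spark performs an inner join:
// only customer_id values present on both sides survive.
val innerJoined = orders.join(customers, Seq("customer_id"))

// left_outer keeps every order; an unmatched order gets a null name.
val leftJoined = orders.join(customers, Seq("customer_id"), "left_outer")

// full_outer keeps unmatched rows from both sides.
val fullJoined = orders.join(customers, Seq("customer_id"), "full_outer")

innerJoined.show()
leftJoined.show()
fullJoined.show()
```

Passing the join column as Seq("customer_id") keeps a single customer_id column in the result. Joining on an expression such as orders("customer_id") === customers("customer_id") would keep both copies.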




05: Spark on Zeppelin – semi-structured log file

This tutorial extends the series: Spark on Apache Zeppelin Tutorials.

Step 1: Pull the apache/zeppelin image from Docker Hub, and build the image with the following command. “docker images” will show the image that was created.

Step 2: Run the above image to create a container with the following command....



06: Spark on Zeppelin – RDD operation zipWithIndex

Prerequisite: Docker is installed on your machine, on either macOS (e.g. $ brew install --cask docker) or Windows 10. This post extends the Apache Zeppelin notebook setup.

Q. Why do we need zipWithIndex?
A. …
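The post's own answer is not shown in this excerpt. As a rough illustration of what the operation does, the sketch below pairs each RDD element with its position and uses that index to skip the first line. It assumes a Spark runtime evaluated REPL-style (a Zeppelin %spark paragraph or spark-shell). The sample lines are hypothetical, and dropping a header is only one possible use.

```scala
import org.apache.spark.sql.SparkSession

// Reuses the notebook's session if one exists, otherwise starts a local one.
val spark = SparkSession.builder()
  .appName("zipWithIndex-sketch")
  .master("local[*]")
  .getOrCreate()

// Hypothetical input lines (assumption, not from the tutorial).
val lines = spark.sparkContext.parallelize(Seq("header", "row-a", "row-b"))

// zipWithIndex pairs each element with a Long index, giving RDD[(String, Long)].
val indexed = lines.zipWithIndex()

// Use the index to drop the first element, then discard the index again.
val withoutHeader = indexed
  .filter { case (_, idx) => idx > 0 }
  .map { case (line, _) => line }

withoutHeader.collect().foreach(println)
```

Indices follow partition order and then element order within each partition, so they match the original ordering for an RDD built from an ordered collection or a text file.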



