04: Spark on Zeppelin – DataFrame joins in Scala

This tutorial extends the series: Spark on Apache Zeppelin Tutorials.

1. Create “Orders” DataFrame
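
The original code cell for this step is not reproduced here, so what follows is only a minimal sketch. The column names (`orderId`, `customerId`, `amount`) and the sample rows are assumptions made for illustration, not the tutorial's actual data.

```scala
import org.apache.spark.sql.SparkSession

// Zeppelin and spark-shell already provide `spark`; getOrCreate() reuses that session.
val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (orderId, customerId, amount)
val orders = Seq(
  (101, 1, 25.0),
  (102, 1, 40.0),
  (103, 2, 15.5)
).toDF("orderId", "customerId", "amount")

orders.show()
```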

2. Create “Customers” DataFrame
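
As a rough sketch of this step, the DataFrame below uses assumed column names (`customerId`, `name`) and made-up rows. One customer is given an id that is meant to have no orders, so that the later joins have something to show.

```scala
import org.apache.spark.sql.SparkSession

// Zeppelin and spark-shell already provide `spark`; getOrCreate() reuses that session.
val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (customerId, name)
val customers = Seq(
  (1, "Alice"),
  (2, "Bob"),
  (3, "Carol")  // intended to have no orders
).toDF("customerId", "name")

customers.show()
```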

You can join DataFrames in several ways; the default is the inner join. The join type can be one of: inner, cross, outer, full, full_outer, left, left_outer, right, right_outer, left_semi, left_anti.
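
To illustrate the default, here is a small sketch comparing a join with no join type against an explicit "inner" join. The data and column names are hypothetical. It also shows the column-expression form of `join`, which keeps the key column from both sides.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customerId", "name")
val orders = Seq((101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.5))
  .toDF("orderId", "customerId", "amount")

// No join type given: Spark performs an inner join.
val byDefault = customers.join(orders, "customerId")

// Same join with the type spelled out; the key column appears once in the result.
val explicitInner = customers.join(orders, Seq("customerId"), "inner")

// Column-expression form: both customerId columns are kept in the result.
val byExpression =
  customers.join(orders, customers("customerId") === orders("customerId"), "inner")
```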

3. inner join of two DataFrames

Customers and their orders.
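
A minimal sketch of this step, assuming the two DataFrames share a `customerId` column (the names and rows here are illustrative only):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customerId", "name")
val orders = Seq((101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.5))
  .toDF("orderId", "customerId", "amount")

// Keep only customers that have at least one order, one row per matching order.
val customersWithOrders = customers.join(orders, Seq("customerId"), "inner")

customersWithOrders.show()
```

With this sample data the result has three rows: two for Alice and one for Bob. Carol is dropped because she has no matching order.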

4. leftanti join

Customers who do not have any orders.
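
The anti join might look like the sketch below. The data is hypothetical; Spark accepts both the "leftanti" and "left_anti" spellings of the join type.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customerId", "name")
val orders = Seq((101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.5))
  .toDF("orderId", "customerId", "amount")

// Keep customers with NO matching order; only the customers' columns are returned.
val customersWithoutOrders = customers.join(orders, Seq("customerId"), "left_anti")

customersWithoutOrders.show()
```

With this sample data only Carol remains.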

5. left join

All customers, together with their orders where they have any.
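
As a sketch under the same assumptions (a shared `customerId` column and made-up rows), a left join with Customers on the left side could be written like this:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customerId", "name")
val orders = Seq((101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.5))
  .toDF("orderId", "customerId", "amount")

// Every customer is kept; order columns are null where no order matches.
val allCustomers = customers.join(orders, Seq("customerId"), "left")

allCustomers.show()
```

With this sample data the result has four rows, and Carol's `orderId` and `amount` are null.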

6. leftsemi join

Customers who have orders.
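
A semi join can be sketched as follows, again with hypothetical data. Unlike the inner join, it returns each matching customer once and carries no order columns.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customerId", "name")
val orders = Seq((101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.5))
  .toDF("orderId", "customerId", "amount")

// Keep customers that have at least one order; only the customers' columns are returned.
val customersHavingOrders = customers.join(orders, Seq("customerId"), "left_semi")

customersHavingOrders.show()
```

With this sample data the result has two rows, Alice and Bob, even though Alice has two orders.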

