04: Spark on Zeppelin – DataFrame joins in Scala

This tutorial extends the series: Spark on Apache Zeppelin Tutorials.

1. Create “Orders” DataFrame
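
A minimal sketch of this step, assuming a Zeppelin Spark paragraph. The column names (`order_id`, `customer_id`, `amount`) and the sample rows are hypothetical, chosen only for illustration. In Zeppelin a session already exists, so `getOrCreate()` simply returns it.

```scala
import org.apache.spark.sql.SparkSession

// Reuses the existing session in Zeppelin; creates a local one elsewhere.
val spark = SparkSession.builder().master("local[*]").appName("df-joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (order_id, customer_id, amount)
val orders = Seq(
  (101, 1, 250.0),
  (102, 1, 75.5),
  (103, 3, 120.0)
).toDF("order_id", "customer_id", "amount")

orders.show()
```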

2. Create “Customers” DataFrame
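
A minimal sketch of this step, under the same kind of assumptions: the column names (`customer_id`, `name`) and the three sample customers are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Reuses the existing session in Zeppelin; creates a local one elsewhere.
val spark = SparkSession.builder().master("local[*]").appName("df-joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (customer_id, name)
val customers = Seq(
  (1, "Alice"),
  (2, "Bob"),
  (3, "Carol")
).toDF("customer_id", "name")

customers.show()
```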

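You can perform several kinds of joins between DataFrames; the default is the inner join. The join type is passed as a string, one of: inner, cross, outer, full, full_outer, left, left_outer, right, right_outer, left_semi, left_anti. Spark also accepts the spellings without underscores, such as leftsemi and leftanti, which are used in the steps below.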

3. inner join of two DataFrames

Customers and their orders.
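
The inner join could look roughly like the sketch below. The two DataFrames, their column names, and the sample rows are hypothetical stand-ins. Joining on `Seq("customer_id")` keeps a single copy of the key column in the result.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("df-joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customer_id", "name")
val orders = Seq((101, 1, 250.0), (102, 1, 75.5), (103, 3, 120.0))
  .toDF("order_id", "customer_id", "amount")

// Only customers with at least one matching order appear, one row per order.
val customerOrders = customers.join(orders, Seq("customer_id"), "inner")

customerOrders.show()
```

With this sample data, Bob has no orders and therefore does not appear in the result.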

4. leftanti join

Customers who do not have any orders.
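
A sketch of the anti join, again with hypothetical DataFrames and column names. A `leftanti` join returns only the rows of the left DataFrame that have no match on the right, and only the left DataFrame's columns.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("df-joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customer_id", "name")
val orders = Seq((101, 1, 250.0), (102, 1, 75.5), (103, 3, 120.0))
  .toDF("order_id", "customer_id", "amount")

// Customers whose customer_id never occurs in orders.
val customersWithoutOrders = customers.join(orders, Seq("customer_id"), "leftanti")

customersWithoutOrders.show()
```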

5. left join

All customers, along with their orders where any exist.
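
A sketch of the left join, using hypothetical DataFrames and column names. Every row of the left DataFrame is kept; where no order matches, the order columns are filled with null.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("df-joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customer_id", "name")
val orders = Seq((101, 1, 250.0), (102, 1, 75.5), (103, 3, 120.0))
  .toDF("order_id", "customer_id", "amount")

// All customers are kept; Bob gets null order_id and amount.
val allCustomers = customers.join(orders, Seq("customer_id"), "left")

allCustomers.show()
```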

6. leftsemi join

Customers who have orders.
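
A sketch of the semi join, with hypothetical DataFrames and column names. A `leftsemi` join keeps each left row that has at least one match on the right, returns it once regardless of how many matches exist, and includes only the left DataFrame's columns.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("df-joins").getOrCreate()
import spark.implicits._

// Hypothetical sample data
val customers = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("customer_id", "name")
val orders = Seq((101, 1, 250.0), (102, 1, 75.5), (103, 3, 120.0))
  .toDF("order_id", "customer_id", "amount")

// Customers with at least one order; Alice appears once despite two orders.
val customersWithOrders = customers.join(orders, Seq("customer_id"), "leftsemi")

customersWithOrders.show()
```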
