As part of this lesson we will understand:
- DataFrames and DataFrame operations
- Using Spark SQL to query DataFrames
- Using Spark SQL to query Hive tables
- Spark JDBC
For the demonstrations we will be using the Hortonworks Sandbox, which runs a Spark build compiled against Scala 2.10. To match that environment:
- Rebuild the existing project against a Spark build compiled for Scala 2.10
- Set the Scala IDE compiler to Scala 2.10
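
If the project happens to be built with sbt, the Scala 2.10 requirement above could be expressed roughly as follows. This is only a sketch: the lesson does not say which build tool or which Spark release is used, so the project name, the exact Scala patch version, and the Spark version below are assumptions to be replaced with whatever the sandbox actually runs.

```scala
// build.sbt (a minimal sketch, assuming an sbt-based project)

// Hypothetical project name, used only for illustration.
name := "spark-sql-demo"

// Any 2.10.x release should do; 2.10.6 is an assumed patch version.
scalaVersion := "2.10.6"

// ASSUMPTION: replace with the Spark version installed on the sandbox.
val sparkVersion = "1.6.3"

// %% appends the Scala binary version (_2.10) to each artifact name,
// so the Spark artifacts compiled for Scala 2.10 are resolved.
// "provided" keeps Spark out of the packaged jar, since the cluster supplies it.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
  "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
)
```

The `spark-sql` and `spark-hive` modules are listed because the lesson covers DataFrames and Hive tables; a Maven project would need the equivalent `_2.10` artifacts instead.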