Spark Core provides basic I/O functionality, distributed task dispatching, and scheduling. Resilient Distributed Datasets (RDDs) are logical collections of data partitioned across machines. An RDD can be created by referencing a dataset in an external storage system or by applying transformations to an existing RDD. In this course, you will learn how to improve Spark's performance and how to work with DataFrames and Spark SQL.
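The two ways of creating an RDD described above can be illustrated with a toy model in plain Python that runs without a Spark installation. This is only a sketch: `ToyRDD` and its methods are hypothetical names invented here, not part of Spark's API, and the model ignores distribution, fault tolerance, and scheduling. It shows only that data is split into partitions and that a transformation yields a new dataset without computing anything until results are requested.

```python
from typing import Callable, Iterable, Iterator, List


class ToyRDD:
    """Hypothetical stand-in for an RDD (not Spark's API):
    a list of partitions plus a lazy per-partition compute function."""

    def __init__(self, partitions: List[list],
                 compute: Callable[[Iterable], Iterator] = iter):
        self._partitions = partitions
        self._compute = compute

    @classmethod
    def from_collection(cls, data: Iterable, num_partitions: int) -> "ToyRDD":
        # Creation path 1: reference an existing dataset and split it
        # into contiguous chunks (at most num_partitions of them).
        items = list(data)
        size = max(1, -(-len(items) // num_partitions))  # ceiling division
        return cls([items[i:i + size] for i in range(0, len(items), size)])

    @property
    def num_partitions(self) -> int:
        return len(self._partitions)

    def map(self, fn: Callable) -> "ToyRDD":
        # Creation path 2: a transformation returns a new ToyRDD.
        # Nothing is computed here; the work is only described.
        prev = self._compute
        return ToyRDD(self._partitions,
                      lambda part: (fn(x) for x in prev(part)))

    def filter(self, pred: Callable) -> "ToyRDD":
        prev = self._compute
        return ToyRDD(self._partitions,
                      lambda part: (x for x in prev(part) if pred(x)))

    def collect(self) -> list:
        # Evaluate every partition and gather the results in order.
        out = []
        for part in self._partitions:
            out.extend(self._compute(part))
        return out


numbers = ToyRDD.from_collection(range(1, 11), num_partitions=4)
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = even_squares.collect()
print(numbers.num_partitions, result)
```

In Spark itself, the corresponding calls would be `SparkContext.parallelize` or `SparkContext.textFile` to create an RDD from data, `map` and `filter` as transformations, and `collect` to bring the results back to the driver.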
This course is intended for programmers and developers who are familiar with Apache Spark and wish to expand their skill sets.