Apache Spark Competency (Intermediate Level)

  • 20m
  • 15 questions
The Apache Spark Competency (Intermediate Level) benchmark measures your knowledge of deploying, using, and streaming with Apache Spark. You will be evaluated on Spark clusters, jobs, streaming, and transforming data with Spark SQL. Learners scoring high on this benchmark demonstrate the skills necessary to use Apache Spark in their data streaming applications.

Topics covered

  • build a Spark application that reads from a Kafka topic
  • configure a Spark cluster using a configuration file
  • create a Spark cluster with a master and worker
  • describe how Apache Hadoop and Spark work
  • describe what windows are in the context of Spark streaming and define them using DataFrames
  • distinguish between Spark standalone and local deployment modes
  • execute apps on a Spark standalone cluster
  • execute Spark commands and monitor jobs with the Spark web UI
  • manipulate streaming data and publish the output to the console
  • perform aggregations on Spark DataFrames and order their contents
  • recall the architecture and features of Apache Spark
  • recognize the use cases of Spark in general and, specifically, of its structured streaming engine
  • run a job on the PySpark shell and view its details from the Spark web user interface (UI)
  • set up an environment to stream files, and build an app to process files in real-time
  • transform streaming data with Spark SQL
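The cluster topics above (creating a standalone cluster with a master and worker, configuring it via a file, and executing apps on it) follow a common pattern, sketched below. Hostnames, ports, and the `app.py` script are placeholders, and the environment variables shown are only a subset of what `conf/spark-env.sh` accepts:

```
# Assumes Spark is installed at $SPARK_HOME; "master-node" is a placeholder hostname.
# Example settings in conf/spark-env.sh on each node:
#   SPARK_MASTER_HOST=master-node
#   SPARK_WORKER_CORES=2
#   SPARK_WORKER_MEMORY=2g

# Start the standalone master (its web UI defaults to port 8080):
$SPARK_HOME/sbin/start-master.sh

# Start a worker, pointing it at the master URL reported in the master's log/UI:
$SPARK_HOME/sbin/start-worker.sh spark://master-node:7077

# Submit an application to the standalone cluster:
$SPARK_HOME/bin/spark-submit --master spark://master-node:7077 app.py
```

Jobs submitted this way can then be monitored from the Spark web UI mentioned in the topics above.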