Course details

Accessing Data with Spark: Data Analysis using Spark SQL

Overview/Description

Apache Spark is an open-source cluster-computing framework used for data science, and it has become a de facto standard for big data processing. In this Skillsoft Aspire course, you will learn how to analyze a Spark DataFrame by treating it as though it were a relational database table. Discover how to create a view from a Spark DataFrame and run SQL queries against it, and how to define windows and use them to explore data.



Expected Duration (hours)
0.9

Lesson Objectives

  • Course Overview
  • recall the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame
  • create views out of a Spark DataFrame's contents and run queries against them
  • trim and clean a DataFrame before a view is created as a precursor to running SQL queries on it
  • perform an analysis of data by running different kinds of SQL queries, including grouping and aggregations
  • recognize how Spark DataFrames infer the schema of data loaded into them and configure a DataFrame with an explicitly defined schema
  • define what a window is in the context of Spark DataFrames and when windows can be used
  • create and analyze categories of data in a dataset using windows
  • analyze data using Spark SQL

Course Number
it_dsadskdj_03_enus

Expertise Level
Intermediate