Apache Spark Getting Started

Apache Spark    |    Beginner
  • 15 videos | 1h 6m 15s
  • Includes Assessment
  • Earns a Badge
Rating 4.5 of 137 users
Explore the basics of Apache Spark, an analytics engine used for big data processing. It's an open source cluster computing framework that can run on top of Hadoop. Discover how it allows operations on data with both its own library methods and with SQL, while delivering high performance. Learn the characteristics, components, and functions of Spark, Hadoop, RDDs, the Spark Session, and master and worker nodes. Install PySpark. Then, initialize a Spark Context and create a Spark DataFrame from the contents of an RDD. Configure a DataFrame with a map function. Retrieve and transform data. Finally, convert Spark DataFrames to Pandas DataFrames and vice versa.

WHAT YOU WILL LEARN

  • Recognize where Spark fits in with Hadoop and its components
  • Describe Spark RDDs and their characteristics, including what makes them resilient and distributed
  • Identify the types of operations that are permitted on an RDD and describe how RDD transformations are lazily evaluated
  • Distinguish between RDDs and DataFrames and describe the relationship between the two
  • List the crucial components of Spark and the relationships between them, and recognize the functions of the Spark Session, master, and worker nodes
  • Install PySpark and initialize a Spark Context
  • Create and load data into an RDD
  • Initialize a Spark DataFrame from the contents of an RDD
  • Work with Spark DataFrames containing both primitive and structured data types
  • Define the contents of a DataFrame using the SQLContext
  • Apply the map() function on an RDD to configure a DataFrame with column headers
  • Retrieve required data from within a DataFrame and define and apply transformations on a DataFrame
  • Convert Spark DataFrames to Pandas DataFrames and vice versa
  • Describe basic Spark concepts
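
The lazy evaluation of RDD transformations mentioned above can be illustrated with a small sketch. This is plain Python, not Spark's API: the class name LazyDataset and its methods are hypothetical stand-ins chosen for illustration. The idea it shows is that transformations such as map and filter are only recorded, and nothing runs until an action such as collect is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not part of Spark).

    Transformations are recorded; they only run when an action is called.
    """

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: record the step and return a new dataset.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        # Transformation: record the step and return a new dataset.
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def collect(self):
        # Action: apply every recorded step, then materialize the result.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []  # records which elements were actually processed


def double(x):
    calls.append(x)
    return x * 2


dataset = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)

# No element has been processed yet: only the plan exists.
calls_before_collect = list(calls)

# The action triggers the recorded transformations.
result = dataset.collect()
```

Here calls_before_collect stays empty because defining the transformations does no work, while result holds the doubled values greater than 4 once collect runs. In Spark itself the recorded plan is also what allows lost partitions to be recomputed, which this toy version does not attempt to model.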

IN THIS COURSE

  • 2m 20s
  • 5m 17s
    After completing this video, you will be able to recognize where Spark fits in with Hadoop and its components.
  • 3.  Resilient Distributed Datasets (RDDs)
    2m 15s
    Upon completion of this video, you will be able to describe Spark RDDs and their characteristics, including what makes them resilient and distributed.
  • 4.  RDD Operations
    7m 22s
    In this video, you will identify the types of operations that are permitted on an RDD and describe how RDD transformations are evaluated lazily.
  • 5.  Spark DataFrames
    2m 32s
    In this video, you will learn how to distinguish between RDDs and DataFrames, and describe the relationship between the two.
  • 6.  Spark Architecture
    6m 24s
    Upon completion of this video, you will be able to list the crucial components of Spark and the relationships between them. You will also be able to recognize the functions of the Spark Session, master, and worker nodes.
  • 7.  Spark Installation
    3m 43s
    During this video, you will learn how to install PySpark and initialize a Spark Session.
  • 8.  Working with RDDs
    3m 16s
    During this video, you will learn how to create and load data into an RDD.
  • 9.  Creating DataFrames from RDDs
    5m 47s
    In this video, you will learn how to initialize a Spark DataFrame from the contents of an RDD.
  • 10.  Contents of a DataFrame
    4m 22s
    In this video, find out how to work with Spark DataFrames containing both primitive and complex data types.
  • 11.  The SQLContext
    5m 7s
    In this video, find out how to define the contents of a DataFrame using the SQLContext.
  • 12.  The map() Function of an RDD
    3m 52s
    During this video, you will learn how to apply the map() function to an RDD to configure a DataFrame with column headers.
  • 13.  Accessing the Contents of a DataFrame
    7m 51s
    In this video, you will retrieve required data from within a DataFrame and define and apply transformations to a DataFrame.
  • 14.  DataFrames in Spark and Pandas
    1m 57s
    In this video, you will convert Spark DataFrames to Pandas DataFrames and Pandas DataFrames to Spark DataFrames.
  • 15.  Exercise: Working with Spark
    4m 12s
    After completing this video, you will be able to describe basic Spark concepts.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
