Apache Spark: Apache Spark 2.3 Beginner

  • 3 Courses | 3h 13m 7s
  • 8 Books | 30h 58m
  • 5 Courses | 4h 22m 12s
  • 5 Books | 19h 45m
  • 2 Courses | 1h 14m 41s
 
Explore Apache Spark, the open-source cluster computing framework that provides a fault-tolerant programming interface for clusters.

GETTING STARTED

Apache Spark Getting Started

  1. Course Overview (2m 23s)
  2. Introduction to Spark and Hadoop (5m 20s)

GETTING STARTED

Introduction to Apache Spark

  1. Overview of Apache Spark (6m 24s)
  2. Downloading and Installing Apache Spark (7m 43s)

GETTING STARTED

Introducing Apache Spark for AI Development

  1. Course Overview (1m 54s)
  2. Apache Spark: Features and Uses (2m 29s)

COURSES INCLUDED

Apache Spark Getting Started
Explore the basics of Apache Spark, an analytics engine used for big data processing. It is an open-source cluster computing framework that can run on top of Hadoop. Discover how it allows operations on data with both its own library methods and SQL, while delivering strong performance. Learn the characteristics, components, and functions of Spark, Hadoop, RDDs, the SparkSession, and master and worker nodes. Install PySpark. Then, initialize a SparkContext and create Spark DataFrames from the contents of an RDD and of another DataFrame. Configure a DataFrame with a map function. Retrieve and transform data. Finally, convert between Spark and pandas DataFrames.
15 videos | 1h | Assessment available | Badge
Data Analysis Using the Spark DataFrame API
An open-source cluster-computing framework used for data science, Apache Spark has become the de facto big data framework. In this Skillsoft Aspire course, learners explore how to analyze real data sets by using DataFrame API methods. Discover how to optimize operations with shared variables and combine data from multiple DataFrames using joins. Explore the Spark 2.x features that make it significantly faster than Spark 1.x. Other topics include how to create a Spark DataFrame from a CSV file; apply DataFrame transformations, grouping, and aggregation; and perform operations on a DataFrame to analyze categories of data in a data set. Visualize the contents of a Spark DataFrame with Matplotlib. Conclude by studying how to use broadcast variables and save DataFrame contents in text file format.