Apache Spark: Apache Spark 3.2 intermediate

  • 3 Courses | 3h 11m 7s
  • 8 Books | 30h 58m
  • 1 Course | 1h 51m 32s
  • 2 Books | 9h 5m
  • 5 Courses | 4h 19m 12s
  • 5 Books | 19h 45m
  • 2 Courses | 1h 13m 17s
Rating 5.0 of 1 user
 
Explore Apache Spark, the open-source cluster computing framework that provides a fault-tolerant programming interface for clusters.

GETTING STARTED

Apache Spark Getting Started

  • 2m 20s
  • 5m 17s

GETTING STARTED

Graph Modeling on Apache Spark: Working with Apache Spark GraphFrames

  • 2m 23s
  • 11m 51s

GETTING STARTED

Introduction to Apache Spark

  • 6m 21s
  • 7m 40s

GETTING STARTED

Introducing Apache Spark for AI Development

  • 1m 51s
  • 2m 26s

COURSES INCLUDED

Apache Spark Getting Started
Explore the basics of Apache Spark, an analytics engine used for big data processing. It's an open-source cluster computing framework that can run on top of Hadoop. Discover how it allows operations on data with both its own library methods and SQL, while delivering great performance. Learn the characteristics, components, and functions of Spark, Hadoop, RDDs, the Spark session, and master and worker nodes. Install PySpark. Then, initialize a Spark Context and a Spark DataFrame from the contents of an RDD and a DataFrame. Configure a DataFrame with a map function. Retrieve and transform data. Finally, convert between Spark and Pandas DataFrames.
15 videos | 1h 6m | Assessment available | Badge
Data Analysis Using the Spark DataFrame API
An open-source cluster-computing framework used for data science, Apache Spark has become the de facto big data framework. In this Skillsoft Aspire course, learners explore how to analyze real data sets by using DataFrame API methods. Discover how to optimize operations with shared variables and combine data from multiple DataFrames using joins. Explore the Spark 2.x version features that make it significantly faster than Spark 1.x. Other topics include how to create a Spark DataFrame from a CSV file; apply DataFrame transformations, grouping, and aggregation; perform operations on a DataFrame to analyze categories of data in a data set; and visualize the contents of a Spark DataFrame with Matplotlib. Conclude by studying how to use broadcast variables and save DataFrame contents in text file format.
16 videos | 1h 10m | Assessment available | Badge
Data Analysis using Spark SQL
Analyze an Apache Spark DataFrame as though it were a relational database table. During this Aspire course, you will discover the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame. Discover how to create views out of a Spark DataFrame's contents and run queries against them, and how to trim and clean a DataFrame. Next, learn how to analyze data by running different SQL queries, how to configure a DataFrame with an explicitly defined schema, and what a window is in the context of Spark. Finally, observe how to create and analyze categories of data in a data set by using windows.
9 videos | 54m | Assessment available | Badge

COURSES INCLUDED

Graph Modeling on Apache Spark: Working with Apache Spark GraphFrames
Apache Spark, which is a widely used analytics engine, also helps anyone modeling graphs to perform powerful graph analytics. GraphFrames, a Spark package, aids this process by providing various graph algorithm implementations. Use this course to learn about GraphFrames and the application of graph algorithms on data to extract insights. Explore how GraphFrames complements the Apache Hadoop ecosystem in processing graph data. Getting hands-on, construct and visualize a GraphFrame. Practice querying nodes and relationships in a graph and finding motifs in it. Moving along, work with the breadth-first search and the shortestPaths functions to find paths between graph nodes. And finally, apply the PageRank algorithm to arrive at the most relevant nodes in a network. Upon completion, you'll be able to use GraphFrames to analyze and generate insights from graph data.
13 videos | 1h 51m | Assessment available | Badge

COURSES INCLUDED

Introduction to Apache Spark
Apache Spark is an open-source big data processing framework. Explore how to download and install Apache Spark, and also build, configure, and initialize Spark.
10 videos | 54m | Assessment available | Badge
Apache Spark SQL
Apache Spark SQL is used for structured data processing in Spark. Explore features of Spark SQL such as SparkSessions, DataFrames, and Datasets.
16 videos | 1h | Assessment available | Badge
Structured Streaming
Discover the concepts of Structured Streaming such as Windowing, DataFrame, and SQL Operations, and explore File Sinks, Deduplication, and Checkpointing.
12 videos | 1h 5m | Assessment available | Badge
Spark Monitoring & Tuning
Explore various ways to monitor Spark applications such as web UIs, metrics, and other monitoring tools, and examine memory tuning.
14 videos | 49m | Assessment available | Badge
Spark Security
Discover Spark security! Explore how to secure the Spark UI and event logs and configure SSL settings, and examine YARN deployments, SASL encryption, and network security.
8 videos | 29m | Assessment available | Badge

COURSES INCLUDED

Introducing Apache Spark for AI Development
Apache Spark provides a robust framework for implementing machine learning and deep learning. It takes advantage of resilient distributed datasets to provide a fault-tolerant platform well-suited to developing big data applications. Because many large companies are actively using this framework, AI developers should be familiar with the basics of implementing AI with Apache Spark and Spark ML. In this course, you'll explore the concept of distributed computing. You'll identify the benefits of using Spark for AI development, examining the advantages and disadvantages of using Spark over other big data AI platforms. Next, you'll describe how to implement machine learning, deep learning, natural language processing, and computer vision using Spark. Finally, you'll use Spark ML to create a movie recommendation system of the kind commonly used by Netflix and YouTube.
15 videos | 36m | Assessment available | Badge
Using Apache Spark for AI Development
Spark is a leading open-source cluster-computing framework that is used for distributed data processing and machine learning. Although not primarily designed for AI, Spark allows you to take advantage of data parallelism and the large distributed systems used in AI development. AI practitioners should recognize when to use Spark for a particular application. In this course, you'll explore advanced techniques for working with Apache Spark and identify the key advantages of using Spark over other platforms. You'll define resilient distributed datasets (RDDs) and explore several workflows related to them. You'll move on to recognize how to work with a Spark DataFrame, identifying its features and use cases. Finally, you'll learn how to create a machine learning pipeline using Spark ML Pipelines.
13 videos | 36m | Assessment available | Badge

EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

BOOKS INCLUDED

Book

Beginning Apache Spark 2: With Resilient Distributed Datasets, Spark SQL, Structured Streaming and Spark Machine Learning Library
A tutorial on the Apache Spark platform written by an expert engineer and trainer, this book will give you the fundamentals to become proficient in using Apache Spark and know when and how to apply it to your big data applications.
Duration: 5h 7m | Authors: Hien Luu

Book

Practical Apache Spark: Using the Scala API
Following a learn-to-do-by-yourself approach to teaching Apache Spark using Scala, this book will help you learn the concepts, practice the code snippets in Scala, and complete the assignments given to get an overall exposure.
Duration: 1h 53m | Authors: Dharanitharan Ganesan, Subhashini Chellappan

Book

Next-Generation Big Data: A Practical Guide to Apache Kudu, Impala, and Spark
Utilize this practical and easy-to-follow guide to modernize traditional enterprise data warehouse and business intelligence environments with next-generation big data technologies.
Duration: 4h 13m | Authors: Butch Quinto

Book

PySpark Recipes: A Problem-Solution Approach with PySpark2
Taking you on an interesting journey to learn about PySpark and big data, this book uses a problem-solution approach where every problem is followed by a detailed, step-by-step answer which will improve your thought process for solving big data problems with PySpark.
Duration: 3h 2m | Authors: Raju Kumar Mishra

Book

Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka
Explaining each of the full-stack technologies and, more importantly, how to best integrate them, this book provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples in every situation.
Duration: 3h 56m | Authors: Isaac Ruiz, Raul Estrada

Book

Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark
Introducing use cases in each chapter from a specific industry, and using publicly available datasets from that domain to unravel the intricacies of production-grade design and implementation, this book walks you through end-to-end real-time application development using real-world applications, data, and code.
Duration: 4h 16m | Authors: Zubair Nabi

Book

Spark: Big Data Cluster Computing in Production
With real-world production insight and expert guidance, tips, and tricks, this incredibly useful resource goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big data clustering in production.
Duration: 3h 35m | Authors: Brennon York, Ema Orhian, Ilya Ganelin, Kai Sasaki

Book

Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing
Helping you become a much sought-after Spark expert, this step-by-step guide shows you how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning.
Duration: 4h 56m | Authors: Mohammed Guller

BOOKS INCLUDED

Book

Beginning Apache Spark 3
This book begins by explaining different ways of interacting with Apache Spark, such as Spark Concepts and Architecture, and Spark Unified Stack.
Duration: 5h 19m | Authors: Hien Luu

Book

Hands-on Guide to Apache Spark 3: Build Scalable Computing Engines for Batch and Stream Data Processing
This book explains how to scale Apache Spark 3 to handle massive amounts of data, either via batch or streaming processing. It covers how to use Spark's structured APIs to perform complex data transformations and analyses you can use to implement end-to-end analytics workflows.
Duration: 3h 46m | Authors: Alfonso Antolínez García

BOOKS INCLUDED

Book

Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka
Explaining each of the full-stack technologies and, more importantly, how to best integrate them, this book provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples in every situation.
Duration: 3h 56m | Authors: Isaac Ruiz, Raul Estrada

Book

Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark
Introducing use cases in each chapter from a specific industry, and using publicly available datasets from that domain to unravel the intricacies of production-grade design and implementation, this book walks you through end-to-end real-time application development using real-world applications, data, and code.
Duration: 4h 16m | Authors: Zubair Nabi

Book

Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing
Helping you become a much sought-after Spark expert, this step-by-step guide shows you how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning.
Duration: 4h 56m | Authors: Mohammed Guller

Book

Spark: Big Data Cluster Computing in Production
With real-world production insight and expert guidance, tips, and tricks, this incredibly useful resource goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big data clustering in production.
Duration: 3h 35m | Authors: Brennon York, Ema Orhian, Ilya Ganelin, Kai Sasaki

Book

PySpark Recipes: A Problem-Solution Approach with PySpark2
Taking you on an interesting journey to learn about PySpark and big data, this book uses a problem-solution approach where every problem is followed by a detailed, step-by-step answer which will improve your thought process for solving big data problems with PySpark.
Duration: 3h 2m | Authors: Raju Kumar Mishra

SKILL BENCHMARKS INCLUDED

Apache Spark Competency (Intermediate Level)
The Apache Spark Competency (Intermediate Level) benchmark measures your knowledge of deploying, using, and streaming with Apache Spark. You will be evaluated on Spark clusters, jobs, streaming, and transforming with Spark SQL. Learners scoring high on this benchmark demonstrate the skills necessary to use Apache Spark in their data streaming applications.
20m | 15 questions

YOU MIGHT ALSO LIKE

Channel Apache Hadoop
Rating 5.0 of 1 user
Channel Apache Solr
Rating 5.0 of 1 user