Apache Spark: Apache Spark 2.4 Intermediate

  • 3 Courses | 3h 28m 7s
  • 8 Books | 30h 58m
  • 5 Courses | 4h 46m 42s
  • 5 Books | 19h 45m
  • 2 Courses | 1h 25m 17s
 
Explore Apache Spark, the open-source cluster computing framework that provides a fault-tolerant programming interface for clusters.

GETTING STARTED

Apache Spark Getting Started

  • 1. Course Overview (2m 20s)
  • 2. Introduction to Spark and Hadoop (5m 17s)

GETTING STARTED

Introduction to Apache Spark

  • 1. Overview of Apache Spark (6m 21s)
  • 2. Downloading and Installing Apache Spark (7m 40s)

GETTING STARTED

Introducing Apache Spark for AI Development

  • 1. Course Overview (1m 51s)
  • 2. Apache Spark: Features and Uses (2m 26s)

COURSES INCLUDED

Apache Spark Getting Started
Explore the basics of Apache Spark, an analytics engine used for big data processing. It's an open-source cluster computing framework that can run on top of Hadoop. Discover how it allows operations on data with both its own library methods and with SQL, while delivering great performance. Learn the characteristics, components, and functions of Spark, Hadoop, RDDs, the Spark session, and master and worker nodes. Install PySpark. Then, initialize a Spark Context and create a Spark DataFrame from the contents of an RDD or an existing DataFrame. Configure a DataFrame with a map function. Retrieve and transform data. Finally, convert between Spark and Pandas DataFrames.
15 videos | 1h 12m | Assessment available | Badge
Data Analysis Using the Spark DataFrame API
An open-source cluster-computing framework used for data science, Apache Spark has become the de facto big data framework. In this Skillsoft Aspire course, learners explore how to analyze real data sets by using DataFrame API methods. Discover how to optimize operations with shared variables and combine data from multiple DataFrames using joins. Explore the Spark 2.x features that make it significantly faster than Spark 1.x. Other topics include how to create a Spark DataFrame from a CSV file; apply DataFrame transformations, grouping, and aggregation; and perform operations on a DataFrame to analyze categories of data in a data set. Visualize the contents of a Spark DataFrame with Matplotlib. Conclude by studying how to broadcast variables and save DataFrame contents in text file format.
16 videos | 1h 17m | Assessment available | Badge
Data Analysis using Spark SQL
Analyze an Apache Spark DataFrame as though it were a relational database table. During this Aspire course, you will discover the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame. Discover how to create views out of a Spark DataFrame's contents and run queries against them, and how to trim and clean a DataFrame. Next, learn how to perform an analysis of data by running different SQL queries, how to configure a DataFrame with an explicitly defined schema, and what a window is in the context of Spark. Finally, observe how to create and analyze categories of data in a data set by using windows.
9 videos | 57m | Assessment available | Badge

COURSES INCLUDED

Introduction to Apache Spark
Apache Spark is an open-source big data processing framework. Explore how to download and install Apache Spark, and also build, configure, and initialize Spark.
10 videos | 59m | Assessment available | Badge
Apache Spark SQL
Apache Spark SQL is used for structured data processing in Spark. Explore features of Spark SQL such as SparkSessions, DataFrames, and Datasets.
16 videos | 1h 7m | Assessment available | Badge
Structured Streaming
Discover the concepts of Structured Streaming, such as windowing and DataFrame and SQL operations, and explore file sinks, deduplication, and checkpointing.
12 videos | 1h 10m | Assessment available | Badge
Spark Monitoring & Tuning
Explore various ways to monitor Spark applications such as web UIs, metrics, and other monitoring tools, and examine memory tuning.
14 videos | 55m | Assessment available | Badge
Spark Security
Discover Spark security! Explore how to secure the Spark UI and event logs and configure SSL settings, and examine YARN deployments, SASL encryption, and network security.
8 videos | 33m | Assessment available | Badge

COURSES INCLUDED

Introducing Apache Spark for AI Development
Apache Spark provides a robust framework for implementing machine learning and deep learning. It takes advantage of resilient distributed datasets to provide a fault-tolerant platform well-suited to developing big data applications. Because many large companies are actively using this framework, AI developers should be familiar with the basics of implementing AI with Apache Spark and Spark ML. In this course, you'll explore the concept of distributed computing. You'll identify the benefits of using Spark for AI development, examining the advantages and disadvantages of using Spark over other big data AI platforms. Next, you'll describe how to implement machine learning, deep learning, natural language processing, and computer vision using Spark. Finally, you'll use Spark ML to create a movie recommendation system of the kind commonly used by Netflix and YouTube.
15 videos | 42m | Assessment available | Badge
Using Apache Spark for AI Development
Spark is a leading open-source cluster-computing framework that is used for distributed data processing and machine learning. Although not primarily designed for AI, Spark allows you to take advantage of data parallelism and the large distributed systems used in AI development. AI practitioners should recognize when to use Spark for a particular application. In this course, you'll explore advanced techniques for working with Apache Spark and identify the key advantages of using Spark over other platforms. You'll define resilient distributed datasets (RDDs) and explore several workflows related to them. You'll move on to recognize how to work with a Spark DataFrame, identifying its features and use cases. Finally, you'll learn how to create a machine learning pipeline using Spark ML Pipelines.
13 videos | 42m | Assessment available | Badge

EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

BOOKS INCLUDED

Book

Beginning Apache Spark 2: With Resilient Distributed Datasets, Spark SQL, Structured Streaming and Spark Machine Learning Library
A tutorial on the Apache Spark platform written by an expert engineer and trainer, this book will give you the fundamentals to become proficient in using Apache Spark and know when and how to apply it to your big data applications.
Duration: 5h 7m | Author: Hien Luu

Book

Practical Apache Spark: Using the Scala API
Following a learn-to-do-by-yourself approach to teaching Apache Spark using Scala, this book will help you learn the concepts, practice the code snippets in Scala, and complete the assignments given to get an overall exposure.
Duration: 1h 53m | Authors: Dharanitharan Ganesan, Subhashini Chellappan

Book

Next-Generation Big Data: A Practical Guide to Apache Kudu, Impala, and Spark
Utilize this practical and easy-to-follow guide to modernize traditional enterprise data warehouse and business intelligence environments with next-generation big data technologies.
Duration: 4h 13m | Author: Butch Quinto

Book

PySpark Recipes: A Problem-Solution Approach with PySpark2
Taking you on an interesting journey to learn about PySpark and big data, this book uses a problem-solution approach where every problem is followed by a detailed, step-by-step answer which will improve your thought process for solving big data problems with PySpark.
Duration: 3h 2m | Author: Raju Kumar Mishra

Book

Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka
Explaining each of the full-stack technologies and, more importantly, how to best integrate them, this book provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples in every situation.
Duration: 3h 56m | Authors: Isaac Ruiz, Raul Estrada

Book

Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark
Introducing use cases in each chapter from a specific industry, and using publicly available datasets from that domain to unravel the intricacies of production-grade design and implementation, this book walks you through end-to-end real-time application development using real-world applications, data, and code.
Duration: 4h 16m | Author: Zubair Nabi

Book

Spark: Big Data Cluster Computing in Production
With real-world production insight and expert guidance, tips, and tricks, this incredibly useful resource goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big data clustering in production.
Duration: 3h 35m | Authors: Brennon York, Ema Orhian, Ilya Ganelin, Kai Sasaki

Book

Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing
Helping you become a much sought-after Spark expert, this step-by-step guide shows you how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning.
Duration: 4h 56m | Author: Mohammed Guller

BOOKS INCLUDED

Book

Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka
Explaining each of the full-stack technologies and, more importantly, how to best integrate them, this book provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples in every situation.
Duration: 3h 56m | Authors: Isaac Ruiz, Raul Estrada

Book

Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark
Introducing use cases in each chapter from a specific industry, and using publicly available datasets from that domain to unravel the intricacies of production-grade design and implementation, this book walks you through end-to-end real-time application development using real-world applications, data, and code.
Duration: 4h 16m | Author: Zubair Nabi

Book

Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing
Helping you become a much sought-after Spark expert, this step-by-step guide shows you how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning.
Duration: 4h 56m | Author: Mohammed Guller

Book

Spark: Big Data Cluster Computing in Production
With real-world production insight and expert guidance, tips, and tricks, this incredibly useful resource goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big data clustering in production.
Duration: 3h 35m | Authors: Brennon York, Ema Orhian, Ilya Ganelin, Kai Sasaki

Book

PySpark Recipes: A Problem-Solution Approach with PySpark2
Taking you on an interesting journey to learn about PySpark and big data, this book uses a problem-solution approach where every problem is followed by a detailed, step-by-step answer which will improve your thought process for solving big data problems with PySpark.
Duration: 3h 2m | Author: Raju Kumar Mishra
