Spark for High-speed Big Data Analytics

Big Data | Beginner
  • 12 videos | 45m 51s
  • Includes Assessment
  • Earns a Badge
Rating: 4.5 (62 users)
Spark is an open-source, massively parallel, in-memory solution that allows you to run big data analytics pipelines at high speed. Use this course to learn how Apache Spark works and to gain an understanding of its architecture. As you progress, examine how industry leaders Uber and Alibaba use Spark, and recognize how it can add business value to data across many industries. Next, compare the functionality of Spark and Hadoop across use cases, identifying when Spark is the more advantageous choice. Finally, explore fundamental Spark characteristics, optimization techniques, and best practices. When you've completed this course, you'll have a solid theoretical understanding of how and when to use Apache Spark for specific big data analytics tasks.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Recognize how Spark offers an open-source, scalable, massively parallel, in-memory solution for analytics applications
  • Outline the two main components of the Spark architecture: resilient distributed dataset and directed acyclic graph
  • Describe how Spark is providing business value to Uber
  • Describe how Spark is providing business value to Alibaba
  • Describe how Spark is providing business value to the healthcare industry
  • Compare and name the main differences between Spark and Hadoop with respect to ease of use, latency, security, and cost
  • Specify in which scenarios and conditions Spark is a better choice than its alternatives
  • List the main features of Spark, such as loading behavior, file formats, parallelism, cache, and data skew
  • Name the most important performance optimization techniques in Apache Spark, such as file format selection, level of parallelism, and API selection
  • Name simple best practices when using Spark, like starting small or resolving skewness
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 1m 55s
  • 6m
  • 3.  Components of the Apache Spark Architecture
    4m 10s
  • 4.  Apache Spark Use Case: Uber Using Spark
    4m 52s
  • 5.  Apache Spark Use Case: Alibaba Using Spark
    4m 29s
  • 6.  Apache Spark Use Case: The Healthcare Industry
    3m 4s
  • 7.  Apache Spark vs. Hadoop
    3m 31s
    In this video, you'll compare and name the main differences between Spark and Hadoop with respect to ease of use, latency, security, and cost. You'll also learn that both Hadoop and Spark are popular choices in the marketplace.
  • 8.  Top Apache Spark Use Cases
    5m 23s
  • 9.  Apache Spark's Main Features
    4m 12s
  • 10.  Apache Spark Performance Optimization Techniques
    3m 42s
  • 11.  Apache Spark Best Practices
    3m 26s
  • 12.  Course Summary
    1m 10s

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
