Data Lake Architectures & Data Management Principles

Big Data    |    Intermediate
  • 10 Videos | 38m 8s
  • Includes Assessment
  • Earns a Badge
Likes 9
A key component of wrangling data is the data lake framework. In this 10-video Skillsoft Aspire course, learners discover how to implement data lakes for real-time management. Explore data ingestion, data processing, and data lifecycle management with Amazon Web Services (AWS) and other open-source ecosystem products. Begin by examining real-time big data architectures and how to implement Lambda and Kappa architectures to manage real-time big data. View the benefits of adopting the Zaloni data lake reference architecture. Examine the essential approaches to data ingestion and the comparative benefits of the Avro and Parquet file formats. Explore data ingestion with Sqoop, and the data processing strategies provided by MapReduce V2, Hive, Pig, and YARN for processing data in data lakes. Learn how to derive value from data lakes and the benefits of critical roles. Learners will explore the steps involved in the data lifecycle and the significance of archival policies. Finally, learn how to implement an archival policy that transitions data between S3 and Glacier, depending on the adopted policies. Close the course with an exercise on ingesting data and implementing an archival policy.
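As a taste of the Sqoop ingestion covered in this course, the sketch below builds a `sqoop import` command that copies a relational table into HDFS as Parquet files. The JDBC URL, table name, and target directory are illustrative placeholders, not values from the course.

```python
# Hedged sketch: assembling a `sqoop import` invocation in Python.
# All connection details below are hypothetical examples.
from typing import List


def build_sqoop_import(jdbc_url: str, table: str, target_dir: str,
                       num_mappers: int = 4) -> List[str]:
    """Assemble the argument list for a `sqoop import` run that
    ingests a relational table into HDFS in Parquet format."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC connection string to the source database
        "--table", table,                   # source table to ingest
        "--target-dir", target_dir,         # HDFS directory for the imported files
        "--num-mappers", str(num_mappers),  # degree of parallelism (map tasks)
        "--as-parquetfile",                 # write the output as Parquet
    ]


# Example: the resulting list could be passed to subprocess.run(...)
# on a host where Sqoop and Hadoop are installed.
cmd = build_sqoop_import("jdbc:mysql://db.example.com/sales",
                         "orders", "/data/raw/orders")
```

Building the command as a list (rather than one shell string) avoids quoting issues when it is eventually executed.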

WHAT YOU WILL LEARN

  • implement Lambda and Kappa architectures to manage real-time big data
  • identify the benefits of adopting the Zaloni data lake reference architecture
  • describe data ingestion approaches and compare the benefits of the Avro and Parquet file formats
  • demonstrate how to ingest data using Sqoop
  • describe the data processing strategies provided by MapReduce V2, Hive, Pig, and YARN for processing data in data lakes
  • recognize how to derive value from data lakes and describe the benefits of critical roles
  • describe the steps involved in the data life cycle and the significance of archival policies
  • implement an archival policy to transition between S3 and Glacier, depending on adopted policies
  • ingest data using Sqoop and implement an archival policy to transition from S3 to Glacier based on adopted policies
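The archival-policy objective above can be sketched as an S3 lifecycle rule applied with boto3. The bucket name, the `raw/` prefix, and the 90-day threshold are illustrative assumptions, not values prescribed by the course.

```python
# Hedged sketch: an S3 lifecycle rule that transitions objects to Glacier.
# Prefix and day count are hypothetical; tune them to your own retention policy.
LIFECYCLE_RULES = {
    "Rules": [
        {
            "ID": "archive-raw-data",
            "Filter": {"Prefix": "raw/"},  # only archive objects under raw/
            "Status": "Enabled",
            "Transitions": [
                # Move objects to the GLACIER storage class after 90 days.
                {"Days": 90, "StorageClass": "GLACIER"}
            ],
        }
    ]
}


def apply_archival_policy(bucket_name: str) -> None:
    """Attach the lifecycle configuration above to a bucket via boto3."""
    import boto3  # imported here so the module loads even without boto3 installed
    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=LIFECYCLE_RULES,
    )
```

Once applied, S3 transitions matching objects automatically; no further application code is needed for the archival step.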

IN THIS COURSE

  1. Course Overview (2m 9s)
  2. Real-Time Big Data Architectures (4m 5s)
  3. Data Lake Reference Architecture (2m 11s)
  4. Data Ingestion and File Formats (4m 44s)
  5. Ingestion Using Sqoop (5m 55s)
  6. Data Processing Strategies (3m 42s)
  7. Deriving Value from Data Lakes (2m 32s)
  8. Data Life Cycle (2m 28s)
  9. S3 and Glacier (4m 4s)
  10. Exercise: Ingest Data and Implement Archival Policy (2m 19s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of this course, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
