Course details

Data Lake: Architectures & Data Management Principles



Overview/Description

Discover how to implement data lakes for real-time data management. Explore data ingestion, data processing, and data life-cycle management using AWS and open-source ecosystem products.



Expected Duration (hours)
0.6

Lesson Objectives

Data Lake: Architectures & Data Management Principles

  • implement Lambda and Kappa architectures to manage real-time big data
  • identify the benefits of adopting the Zaloni data lake reference architecture
  • describe data ingestion approaches and compare Avro and Parquet file format benefits
  • demonstrate how to ingest data using Sqoop
  • describe the data processing strategies provided by MapReduce V2, Hive, Pig, and YARN for processing data in data lakes
  • recognize how to derive value from data lakes and describe the benefits of critical roles
  • describe the steps involved in the data life cycle and the significance of archival policies
  • implement an archival policy to transition between S3 and Glacier, depending on adopted policies
  • ingest data using Sqoop and implement an archival policy to transition from S3 to Glacier, depending on adopted policies

Course Number
it_dsdlipdj_02_enus

Expertise Level
Intermediate