Data Lakes on AWS

Amazon Web Services 2019    |    Intermediate
  • 12 Videos | 1h 14m 14s
  • Includes Assessment
  • Earns a Badge
This course discusses the transition of data warehousing to cloud-based solutions on the Amazon Web Services (AWS) cloud platform. In 11 videos, the course explores how data lakes store data in a flat structure and tag it, which makes the data easy to search and query. You will learn how to build a data lake on the AWS cloud by storing data in Amazon S3 (Simple Storage Service) buckets, and how to set up your data lake architecture using AWS Glue, a fully managed ETL (extract, transform, load) service. You will learn to configure and run Glue crawlers, examine how crawlers merge data stored in an S3 folder path, and use data in S3 to generate metadata tables in Glue. Learners will use Athena, Amazon's interactive query service, as a simple way to analyze data in S3 using standard SQL. Finally, you will examine how to merge the data crawled by the CSV (comma-separated values) crawler into a single table.
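
As a rough illustration of the workflow described above (crawl an S3 path with Glue, then query the resulting table with Athena), here is a minimal Python sketch using boto3 request shapes. It is not taken from the course. The crawler name, role ARN, database name, bucket paths, and table name are all hypothetical placeholders, and the `run` function only works with boto3 installed and valid AWS credentials.

```python
def crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue CreateCrawler API call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def athena_request(sql, database, output_location):
    """Build keyword arguments for the Athena StartQueryExecution API call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run(crawler_kwargs, query_kwargs):
    """Send both requests to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builders above work without it

    glue = boto3.client("glue")
    glue.create_crawler(**crawler_kwargs)
    glue.start_crawler(Name=crawler_kwargs["Name"])
    # In practice, wait for the crawler to finish before querying its tables.
    athena = boto3.client("athena")
    return athena.start_query_execution(**query_kwargs)["QueryExecutionId"]


# Hypothetical values, for illustration only.
crawler_kwargs = crawler_request(
    "csv-crawler",
    "arn:aws:iam::123456789012:role/GlueDemoRole",
    "demo_db",
    "s3://example-data-lake/csv/",
)
query_kwargs = athena_request(
    "SELECT * FROM csv LIMIT 10",
    "demo_db",
    "s3://example-data-lake/athena-results/",
)
```

Keeping the request builders separate from the network calls makes it possible to inspect the parameters without touching an AWS account.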

WHAT YOU WILL LEARN

  • configure a custom role with specific permissions on AWS
  • create an S3 bucket and upload files
  • recognize the different operations that can be performed using the AWS Glue console
  • create metadata tables in Glue using the web console
  • perform queries on the Glue data catalog using Athena
  • perform data crawling on S3 to automatically detect schemas
  • execute queries on data in crawled tables
  • perform crawling operations with multiple files in the same path
  • merge data stored in multiple files in the same folder path
  • merge data when files have the exact same schema
  • recall the roles and features of the different AWS services used in the data lake architecture
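
To make the idea of merging files that share a schema more concrete, the following Python sketch groups CSV files from one folder by their header row and concatenates the rows of each group into a single table. This is a simplified illustration using only the standard library, not Glue's actual crawler algorithm. The file names and contents are made up for the example.

```python
import csv
import io
from collections import defaultdict


def merge_by_schema(files):
    """Group CSV contents by header row and concatenate rows per group.

    `files` maps a file name to its CSV text. Returns a dict mapping each
    distinct header (as a tuple of column names) to the combined data rows.
    """
    tables = defaultdict(list)
    for name in sorted(files):
        rows = list(csv.reader(io.StringIO(files[name])))
        if not rows:
            continue  # skip empty files
        header, data = tuple(rows[0]), rows[1:]
        tables[header].extend(data)
    return dict(tables)


# Hypothetical folder contents: two files share a schema, one does not.
folder = {
    "sales_jan.csv": "id,amount\n1,10\n2,20\n",
    "sales_feb.csv": "id,amount\n3,30\n",
    "customers.csv": "id,name\n1,Ana\n",
}
merged = merge_by_schema(folder)
```

Here the two sales files end up in one table because their headers match exactly, while the customers file forms a table of its own.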

IN THIS COURSE

  1. Course Overview (1m 37s)
  2. Create a Role for the AWS Glue Service (7m 11s)
  3. Upload Data to S3 (5m 41s)
  4. Explore the Glue Web Console (3m 17s)
  5. Manually Create Glue Tables (6m 18s)
  6. Query the Data Lake Using Amazon Athena (6m 11s)
  7. Configure and Run Glue Crawlers (9m 25s)
  8. Access Data in Crawled Tables (3m 55s)
  9. Crawl Multiple CSV Files in the Same Folder Path (6m 54s)
  10. Merge Data in Multiple Files in the Same Folder Path (6m 59s)
  11. Work with Files Having the Exact Same Schema (6m 47s)
  12. Exercise: Data Lakes on AWS with S3 and Glue (5m)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of this course. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
