Data Lakes on AWS
Amazon Web Services 2019 | Intermediate
- 12 Videos | 1h 9m 14s
- Includes Assessment
- Earns a Badge
This course discusses the transition of data warehousing to cloud-based solutions using the Amazon Web Services (AWS) cloud platform. In 12 videos, the course explores how data lakes store data using a flat structure in which the data are tagged, making them easy to search and query. You will learn how to build a data lake on the AWS cloud by storing data in S3 (Simple Storage Service) buckets, and how to set up your data lake architecture using AWS Glue, a fully managed ETL (extract, transform, load) service. You will learn to configure and run Glue crawlers, examine how crawlers merge data stored in an S3 folder path, and use the data in S3 to generate metadata tables in Glue. Learners will use Athena, Amazon's interactive query service, as a simple way to analyze data in S3 using standard SQL. Finally, you will examine how to merge the data crawled by the CSV (comma-separated values) crawler into a single table.
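As a rough illustration of the workflow described above (crawl an S3 folder with Glue, then query the result with Athena), here is a minimal Python sketch using boto3. It is not taken from the course itself. The crawler name, role ARN, database, bucket paths, and table name are all hypothetical placeholders, and the grouping policy shown is one way to ask a crawler to combine compatible files into a single table.

```python
import json
import time


def crawler_request(name, role_arn, database, s3_path, combine_schemas=True):
    """Build keyword arguments for boto3's glue.create_crawler call."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if combine_schemas:
        # Ask the crawler to group files with compatible schemas into one table.
        request["Configuration"] = json.dumps(
            {"Version": 1.0,
             "Grouping": {"TableGroupingPolicy": "CombineCompatibleSchemas"}}
        )
    return request


def athena_request(sql, database, output_location):
    """Build keyword arguments for boto3's athena.start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def crawl_then_query(crawler_kwargs, query_kwargs, poll_seconds=15):
    """Create and run the crawler, wait for it to finish, then start the query."""
    import boto3  # imported here so the builders above work without AWS access

    glue = boto3.client("glue")
    athena = boto3.client("athena")
    glue.create_crawler(**crawler_kwargs)
    glue.start_crawler(Name=crawler_kwargs["Name"])
    # Crawlers run asynchronously; the state returns to READY when the run ends.
    while glue.get_crawler(Name=crawler_kwargs["Name"])["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)
    return athena.start_query_execution(**query_kwargs)["QueryExecutionId"]


if __name__ == "__main__":
    # All names below are hypothetical placeholders, not values from the course.
    crawler = crawler_request(
        "csv-crawler",
        "arn:aws:iam::123456789012:role/GlueServiceRole",
        "datalake_db",
        "s3://example-bucket/csv/",
    )
    query = athena_request(
        "SELECT * FROM csv LIMIT 10",
        "datalake_db",
        "s3://example-bucket/athena-results/",
    )
    print(json.dumps(crawler, indent=2))
    print(json.dumps(query, indent=2))
```

Running the script only prints the request parameters. Calling `crawl_then_query(crawler, query)` would need AWS credentials, an existing Glue database, and a role with the appropriate permissions.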
WHAT YOU WILL LEARN
- configure a custom role with specific permissions on AWS
- create an S3 bucket and upload files
- recognize the different operations that can be performed using the AWS Glue console
- create metadata tables in Glue using the web console
- perform queries on the Glue data catalog using Athena
- perform data crawling on S3 to automatically detect schemas
- execute queries on data in crawled tables
- perform crawling operations with multiple files in the same path
- merge data stored in multiple files in the same folder path
- merge data when files have the exact same schema
- recall the roles and features of the different AWS services used in the data lake architecture
IN THIS COURSE
1. Course Overview (1m 37s)
2. Create a Role for the AWS Glue Service (7m 11s)
3. Upload Data to S3 (5m 41s)
4. Explore the Glue Web Console (3m 17s)
5. Manually Create Glue Tables (6m 18s)
6. Query the Data Lake Using Amazon Athena (6m 11s)
7. Configure and Run Glue Crawlers (9m 25s)
8. Access Data in Crawled Tables (3m 55s)
9. Crawl Multiple CSV Files in the Same Folder Path (6m 54s)
10. Merge Data in Multiple Files in the Same Folder Path (6m 59s)
11. Work with Files Having the Exact Same Schema (6m 47s)
12. Exercise: Data Lakes on AWS with S3 and Glue (5m)
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of this course. The badge can be shared on any social network or business platform.
Digital badges are yours to keep, forever.