SKILL BENCHMARK

AWS Certified Machine Learning Specialty: Data Engineering Competency

  • 27m
  • 27 questions
The AWS Certified Machine Learning Specialty: Data Engineering Competency benchmark measures your ability to create data repositories for machine learning and to identify and implement data ingestion and data transformation solutions. It tests your skills in identifying data sources, choosing storage media, implementing data ingestion pipelines with Amazon Kinesis, and selecting data transformation solutions built on AWS Glue, Amazon EMR, or AWS Batch. A learner who scores high on this benchmark demonstrates the data engineering skills needed to implement machine learning solutions on AWS.

Topics covered

  • compare the functionalities of AWS Data Pipeline and AWS Glue
  • configure an AWS Data Pipeline application using the AWS console
  • define terminology used in machine learning and name typical approaches and workflows
  • define the use cases of Amazon S3 Access Points and name their main features and workflows
  • define the use cases of Amazon S3 Batch Operations and name their main features and workflows
  • define the use cases of Amazon S3 Intelligent-Tiering and name its main features and workflows
  • define the use cases of Amazon S3 Storage Lens and name its main features and workflows
  • describe how AWS Step Functions works and name the principles behind workflow design
  • describe how batch processing and analytics data engineering pipelines work
  • describe how Amazon Kinesis Data Firehose works
  • describe how data ingestion works and define a data pipeline
  • describe how data repositories and data warehouses are used
  • describe how global businesses are using Amazon S3 to tackle real-world problems
  • describe how real-time and video data engineering pipelines work
  • describe how to categorize data in Amazon S3 using buckets, partitions, and tags
  • describe how to transform data for processing
  • describe how to use Amazon Kinesis for data analytics tasks, such as real-time alerts and actions
  • describe the functionality of Amazon Kinesis and outline its features and workflows
  • describe the main stages of the data science pipeline (collect, store, transform, label, and optimize)
  • describe the ML capabilities of the AWS platform, the tools it offers, and real-world applications where they can be used
  • describe what Amazon S3 is used for and its main benefits
  • name major use cases of Amazon S3 and specify its role as the foundation for any data-related functionality of AWS
  • specify how to work with data streams in Amazon Kinesis
  • specify how to work with video streams in Amazon Kinesis
  • specify the capabilities of AWS ETL pipelines
  • work with AWS Step Functions to manage a batch job
  • work with the Amazon S3 Management Console to create buckets and use Storage Lens, Intelligent-Tiering, Access Points, Replication, and Batch Operations