AWS DevOps Engineer Professional 2021: Troubleshoot & Restore Operations

Amazon Web Services 2022    |    Intermediate
  • 12 Videos | 1h 23m 3s
  • Includes Assessment
  • Earns a Badge
Troubleshooting issues and restoring operations are critical parts of AWS incident and event response. In this course, you will begin by learning how to narrow down unhealthy components, mitigate the impact of increased load, and understand and respond to the causes and impacts of failure. You will then learn how to investigate logged events and correlate them with application components, and explore incident domains and Identity and Access Management (IAM) policies. Next, you will identify operational best practices along with rollback and backup procedures. Finally, you will recognize the benefits of monitoring and observability and learn how to use metrics and alarms. This course is one of a collection that prepares learners for Domain 5: Incident and Event Response of the AWS Certified DevOps Engineer - Professional (DOP-C01) exam.

WHAT YOU WILL LEARN

  • discover the key concepts covered in this course
  • identify the different incident domains and indicators of cloud security events
  • outline how to implement health checks and narrow down unhealthy components
  • outline identity and access management (IAM) policies and how to implement them with CloudWatch
  • outline the best operational practices for backup, logs, and load balancing
  • recognize the causes and impacts of a failure in deployment and operations
  • investigate and associate logged events with application components
  • outline rollback operations and evaluate a failure to restore operations
  • recognize the benefits of observability and monitoring and identify when and how to use them
  • outline how to query log data, monitor events, and archive log data
  • monitor operations using metrics and alarms and identify types of metrics and alarms
  • summarize the key concepts covered in this course

IN THIS COURSE

  1. Course Overview (1m 43s)
  2. Incident Domains (6m 21s)
  3. Health Checks and Unhealthy Components (9m 4s)
  4. Identity and Access Management (IAM) Policies (9m 17s)
  5. Operational Best Practices (8m 14s)
  6. Causes and Impact of Failure (9m 10s)
  7. Investigating Logged Events (7m 58s)
  8. Restoring Operations and Rollback Procedures (7m 30s)
  9. Observability and Monitoring (8m 19s)
  10. CloudWatch Logs (6m 28s)
  11. Monitoring with CloudWatch Metrics (8m 9s)
  12. Course Summary (50s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of this course. You can share the badge on any social network or business platform.

Digital badges are yours to keep, forever.