SRE Emergency & Incident Response: Responding to Emergencies
SRE | Intermediate
- 18 videos | 1h 12m 46s
- Includes Assessment
- Earns a Badge
Site Reliability Engineers (SREs) are responsible for assigning the appropriate resources and responsibilities to deal effectively with unexpected emergencies. To do this, SREs should ensure the proper processes and teams are in place before an emergency occurs. In this course, you'll explore the different emergency types and outline how to plan for them. You'll examine the causes of test-induced, change-induced, and process-induced emergencies, how to respond to each, and what's involved in proactive approaches to emergency testing and planning. You'll then outline the critical steps to correctly documenting emergencies, including keeping a history of outages and mistakes. Next, you'll differentiate between business continuity and disaster recovery planning, outline how to create both types of plans, and learn how to conduct a business impact analysis. Lastly, you'll explore some IT recovery strategies.
WHAT YOU WILL LEARN
- discover the key concepts covered in this course
- outline the fundamental emergency response principles SREs need to be familiar with and recognize the critical steps to take when a system breaks
- recognize the benefits of performing test-induced emergencies and outline what this involves
- name the causes and outcomes of change-induced emergencies and outline how to respond to these emergencies
- define what is meant by a process-induced emergency, describe the effects of them, and outline how to respond to them
- describe why it is vital to keep a history of outages and mistakes and outline best practices when doing so
- recognize the importance of asking important, relevant, and challenging questions
- define what is meant by proactive testing, compare it to reactive testing, recognize the importance of encouraging proactive testing, and name best practices when carrying out this type of testing
- define what is meant by business continuity and describe why this type of planning matters
- outline the six steps involved in developing a business continuity plan
- outline methods to test a business continuity plan, recognize the importance of testing this type of plan, and describe some tips when testing
- recognize the importance of ongoing efforts to review and improve a business continuity plan and outline how to go about doing it
- recognize the importance of having 'top-level' support for business plans and promoting user awareness, and outline how to achieve these goals
- define what is meant by a business impact analysis, outline how to conduct one and its typical structure, and name the possible effects on business operations
- recognize the importance of developing an IT disaster recovery plan, list the goals of this type of plan, and describe what to consider when developing one
- outline key steps to creating a working disaster recovery plan
- name some types of IT recovery strategies and recognize the importance of recovery strategies developed for IT systems, applications, and data
- summarize the key concepts covered in this course
IN THIS COURSE
1. Course Overview | 1m 47s
2. Emergency Response | 5m 39s
3. Test-induced Emergencies | 3m 31s
4. Change-induced Emergencies | 4m 24s
5. Process-induced Emergencies | 3m 16s
6. Documenting Incidents | 3m 29s
7. Open-ended Questions | 4m 31s
8. Proactive Testing | 5m 10s
9. Business Continuity | 4m 8s
10. Developing a Business Continuity Plan | 4m 55s
11. Testing a Business Continuity Plan | 3m 53s
12. Improving a Business Continuity Plan | 3m 49s
13. Business Continuity Plan Awareness | 3m
14. Business Impact Analysis (BIA) | 4m 41s
15. Disaster Recovery Planning | 7m 2s
16. Creating a Disaster Recovery Plan | 4m 7s
17. IT Recovery Strategies | 4m 5s
18. Course Summary | 1m 19s
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.