Course: Network Administrator to Site Reliability Engineer – Part 4: Site Reliability Engineer
Network Administrator
15 hours
English (US)

Product information

This comprehensive training covers various aspects of Site Reliability Engineering (SRE), including best practices for onboarding new team members, essential technical skills, and dealing effectively with operational overload. You'll explore methods for managing operational load, using support ticketing systems, and setting service level objectives. You'll also learn about 'toil' and its negative effects on team productivity, along with strategies for identifying and eliminating it.

You'll also gain insight into strategies for emergency planning, knowledge sharing, and writing effective postmortems. The importance of service level objectives and the attributes of successful teams are emphasized, encouraging a team-first approach and effective communication techniques. You'll improve your collaboration and communication skills by learning effective meeting management, pair programming, and the use of collaboration tools. In addition, you'll dig into project management metrics, software testing methodologies, fault analysis methods, and API monitoring strategies. You'll also explore SRE engagement models, Production Readiness Review processes, and scaling SRE to larger environments. Through case studies, you'll gain practical insight into applying SRE engagement models in real-world scenarios, building valuable skills to grow as a Site Reliability Engineer.

Course contents

Network Administrator to Site Reliability Engineer – Part 4: Site Reliability Engineer

15 hours

SRE Team Management: Scaling the Team

When adding a new site reliability engineer (SRE) to your team, it's important that the new member not only has the required skills but also receives the proper training. This allows the new SRE to fit into the team and get up to speed as quickly as possible. In this course, you'll learn about the best practices for onboarding a new SRE team member, including methods and tools that can be used during the onboarding process. Next, you'll explore the technical skills that an SRE requires, including the ability to reverse engineer an application to determine the root cause of a problem. Finally, you'll examine the skills and knowledge an SRE requires when on-call, including those needed to provide support and manage support issues.

SRE Team Management: Managing Operational Loads

To ensure and maintain a system's functional state, site reliability engineers (SREs) must learn how to identify, calculate, and manage a system's operational load, which generally falls into three categories: ongoing operational activities, tickets, and pages. In this course, you'll explore these categories in detail. You'll start by outlining methods for managing operational loads at the team level and using support ticketing systems and service level objectives. Next, you'll investigate 'toil,' a term for the manual, repetitive operational work tied to running and maintaining a production service. You'll outline steps for identifying, calculating, and eliminating toil and examine the adverse effects toil can have on a team. Additionally, you'll outline how to work with interrupts and distinguish between crucial metrics used for managing them. Lastly, you'll identify the human factors to consider when dealing with interrupts, including efficiency, distractibility, and respect.
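
As a rough illustration of the toil calculation mentioned above, the following Python sketch computes the share of logged hours spent on operational work. The `WorkEntry` type, the category names, and the 50% threshold are assumptions made for this example, not material from the course. Counting all three operational load categories as toil is also a simplification.

```python
from dataclasses import dataclass

# Hypothetical categories: the three operational load types named above,
# plus "project" for engineering work that is not operational load.
OPERATIONAL_CATEGORIES = {"pages", "tickets", "ongoing_ops"}

# Assumed threshold for this sketch: flag a week when more than half
# of the logged time went to operational work.
TOIL_CAP = 0.5


@dataclass
class WorkEntry:
    category: str  # "pages", "tickets", "ongoing_ops", or "project"
    hours: float


def toil_fraction(entries):
    """Return operational hours divided by all logged hours (0.0 if none)."""
    total = sum(e.hours for e in entries)
    if total == 0:
        return 0.0
    operational = sum(e.hours for e in entries if e.category in OPERATIONAL_CATEGORIES)
    return operational / total


# Example week for one engineer (made-up numbers).
week = [
    WorkEntry("pages", 4),
    WorkEntry("tickets", 10),
    WorkEntry("ongoing_ops", 6),
    WorkEntry("project", 20),
]

fraction = toil_fraction(week)
over_cap = fraction > TOIL_CAP
print(f"toil fraction: {fraction:.2f}, over cap: {over_cap}")
```

Tracking this fraction per engineer and per week is one simple way to see whether operational work is crowding out project work.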

SRE Team Management: Operational Overload

Site reliability engineers (SREs) are responsible for many administrative tasks, often splitting their time between reactive ops work and special projects. To keep a team from becoming overloaded, an SRE may be transferred into it to help prevent or mitigate the overload. In this course, you'll learn how to deal with operational overload. You'll start by examining ops mode, which is an approach used to ensure services are properly maintained and optimized. You'll discover factors that contribute to team morale and stress. In addition, you'll outline emergency planning strategies and best practices, as well as learn how to categorize emergencies and prepare detailed emergency plans. Next, you'll explore how knowledge sharing relates to emergency preparedness, the key to writing successful postmortems, the importance of service level objectives, and the level of detail required to properly explain your findings. Lastly, you'll discover the key factors and attributes of successful teams. You'll examine a team-first approach and differentiate between questioning techniques such as open/closed, funnel, probing, and leading.
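
To make the role of service level objectives more concrete, here is a minimal Python sketch of an error budget check for a request-based availability SLO. The term "error budget", the function name, and the sample numbers are assumptions introduced for this illustration and are not taken from the course description.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the unspent share of the error budget for a request-based SLO.

    1.0 means no budget has been used, 0.0 means it is exactly spent,
    and a negative value means the SLO has been missed.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The SLO allows this many failed requests in the measured window.
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


# Made-up example: a 99.9% SLO over one million requests allows about
# 1,000 failures; 250 failures leave roughly three quarters of the budget.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"error budget remaining: {remaining:.0%}")
```

A figure like this can give a postmortem a shared, quantitative way to state how much reliability headroom an incident consumed.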

Core Skills for Site Reliability Engineers: SRE Collaboration & Communication

Collaboration is key to getting the most out of your team and ensuring your clients receive their desired service. In this course, you'll learn to collaborate and communicate effectively as an SRE. You'll learn how to run traditional and virtual meetings to ensure maximum effectiveness and productivity, whether with customers, internal or external team members, or distributed teams. You'll examine how to plan, carry out, and review meetings using best practices and sufficient preparation, tailoring these methods to suit the participants and the end goal. You'll delve into the unique characteristics of different meeting types, such as those for problem-solving or innovation. You'll explore the advantages and challenges of SRE pair programming. You'll then end the course by investigating some helpful collaboration and communication tools.

SRE Metric Management: Software Reliability Metrics

To improve the chances of creating, monitoring, and maintaining a successful software development project, site reliability engineers and all team members must be aware of which metrics to measure. They also need a working knowledge of both automated and manual testing methods. In this course, you'll learn how to manage and select SRE metrics and how various testing methods work. You'll begin by learning what metrics need to be measured for project management, software development, and APIs - examining in detail CI/CD, cloud API, and software project metrics, to name a few. Next, you'll compare both manual and automated testing methods and the goals of each. Lastly, you'll investigate automated testing frameworks and platforms, test cases and types, and best practices and pitfalls to consider.

SRE Metric Management: Software Reliability Monitoring and Reporting

Once SRE metrics have been identified, site reliability engineers (SREs) must know how to perform fault analysis on a system, classify defects, and monitor and report data. In this course, you'll explore the tools and best practices for carrying out these procedures. You'll begin by identifying various fault analysis methods and tools. You'll then classify software defects and bugs with a focus on severity and priority. Next, you'll investigate strategies for monitoring APIs and explore some tools used for this task. You'll then examine in detail several tools for collecting, analyzing, and reporting metric data using a customizable dashboard, including those that comprise the ELK Stack - Elasticsearch, Logstash, and Kibana. Furthermore, you'll explore the data collection tool Beats and the beneficial use cases for Elasticsearch notifications.
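
The distinction between severity (how much damage a defect does) and priority (how urgently it should be fixed) can be sketched in a few lines of Python. The level names, the `Defect` type, and the ordering rule below are assumptions chosen for this illustration. Real teams define their own scales.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Impact of the defect on the system (lower value = worse)."""
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3


class Priority(IntEnum):
    """Urgency of the fix (lower value = sooner)."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class Defect:
    title: str
    severity: Severity
    priority: Priority


def triage_order(defects):
    """Sort by priority first, then by severity as a tie-breaker."""
    return sorted(defects, key=lambda d: (d.priority, d.severity))


# Made-up backlog showing that severity and priority can diverge.
backlog = [
    Defect("Typo on landing page banner", Severity.MINOR, Priority.HIGH),
    Defect("Crash in rarely used export", Severity.CRITICAL, Priority.LOW),
    Defect("Checkout API returns 500", Severity.CRITICAL, Priority.HIGH),
]

for defect in triage_order(backlog):
    print(defect.priority.name, defect.severity.name, defect.title)
```

Keeping the two fields separate lets a low-impact but highly visible defect be scheduled ahead of a severe one that few users ever hit.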

SRE Engagement: Production Readiness Review

Production Readiness Review (PRR), the standard first step of SRE engagement, and its phases are used to identify a service's reliability needs. The concept of "early engagement" is then used to evolve the Simple PRR model. In this course, you'll investigate SRE engagement, early engagement, and Production Readiness Review. You'll start by delving into each phase of the SRE Production Readiness Review (PRR) model, namely, engagement, analysis, refactoring, training, onboarding, and continuous improvement. Next, you'll learn how early engagement can be used to evolve the Simple PRR model. You'll then examine how SRE platforms and frameworks can provide structural solutions. Finally, you'll learn how to use the SRE engagement model to manage software projects, comparing it to the traditional System Development Life Cycle (SDLC) model.

SRE Engagement: The SRE Engagement Model

The SRE engagement model and SRE service lifecycle have noteworthy similarities to and differences from the traditional software development life cycle. In this course, you'll explore these differences and investigate the SRE engagement model's components and how to work with it in various circumstances. You'll learn the steps for setting up and building SRE service relationships and establishing a roadmap for sprints and communication. You'll examine how to measure the impact of SRE engagement, set ground rules for SRE teams, and sustain effective relationships with other SREs and developers. Next, you'll study the steps to take for scaling SRE to larger environments and for ending an engagement. Lastly, you'll review case studies to see how others have used the SRE engagement model in real life.

Final Exam: Site Reliability Engineer

Final Exam: Site Reliability Engineer will test your knowledge and application of the topics presented throughout the Site Reliability Engineer track of the Skillsoft Aspire Network Admin to Site Reliability Engineer Journey.

Features

Instructor included
Prepares for official exam
English (US)
15 hours
Network Administrator
180 days of online access
HBO (higher professional education) level

More information

Target audience: System Administrator, Network Administrator

Prerequisites

No formal prerequisites are required. However, some prior knowledge of Site Reliability Engineering, networking, and DevOps is recommended.

It is also recommended to first complete Parts 1, 2, and 3 of the ‘Network Admin to Site Reliability Engineer’ learning track.

  • Part 1: Network Administrator
  • Part 2: DevOps Engineer
  • Part 3: Chaos Engineer

Outcome

After completing this training, you'll be ready to scale the SRE team, handle operational loads, communicate and collaborate effectively, manage software reliability metrics, and work with the SRE engagement model as a Site Reliability Engineer.

Positive feedback from students

Training: Leading the AI Transformation

Useful training. The ordering process went smoothly, and I could start right away.

- Mike van Manen

Unlimited Learning Subscription

I bought Unlimited Learning because you get a lot of value for your money. I have only been using it for a short time, but my first impression is good.

- Floor van Dijk

Training: Leading the AI Transformation

For years, icttrainingen.nl has been our trusted partner in knowledge development for our IT staff. We are glad that the icttrainingen.nl platform lets us offer our employees tailored training and a wide range of courses.

- Loranne, Team Lead at Inwork

How does it work?

1. Order the training

After you order the training, you receive a confirmation by e-mail.

2. Access the learning platform

The e-mail contains a link that gives you access to our learning platform.

3. Start right away

You can get started immediately. From now on, study where and when you want.

4. Complete the training

Complete the training successfully and receive a certificate from us!

Frequently asked questions

How can I pay?

You can pay with iDEAL, PayPal, credit card, Bancontact, or by invoice. If you pay by invoice, you can start the training as soon as the payment has been received.

How long do I have access to the training?

This varies per training, but it is usually 180 days. You can find the exact period in the features list for the training.

Where can I go if I have questions?

You can always reach our Learning & Development colleagues during office hours at support@icttrainingen.nl or by phone at 026-8402941.

Unlimited learning

With our Unlimited concept, you can make unlimited use of the training courses on the website for a fixed monthly fee.

Still have doubts?

Or just a question about the training? Don't keep it to yourself. We're happy to help. That's what we're here for!