Articles
Here are four of the lessons I learned that should help you build a successful SRE organization.
- Focus on Developer Training
- Focus on the Right Abstractions
- Focus on Self Service
- Automate Yourself Out of a Job
Sven Hans Knecht
In this blog post, we’ll talk about two incident management structure models, distributed and centralized, including the pros and cons of each and examples of what each structure looks like in our community.
Robert Ross — FireHydrant
The Rasmussen model conceptualizes the limits of a system along 3 boundaries: Cost, System Performance, and Human Capacity.
Nishant Modak — Last9
Wow, this is a really interesting incident. It has all the hallmarks of a nightmare sev1: time pressure, an unknown problem, inventing new procedures on the spot, multiple teams and specialties having to work together, etc.
Jorg Wenninger — CERN
What do you do when many engineers all need to take the same day off each week for religious reasons?
TimeWeSp
Toyota recently halted production in their factories due to a problem in their order system, about which they shared some interesting details.
Toyota
Here’s a guidebook on how to handle being the first SRE at a company.
Sven Hans Knecht