Best article about post-incident investigations that I’ve seen in a while. My favorite part is the recommendation not to use a template for the retrospective, as it will artificially narrow the scope of the investigation.
These folks have set up a survey to gather information on whether and how people are compensated for on-call in IT. This topic has been gaining traction over the past couple of years, and I can’t wait to see the results. Please take a moment to fill it out.
Chris Evans and Spike Lindsey
I’ll be speaking at SREcon19 Americas this March with my former coworker, Courtney Eckhardt. The talk lineup looks incredible and I’m really excited to go!
Especially useful for folks new to on-call.
If you take only one thing away from this post, make it this: put your own well-being first. Once you do that, other aspects of on-call will become easier.
Dave Fennell — Hosted Graphite
I have to admit I wasn’t clear on two-phase commit before I read this. Now I know what it’s all about — and its drawbacks.
This guide from Google describes the qualities and practices of SRE teams at various levels, from beginner to advanced.
Gustavo Franco — Google
A good intro if you’re new around here.
Sylvia Fronczak — Scalyr