Maybe you’re thinking of skipping over “yet another article about blamelessness”? Don’t. This one has some great examples and stories and is well worth a read.
I’m definitely guilty of a couple of these.
New podcast relevant to our interests!
In this series, you’ll hear insightful conversations with engineers, product managers, co-founders and more, all about the debatable topic of incident management.
Luis Gonzalez — incident.io
A puzzling performance regression in EBS volumes, seemingly reproducible across instances. Anyone else seeing anything like this?
Dustin Brown — dolthub
This article presents a framework for scaling SRE teams by defining SRE processes, automating, and iterating.
Stelios Manioudakis — DZone
Some tips on what makes a good alert and how to design alerts that are actually useful, rather than just noise.
Leon Adato — Kentik
Why would you want multiple different targets for the same SLO? Read this one to find out.
Conflict-free Replicated Data Types (CRDTs) are powerful, but they have downsides, which this article explains, so it’d be great if we could avoid them when possible.