In incident management, as in so many areas, there’s the shiny work and the unglamorous but critical work, and the latter often falls to women. This article seeks to reverse that trend by reminding us of the incredibly important glue work women have been doing since the dawn of computing.
Emily Arnott — Blameless
I love stories about applying IT incident response processes to non-IT incidents.
Robert Ross — FireHydrant
Dear reader, perhaps you would enjoy reading this article on the many benefits of engineering blogs… then go write more great content and send me a link. :D
New York Times — Jordan News
Okay, this isn’t exactly an SRE story, but it sounds really familiar. It’s a story of “user error” that’s really about designing systems to help users catch errors.
Jakub Roztocil — httpie
nginx has a pretty nifty zero-downtime restart system, but it didn’t quite fit Cloudflare’s needs.
Maciej Lechowski — Cloudflare
This article does a great job of summarizing SRECon Americas by pulling out five major themes that ran through multiple talks.
Gavin Cahill — Gremlin
Building buy-in is everything.
[…] the key function of SRE being to help shape engineering’s perception of reality rather than act as a gatekeeper.
By “FinOps”, they mean a team in your company dedicated to reducing cloud computing costs. Does that really help?
[…] it is also possible to create incident writeups that engineers choose to read, that clearly describe and highlight difficult and poorly-understood aspects of our systems, and that become part of the organisation’s collective understanding.
Laura Nolan — Container Solutions
Years after we both started doing the newsletter thing, I finally sat down with Corey Quinn for an episode of his podcast. We talked about running newsletters, my other side project, and of course, reliability.
Corey Quinn — Last Week In AWS