SRE Weekly Issue #311

I’m dedicating this issue to the people of Ukraine, and also to those in Russia who are protesting the invasion.

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating the incident channel, Jira ticket, and Zoom meeting, paging the right team, building the postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt):


In this episode of the podcast Page it to the Limit, they discuss learning how to be an incident commander.

There was a major AWS outage and the second day I was incident command.

  Kat Gaines, with guest Iris Carrera — Page it to the Limit

This article discusses three aspects of fully owning your systems: mandate, knowledge, and responsibility. After defining those terms, it goes on to discuss what happens if one of the three is missing.

  Alex Ewerlöf

I really like the “Managing High RPS” section, especially the part about ignoring events if they’re too old to be relevant any longer.

  Ankush Gulati and David Gevorkyan — Netflix
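
To make the "too old to matter" idea concrete, here is a minimal sketch of dropping stale events before processing them. It is not taken from the Netflix article: the Event type, the 30-second cutoff, and the function names are all hypothetical, chosen only for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Event:
    payload: str
    created_at: float  # Unix timestamp in seconds (assumed field)


# Hypothetical cutoff; a real system would tune this per event type.
MAX_AGE_SECONDS = 30.0


def is_fresh(event, now, max_age=MAX_AGE_SECONDS):
    """Return True if the event is recent enough to still be worth handling."""
    return (now - event.created_at) <= max_age


def drop_stale(events, now=None):
    """Keep only fresh events; stale ones are silently skipped."""
    now = time.time() if now is None else now
    return [e for e in events if is_fresh(e, now)]
```

Skipping work that no longer matters frees capacity for the events that still do, which is most useful exactly when a backlog has built up.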

Cool idea! When a process is overloaded, the system drops requests based on heuristics until the overload condition has passed.

  Bryan Barkley — LinkedIn
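
For readers new to load shedding, here is a rough sketch of the general shape. It is not LinkedIn's implementation: the in-flight threshold, the "critical" flag, and the class name are assumptions standing in for whatever heuristics a real system would use.

```python
class LoadShedder:
    """Toy load shedder: rejects non-critical requests while overloaded."""

    def __init__(self, max_in_flight):
        # Hypothetical overload signal: number of requests in progress.
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def overloaded(self):
        return self.in_flight >= self.max_in_flight

    def try_start(self, critical=False):
        """Return True if the request is admitted, False if it is shed."""
        if self.overloaded() and not critical:
            return False
        self.in_flight += 1
        return True

    def finish(self):
        """Mark one admitted request as done, letting the overload clear."""
        self.in_flight = max(0, self.in_flight - 1)
```

Once enough in-flight requests finish, overloaded() returns False again and normal traffic is admitted without any manual reset.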

Here’s another take on incident severity and priority levels. The two terms are different and mean specific things.

  Robert Ross — FireHydrant

Can we please agree to stop calling them “postmortems”?

  Ash P — Cruform Newsletter

The term “service level” goes back to the US highway system maintenance procedures, among others.

  Akshay Chugh and Piyush Verma — Last9

Charity Majors has railed against metrics for years. Now, her company Honeycomb has a metrics product offering. How does she square it?

  Charity Majors — Honeycomb

Despite the December AWS outage, folks aren’t fleeing AWS, and multi-cloud designs for reliability still don’t make sense, according to this cloud consultant. The media angle is fascinating.

  Lydia Leong — Cloud Pundit

This article has a great list of ideas of who to talk to, plus a section on how to prioritize when you’re short on time.

  Daniela Hurtado — Jeli


Updated: June 1, 2022 — 9:37 pm