SRE Weekly Issue #355

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒.

Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?


I’m trying something new: I’m looking for input from you, dear readers!

This link is a Google Form where I’m asking for ideas that I might turn into a blog post or conference talk. If you’re game, I’d love to hear what you think.

Here’s the panel for this webinar:

  • Vanessa Huerta Granda (Jeli)
  • Emily Ruppe (Jeli)
  • Liz Fong-Jones (Honeycomb)
  • Fred Hebert (Honeycomb)

Honestly, with that set of names, I’d listen even if they were just discussing the weather.
  Full disclosure: Honeycomb, my employer, is mentioned.

This week saw an outage of the NOTAM system, which disseminates important information to aircraft pilots in the US. As a result, all flights in the US were grounded.

There’s not much in the way of interesting detail available yet, but I did see a mention of this air incident in which NOTAMs played a significant part. Mentour Pilot also covered this one.

  Admiral Cloudberg

In essence, this new reliability is:

  1. The health of your system
  2. Weighed based on customer expectations and happiness
  3. Prioritized based on your current capabilities

This article focuses on the sociotechnical aspects of reliability.

  Jim Gochee — The New Stack

Here are some guidelines for what kind of alerting works best for services at various stages of maturity.

  Ali Sattari

The actions we take to avert a potential problem can introduce their own risks.

  Will Gallego

This one’s from the folks.

I often meet with skepticism when I say that server monitoring systems should only page when a service stops doing its work.

Read on to find out why.

  Dan Slimmon

Updated: January 15, 2023 — 9:02 pm
A production of Tinker Tinker Tinker, LLC