SRE Weekly Issue #378

A message from our sponsor, Rootly:

Curious how companies like Figma, Tripadvisor, and hundreds of others leverage Rootly to manage incidents in Slack and unlock instant best practices? Check out this lightning demo.


This is the story of a fascinating incident in which a commercial airplane’s engine was ripped off during takeoff (also covered on Mentour Pilot). What really struck me is the way a huge team on the ground and in the air assembled around the incident and all played very important roles in getting the plane down safely.

  Mark D. Young — PoliticsWeb

Time for another Catchpoint SRE Survey! They donate $5 to the Red Cross for every completed survey, so let’s all work together and drive a huge donation!


The US Federal Trade Commission (FTC) put out a request for information about cloud providers, covering reliability among other topics. Here’s Corey Quinn’s answer.

  Corey Quinn — The Duckbill Group

What can you do when running an incident feels like herding cats? This article has some tips.

  Robert Ross — FireHydrant

I have a confession. Despite having been hired multiple times in part due to my experience with monitoring platforms, I have come to hate monitoring.

This jaded tale also contains some good suggestions for dealing with monitoring pitfalls.

  Mathew Duggan

The cardinal rule of engineering:

your solution shouldn’t become your next problem.

  Kumar Amit — Mercari

Here’s the articlization of a talk Fred Hebert gave at QCon New York. The alternate title of the talk is:

This Is All Going To Hell Anyway
All We Can Do Is Influence How Long It’s Gonna Take

I had the pleasure of seeing a draft version of this talk at work, since (full disclosure) Fred is my coworker.

  Fred Hebert

This article makes the case that elastic scaling is both harder to implement and more important for use cases that stream updates to users in real time.

  Mittul Madaan — Ably

An intro to pdsh, my favorite of the tools that run commands on many hosts via SSH.

  Amin Astaneh — Certo Modo

Updated: June 25, 2023 — 9:11 pm
A production of Tinker Tinker Tinker, LLC