SRE Weekly Issue #406

A message from our sponsor, FireHydrant:

Signals is now available in beta. Sign up to experience alerting for modern DevOps teams: Page teams, not services. Ingest inputs from any source. Bucket pricing based on usage. And one platform — ring to retro — finally. https://firehydrant.com/blog/signals-beta-live/

This article describes how to clearly demonstrate the value you deliver to a tech company when your work focuses on non-functional requirements such as operability, performance, or reliability.

  Amin Astaneh — Certo Modo

Doggedly preventing a recurrence of an incident may not be the best way to protect our systems — and may in fact make things worse.

  Lorin Hochstein

Should your SLO cover a rolling 30 days? 7 days? A calendar month?

  Alex Ewerlöf

Threads was built in five months and had over 100 million users in its first week.

  Laine Campbell and Chunqiang (CQ) Tang — Meta

This article is full of advice on setting up an on-call process that’s livable and less likely to burn folks out.

  incident.io

A pilot violated a major aviation principle, and it was the right move. It’s very interesting to me that pilots are trained on the principle but not on the exceptions, with the expectation that they will react well in exceptional circumstances.

  Admiral Cloudberg

Integer IDs or UUIDs as your DB primary key? I can’t count the number of incidents I’ve been involved in where integer primary keys played a part.

  Bertrand Florat

Updated: January 7, 2024 — 9:32 pm
A production of Tinker Tinker Tinker, LLC