SRE Weekly Issue #409

A message from our sponsor, FireHydrant:

It’s time for a new world of alerting tools that prioritize engineer well-being and efficiency. The future lies in intelligent systems that are compatible with real life and use conditional rules to adapt and refine thresholds, reducing alert fatigue.
https://firehydrant.com/blog/the-alert-fatigue-dilemma-a-call-for-change-in-how-we-manage-on-call/

I’ve occasionally wondered what’s behind Slack’s /remind or “clear my away status after my vacation ends”. Now I know!

  Claire Adams

This article is an exploration of consistency and coordination in distributed systems, with lots of really interesting examples.

  Lorin Hochstein

Lots of good stuff in here, including infrastructure, monitoring, and incident management tools.

   saifeddine Rajhi

my first conference

Whew, way to dive into the deep end!

  Mike [surname unknown] — SREZone

This article explains why circuit breakers are especially useful in Lambda-based microservice architectures, and how to implement them using Step Functions.

   Satrajit Basu — DZone

Definitely some interesting (and spicy!) takes in this one.

  Code Reliant

When you’re at LinkedIn’s scale, building an automated abuse mitigation system means designing for high throughput. The answer: lots of caching.

  Amit Mathapati — LinkedIn

A short but thought-provoking article about where SREs belong in the management hierarchy, and why.

  Jamie Allen

Updated: January 28, 2024 — 9:07 pm
A production of Tinker Tinker Tinker, LLC