SRE Weekly Issue #452

A message from our sponsor, FireHydrant:

Practice Makes Prepared: Why Every Minor System Hiccup Is Your Team’s Secret Training Ground.

https://firehydrant.com/blog/the-hidden-value-of-lower-severity-incidents/

The Lunch Exercise was my favorite part of the Blackrock3 training, and now Slack has adapted it for their ongoing training.

How Slack trains engineers in incident response by ordering lunch together.

  Scott Nelson Windels — Slack

Cloudflare runs programs written in their custom language Topaz in the hot path. They use formal verification in production(!) to ensure that the set of Topaz programs makes sense.

  James Larisch, Suleman Ahmad, and Marwan Fayed — Cloudflare

Distributed counting is a challenging problem in computer science. In this blog post, we’ll explore the diverse counting requirements at Netflix, the challenges of achieving accurate counts in near real-time, and the rationale behind our chosen approach, including the necessary trade-offs.

  Rajiv Shringi, Oleksii Tkachuk and Kartik Sathyanarayanan — Netflix

Building a chat system is hard, and this article explains why in excellent detail. It also includes a discussion of options to consider when designing one.

  Ably

In anticipation of https://aws-news.com’s busiest period of the year, I redesigned the API access patterns to support very effective caching. This resulted in significantly reduced backend load and a much faster frontend.

  Luc van Donkersgoed — AWS News

Recover means that not only is everything back online, but the system is performing well and satisfying any QoS or SLAs AND a preventative approach has been implemented.

  Will Searle — Causely

Here’s a list of recommended talks for SREs attending re:Invent, with short descriptions explaining why they’re interesting.

  Jamie Baker

In this post, I’ll share exactly how we link our code to the team that owns it, so errors and alerting are routed to the right place with minimal maintenance burden.

  Martha Lambert — incident.io

Updated: November 24, 2024 — 9:32 pm
A production of Tinker Tinker Tinker, LLC