SRE Weekly Issue #401

Maybe you’re thinking of skipping over “yet another article about blamelessness”? Don’t. This one has some great examples and stories and is well worth a read.

Michael Hart

5 SRE Confessions

I’m definitely guilty of a couple of these.

Code Reliant

Introducing The Debrief: A new podcast series from incident.io

New podcast relevant to our interests!

In this series, you’ll hear insightful conversations with engineers, product managers, co-founders and more, all about the debatable topic of incident management.

Luis Gonzalez — incident.io

A Spooky Performance Regression in AWS EBS Volumes

A puzzling performance regression in EBS volumes, seemingly reproducible across instances. Anyone else seeing anything like this?

Dustin Brown — dolthub

Scaling SRE Teams

This article presents a framework for scaling SRE teams by defining SRE processes, automating, and iterating.

Stelios Manioudakis — DZone

Alerts Should Work for You, Not the Other Way Around

Some tips on what makes a good alert and how to design your alerts to be actually useful, rather than just noise.

Leon Adato — Kentik

Multi-tiered SLOs

Why would you want multiple different targets for the same SLO? Read this one to find out.

Alex Ewerlöf

You don’t need CRDTs for collaborative experiences

Conflict-free Replicated Data Types are powerful, but they come with downsides, which this article explains, so it’d be great if we could avoid them when possible.

Zak Knill
