SRE Weekly Issue #226

A message from our sponsor, StackHawk:

When a team introduces security bugs, they don’t know because nothing tells them. We test for everything else… why not security bugs?


This is an article version of an interview with Dr. Danielle Ofri, author of a new book When We Do Harm, on NPR’s Fresh Air. I especially loved the part about near misses.

Bridget Bentz, Molly Seavy-Nesper, Deborah Franklin, Sam Briger, and Thea Chaloner — NPR

Maintenance of the logging system had unintended downstream effects including log loss and failure of the system that manages dynos.

In this incident, a TLS certificate was deployed without its intermediate, resulting in failures for some clients.

I wrote this after attending the Resilience Engineering Association’s webinar with panelists Dr. Richard Cook, John Allspaw, and Nora Jones, moderated by Laura Maguire. Once the recording is posted, I highly recommend watching it!

Lex Neva

As SREs, we need to be laser focused on the user’s experience. Our SLIs should reflect that.

Emily Arnott — Blameless

This two-part series is an in-depth look at how Twitter adopted SRE, before SRE was even a thing.



Updated: July 5, 2020 — 9:21 pm
A production of Tinker Tinker Tinker, LLC