SRE Weekly Issue #225

A message from our sponsor, StackHawk:

Application security is shifting to a model where the engineers who write the code also take ownership of the security. Read our docs to learn more about how StackHawk makes that happen.
https://docs.stackhawk.com?utm_source=SREWeekly

Articles

This suggests an upcoming shift in our field:

50 percent of SREs believe they will be working remotely post COVID-19, as compared to only 20 percent prior to the pandemic.

Kameerath Kareem — Catchpoint

BONUS CONTENT: For an outside take on the survey results, see Mike Vizard's write-up at DevOps.com.

No one person can (or should) know everything. How do we allocate expertise and build connections in order to maximize resilience and adaptive capacity?

Will Gallego

A new feature was accidentally rolled out to too wide an audience, causing log message loss.

Heroku

[…] one slow block device can affect the performance of processes even when those processes don’t use the slow block device.

Kalyanasundaram Somasundaram — LinkedIn

Should you count scheduled maintenance against your error budget? It depends.

Jesus Climent — Google
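
To make the "it depends" concrete, here is a minimal sketch of the arithmetic, not taken from the linked article. It assumes a 99.9% availability SLO over a 30-day window, and the function names and the downtime figures are hypothetical, chosen only for illustration.

```python
# Hypothetical helper names; the SLO, window, and downtime figures are assumptions.

def error_budget_minutes(slo: float, window_minutes: float) -> float:
    """Allowed downtime (in minutes) for an availability SLO over a window."""
    return (1.0 - slo) * window_minutes


def remaining_budget_minutes(
    slo: float,
    window_minutes: float,
    unplanned_minutes: float,
    maintenance_minutes: float,
    count_maintenance: bool,
) -> float:
    """Budget left after subtracting downtime; maintenance is optional."""
    spent = unplanned_minutes
    if count_maintenance:
        spent += maintenance_minutes
    return error_budget_minutes(slo, window_minutes) - spent


WINDOW = 30 * 24 * 60  # 30 days expressed in minutes

# Same month, two accounting policies: 20 min unplanned, 30 min scheduled.
counted = remaining_budget_minutes(0.999, WINDOW, 20, 30, count_maintenance=True)
excluded = remaining_budget_minutes(0.999, WINDOW, 20, 30, count_maintenance=False)

print(f"budget: {error_budget_minutes(0.999, WINDOW):.1f} min")
print(f"remaining if maintenance counts: {counted:.1f} min")
print(f"remaining if maintenance is excluded: {excluded:.1f} min")
```

Under these assumed numbers the same month either overspends the budget or leaves about half of it, so the accounting policy matters as much as the downtime itself.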

An investigation in response to three incidents led to this stark conclusion about Cassandra’s “counter columns” feature:

In fact, they don’t appear to have any properties that make them a useful primitive for building predictable distributed systems.

Paddy Byers — Ably

This article explains why we should have cost data at our fingertips as we design cloud-based systems.

[…] a well-architected system is often a cost-efficient system.

CloudZero

This is a new concept to me, and I really like it:

Capacity for maneuver (CfM) is a measure of how much adaptability or room to respond to a new challenge that a given part of the system has, whether a person or autonomous agent.

Amir B. Farjadian, Benjamin Thomsen, Anuradha M. Annaswamy, and David D. Woods (original paper)

Thai Wood — Resilience Roundup (summary)

Outages

Updated: June 28, 2020 — 8:31 pm