SRE Weekly Issue #172

Articles

How the Boeing 737 Max Disaster Looks to a Software Developer – IEEE Spectrum

An experienced pilot and programmer details the background behind the 737 MAX’s MCAS and discusses the risks and motivations involved.

Boeing’s solution to its hardware problem was software.

Thanks to John Goerzen for this one.

Gregory Travis — IEEE Spectrum

Resilience Roundup – Cognitive Systems Engineering: New wine in new bottles

A detailed analysis of a paper by Eric Hollnagel and David Woods on designing systems that include humans and computers.

The operator detects failures better when he participates in system control as opposed to functioning only as a monitor…

Thai Wood (summary)

Failure is Familiar, Safety is Surprising

An essay on the difference in philosophies between Safety I and Safety II and on understanding how our systems succeed rather than focusing on how they fail.

Ryan Frantz

Microsoft to reduce Azure outages with Project Tardigrade

Azure’s Project Tardigrade is exploring interesting ideas, like keeping VMs resident in memory even when the host kernel reboots. This reminds me of another similarly-named project.

Chris Kanaracus — TechTarget

Anatomy of a Cascading Failure

This is a follow-up to an article from last week about a Honeycomb incident, going into more detail on what went wrong and how they figured it out using Honeycomb itself.

Douglas Soo — Honeycomb

Preventing Pipeline Calls from Crashing Redis Clusters

On Feb 15th, 2019, a slave node in Redis, an in-memory data structure storage, failed requiring a replacement.

[…]

This blog post describes Grab’s post-mortem findings for the outage caused by the Redis Cluster failure.

Michael Cartmell, Jiahao Huang, and Sandeep Kumar — Grab

How we optimized Magic Pocket for cold storage

I like how their chosen solution fetches from all the datacenters in the normal case, so they don’t experience a sudden shift in traffic pattern during a failover.

Preslav Le — Dropbox

Outages

GitHub
Gmail
Ankle Bracelets in the Netherlands
- These are the ankle bracelets used to monitor and enforce house arrest.
  
  the Dutch Ministry of Justice and Security had to step in and preemptively arrest and jail some of its most high-risk suspects
Reddit
Facebook and Instagram
