SRE Weekly Issue #169

Articles

Boeing’s crisis: AOPA safety expert weighs in

My coworker pointed me toward this article, and we had a really great conversation about it. I shared another article that I’d previously linked here, and it hit me: Boeing (and the FAA?) assumed MCAS was fine because a failure in it would look like a normal kind of failure with an established recovery procedure.

The problem is, we’ve seen that the recovery procedure can fail if the plane is moving so fast toward the ground that the pilots can’t physically pull it out of a dive. And it seems possible that no one knew that the recovery mechanism had this fatal vulnerability. This has all the hallmarks of a classic complex failure.

Thanks to John Goerzen for this one.

Richard McSpadden — AOPA

Colm MacCárthaigh on Twitter: Heartbleed

Pretty much any thread by Colm MacCárthaigh is a great read.

I think right around this minute is just about exactly 5 years since the Heartbleed vulnerability in OpenSSL became public. I remember the day vividly, and if you’re interested, allow me to tell you about how the day, and the subsequent months, and years unfolded …

Colm MacCárthaigh

A New Bee’s First Oncall

Find out why going on call made sense for a Developer Advocate and how it went.

Liz Fong-Jones — Honeycomb

Some internet outages predicted for the coming month as ‘768k Day’ approaches

As the BGP route table grows, some devices will soon run out of space to store it all.

Catalin Cimpanu

Minimising the Risk of Data Damage

The risk of logical damage to the data in a DB is the kind of risk that means there’s no such thing as a true rollback (see “You Can’t Have a Rollback Button”).

Benji Weber

Peering into the future of Resilience Engineering in Tech

Our field is evolving toward adopting resilience engineering, and it’s not an easy process. This post goes into some detail on the mental struggle and points in the direction we need to go to get there.

Will Gallego [Note: Will is my coworker]

Outages

Gmail Suffers Two-Hour Global Outage: Reports 04/18/2019
Google OAuth
- Seems like this may have effectively taken down Gmail.
Grindr
1&1 Ionos
