SRE Weekly Issue #223

A message from our sponsor, StackHawk:

DevSecCon24 starts tonight at 10pm ET and runs for 24 hours. Tune in for great talks on building and deploying secure, resilient software. Grab free tickets at the link below, and visit StackHawk’s virtual booth to get a T-shirt.
https://www.eventbrite.com/e/devseccon24-virtual-conference-tickets-94550734793?discount=StackHawk20

Articles

I’ve used this technique in the past with a single-page app and a highly-cacheable API, to ensure stability even when the backend goes down.

Patrick Hamann

Full disclosure: Fastly is my employer.

Here’s a deep dive into how your CA’s certificate can affect your application’s reliability — at least in the eyes of your customers.

Scott Helme

Here’s Coinbase’s follow-up on their outage last week.

Michael de Hoog — Coinbase

Kyle Kingsbury recently did an analysis of PostgreSQL 12.3 and found that under certain conditions it violated guarantees it makes about transactions, including violations of the serializability transaction isolation level.

I thought it would be fun to use one of his counterexamples to illustrate what serializable means.

Lorin Hochstein

Failure mode and effects analysis (FMEA) is a decades-old method for identifying all possible failures in a design, a manufacturing or assembly process, or a product or service.

If you’ve been tasked with applying FMEA in your SRE work, this article will get you started.

Matthew Helmke

Outages

Updated: June 14, 2020 — 8:35 pm
A production of Tinker Tinker Tinker, LLC