SRE Weekly Issue #267

A message from our sponsor, StackHawk:

Serverless doesn’t mean secure. Use modern security testing tools to assess serverless applications for vulnerabilities during development.


Yet more proof that DNS behavior varies way more than is obvious at first glance. Who the heck thought longest common prefix matching was a good idea?

Charles Li — eBay

The application may log multiple lines during the lifecycle of a request. Stripe has found it invaluable to also log one final line with a fully summary of the request.

Brandur Leach — Stripe

This is a followup with more detail on the G-Suite outage I reported here last week. A database issue caused two separate outages.


Really great advice about 3 common pitfalls in implementing SL*s.


This research paper explores the marginal boundary, a set of conditions beyond which a system enters a different operating mode and an accident is much more likely. It discusses the concept of coupling between seemingly unrelated parts of the system and shows how economic incentives can push a system toward this boundary.

Dr. Richard Cook and Jens Rasmussen (Original paper)

Thai Wood — Resilience Roundup (summary)

This is an analysis of a recent BGP leak with a discussion about how the impact from such events can be mitigated through emerging best practices.

Alessandro Improta and Luca Sani — Catchpoint

How do you hand over ownership of a system, transferring enough knowledge that the new owners can maintain its availability and reliability successfully?

Aleksandra Gavrilovska — SoundCloud

Shopify works toward Black Friday / Cyber Monday all year long, through a combination of load testing, failure mode analysis, game days, and incident analysis.

Ryan McIlmoyl — Shopify


Updated: April 25, 2021 — 9:10 pm
SRE WEEKLY © 2015 Frontier Theme