SRE Weekly Issue #121

SPONSOR MESSAGE

Determining the right tools for your SRE team(s) can get confusing. So, VictorOps, InfluxData, and Grafana are putting on a webinar—May 16th, 1 pm ET—to help you build your SRE toolchain: http://try.victorops.com/SREWeekly/Webinar

Articles

This latest in the CRE Life Lessons series takes on dependencies and how they impact a service’s SLO in obvious and subtle ways.

Robert van Gent — Google

This company discovered that the benefits of microservices came with some significant downsides. Here’s how they turned to chaos testing to improve reliability.

Meredith Courtemanche — TechTarget

Keeping in mind that this is written by the CTO of Gremlin, it contains some good points about buying versus building your chaos engineering system. It would apply to other chaos engineering services too — if there were any.

Matt Fornaciari — Gremlin, Inc.

Even as an experienced Terraform user, I learned about some Terraform features I hadn’t been aware of.

Nic Jackson — HashiCorp

In issue #98, I linked to a recording of John Allspaw’s DOES17 talk. In case you didn’t have time to listen, here’s a transcript. If you didn’t have time to read the Stella Report, I highly recommend reading this as an intro to the major concepts therein.

John Allspaw

Outages

Updated: May 13, 2018 — 9:29 pm
A production of Tinker Tinker Tinker, LLC