SRE Weekly Issue #149

A message from our sponsor, VictorOps:

Runbook automation leads to nearly instant on-call incident response. SRE teams can leverage runbook automation to deepen cross-team collaboration, surface context to on-call responders, and shorten the incident lifecycle, ultimately improving overall service reliability:

http://try.victorops.com/sreweekly/runbook-automation-for-sre

Articles

But does that mean you don’t need to think about reliability issues associated with large-scale distributed systems? The answer is, not completely. While there are many things that GCP and Cloud Functions handle behind the scenes, you still need to keep a couple of best practices in mind while building a reliable serverless solution.

Slawomir Walkowski — Google

The Emotet malware gang is probably managing their server infrastructure better than most companies are running their internal or external IT systems.

Catalin Cimpanu — Zero Day

Designing a distributed data store is about juggling competing priorities. This author discusses the latency penalty you pay for synchronous replication, and why you might want it anyway.

Daniel Abadi

Learn how Etsy designed tooling and a repeatable process to forecast resource usage.

Daniel Schauenberg — Etsy

Check out how Grab implemented chaos engineering.

Roman Atachiants, Tharaka Wijebandara, Abeesh Thomas — Grab

Neat idea: use machine learning to select which automated tests to run for a given code change. The goal is a high likelihood of finding bugs while running fewer tests than traditional methods.

Mateusz Machalica, Alex Samylkin, Meredith Porth, and Satish Chandra — Facebook

In this blog post, we are going to discuss how the Auth0 Site Reliability team, led by Hernán Meydac Jean, used a progressive approach to build a mature service architecture characterized by high availability and reliability.

The system in question is a home-grown feature flags implementation.

Dan Arias — Auth0

Outages

The usual glut of Black Friday outages. I hope you all had an uneventful Friday.

Updated: November 25, 2018 — 8:37 pm
A production of Tinker Tinker Tinker, LLC