Happy holidays to those who celebrate! I put this issue together in advance, so there's no Outages section this week.
This is another great deep dive into strategies for zero-downtime deploys.
Suresh Mathew — eBay
How do you make sure your incident management process survives the growth of your team? This article has a useful list of things to cover as you train new team members.
David Caudill — Rootly
This article is published by my sponsor, Rootly, but their sponsorship did not influence its inclusion in this issue.
The trends in this article are:
- AIOps and self-healing platforms
- Service Meshes
- Low-code DevOps
Biju Chacko — Squadcast
I can’t get enough of these. Please write one about your company!
My favorite part is the discussion of Kyle Kingsbury’s work on Jepsen. Would distributed systems have even more problems if Kingsbury had not shed light on them?
PagerDuty analyzed usage data for their platform in order to draw inferences about how the pandemic has affected incident response.
There’s a ton of interesting stuff in here about confirmation bias and fear in adopting a new, objectively less risky process.
Robert Poston, MD