Articles
Heresy! This article provides a counterpoint to many of the benefits of IaC. While IaC may still be the right answer, it’s not a slam dunk.
Luke Shaughnessy
Short but sweet, this article outlines three focus areas that the author argues should be a part of any SRE role.
Kyle Robertson
Way beyond just an intro to Aperture, this article also covers microservice architecture failure modes, techniques used to avoid failures, and the weaknesses in those techniques.
Cong Ma and Matt Ranney — DoorDash
I’m including this here not just for the staff+ SREs out there. Many of these skills are important for SREs to develop much earlier than the Staff level, since our role can be so collaborative.
Ryn Daniels — GitHub
I love that fully half of this article is about mentoring developing SREs in identifying and managing risk.
Ross Brodbeck
Learn how the Honeycomb SRE team has structured its work, including a full copy of the team charter.
Fred Hebert — Honeycomb
Full disclosure: Honeycomb is my employer and I am a member of the SRE team described in this article.
An intriguing approach: define technical debt as a risk, and manage it in much the same way that we handle reliability-related risks, with a “threat budget”.
Jason Bloomberg — Intellyx
Instead, because our time and attention is limited, we have to get good at identifying cues to indicate that our models have gotten stale or are incorrect.
Lorin Hochstein
Using a simulation, this article comes to the conclusion that a hybrid between FIFO and LIFO is better than picking just one.
Eugene Retunsky — DZone
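To make the FIFO/LIFO idea concrete, here is a minimal sketch of one possible hybrid. It is an assumption for illustration, not necessarily the policy simulated in the article: the queue serves the oldest item while it is shallow and switches to serving the newest item once its depth exceeds a threshold. The class name `HybridQueue` and the `lifo_threshold` parameter are hypothetical.

```python
from collections import deque


class HybridQueue:
    """Illustrative hybrid: FIFO while shallow, LIFO once overloaded.

    Hypothetical policy for illustration only; the linked article may
    define its hybrid differently.
    """

    def __init__(self, lifo_threshold):
        # Depth above which we stop serving the oldest item first.
        self.lifo_threshold = lifo_threshold
        self._items = deque()

    def push(self, item):
        # New work always joins at the tail.
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty queue")
        if len(self._items) > self.lifo_threshold:
            # Overloaded: serve the newest item, which is the least
            # likely to have already timed out on the caller's side.
            return self._items.pop()
        # Normal load: serve the oldest item for fairness.
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    q = HybridQueue(lifo_threshold=2)
    for request_id in (1, 2, 3, 4):
        q.push(request_id)
    print([q.pop() for _ in range(4)])
```

The design choice worth noticing is that the mode is decided at dequeue time from the current depth, so the queue falls back to FIFO on its own as soon as the backlog drains.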