There are quite a few pitfalls waiting for you if you try to implement SLOs for your mobile app. This article explains them and offers strategies for avoiding them.
Virna Sekuj — The New Stack
Blamelessness in incident retrospectives can be a difficult concept to truly internalize. This article describes 3 common “failure modes”, that is, ways in which organizations struggle with blamelessness.
Tom Elliott — The Friday Deploy
Cloudflare spends a lot of time thinking about cooling, and it’s fascinating. I didn’t realize that spinning a fan faster consumed so much more energy!
Leslye Paniagua — Cloudflare
This article explores the pitfalls of creating too many microservices, along with their causes, implications, and potential mitigation strategies.
Sumit Kumar — DZone
Netflix stores a truly obscene number of events, each of which has a timestamp and a set of key-value pairs. This article goes into a ton of detail on how they built their system.
Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, and Joey Lynch — Netflix
A fun debugging story for a confusing crash bug, in which they found 6 other related bugs along the way.
Brett Wines — Slack
My favorite one is about the principle “You Ain’t Gonna Need It”:
The flip side of YAGNI, however, is that at some point you might actually need it.
Luc van Donkersgoed
When you create an index on multiple columns in Postgres, you’ll need to be sure that the order of the fields in the index allows it to be applied to your queries, as these folks learned.
Jean-Mark Wright
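To make the column-order point concrete, here is a minimal sketch. It uses SQLite through Python's built-in sqlite3 module as a stand-in so it runs without a database server. It is not Postgres, and the Postgres planner makes its own choices, so treat this only as an illustration of the general leading-column idea. The table, column, and index names are hypothetical and do not come from the linked article.

```python
import sqlite3

# Hypothetical schema for illustration only; SQLite stands in for Postgres here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, tenant_id INTEGER, created_at INTEGER, payload TEXT)"
)
# Composite index: tenant_id is the leading column, created_at comes second.
conn.execute("CREATE INDEX events_tenant_created ON events (tenant_id, created_at)")


def plan_for(where_clause):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM events WHERE {where_clause}"
    ).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail


# Filtering on the leading column lets the planner seek into the index.
print(plan_for("tenant_id = 1"))
# Filtering only on the second column falls back to scanning the table.
print(plan_for("created_at = 1"))
```

In this sketch the first plan is reported as a SEARCH using the index and the second as a plain SCAN. Checking the plan for your own queries (EXPLAIN in Postgres) is the reliable way to confirm whether an index's column order actually fits them.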