In this article, I will introduce several improvements being made by the Microservices SRE Team, embedded with other teams.
MizumotoShota — Mercari
What really stood out to me in this article is the Service Info section. A dashboard will quickly atrophy and lose its meaning without an explanation of what it’s for.
When things go wrong, who is in charge? And what does it feel like to do that role?
This is a summary of a forum discussion about incident command, in case you don’t have time to listen to the whole thing.
Emily Arnott — Blameless
Complex systems are weird, and a traditional deterministic view, such as the one in older ITIL iterations, doesn't capture how they actually behave. We need to evolve our practices.
How can you design and interpret metrics for systems optimized for latency or throughput?
You can optimize for latency or throughput in a given system, but not both, since the two are directly at odds.
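One way to see the tension is a toy batching model. This is a rough sketch and not taken from the linked article: the function name `batch_stats`, the 10 ms fixed overhead, and the 1 ms per-item cost are all assumptions chosen for illustration. Larger batches spread the fixed overhead over more items, which raises throughput, but every item then waits for the whole batch to finish, which raises latency.

```python
def batch_stats(batch_size, fixed_overhead_ms=10.0, per_item_ms=1.0):
    """Return (latency_ms, throughput_items_per_s) for one batch.

    Hypothetical model: each batch pays a fixed overhead once, plus a
    per-item cost, and every item completes only when the batch completes.
    """
    latency_ms = fixed_overhead_ms + batch_size * per_item_ms
    throughput = batch_size / latency_ms * 1000.0
    return latency_ms, throughput


for size in (1, 10, 100):
    latency, throughput = batch_stats(size)
    print(f"batch={size:>3}  latency={latency:6.1f} ms  throughput={throughput:7.1f} items/s")
```

Under these assumed numbers, going from a batch of 1 to a batch of 100 raises throughput about tenfold and raises per-item latency by the same factor. Real systems add queueing and contention effects that this model ignores.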