This well-researched article caught me by surprise. It’s shocking that AWS advised Ably to stay under 400,000 simultaneous connections, even though Amazon’s own documentation states support for “millions of connections per second”.
Paddy Byers — Ably
This post describes how a group of hard-working individuals, each with unique skills and working methods, built a successful SRE team.
It goes into detail about what their SREs do and how they communicate, using three projects as case studies.
Sergio Galvan — Algolia
Wait, there are 9 now?
Marc Hornbeek — Container Journal
There’s a nice little discussion of why “human error” is not a good enough explanation for a deviation (from standard operating procedure).
Susan J. Schniepp and Steven J. Lynn — Pharmaceutical Technology
They deployed an optimization that skipped sending some requests to the backend… and the backend metrics got worse. Why? Hint: aggregate metrics.
Dominik Sandjaja — Trivago
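To make the hint concrete, here is a minimal sketch of how an aggregate can get worse while the system gets better. The latencies and the 80/20 split are invented for illustration, and the assumption that the skipped requests were the cheap ones is mine, not something taken from the article.

```python
from statistics import mean

# Hypothetical latencies in milliseconds (made up for illustration,
# not data from the Trivago post).
FAST_MS = 5      # cheap requests that the optimization can skip
SLOW_MS = 200    # expensive requests that still reach the backend


def backend_stats(latencies):
    """Return (mean latency, total time spent) for requests the backend served."""
    return mean(latencies), sum(latencies)


# Before: the backend serves every request, cheap and expensive alike.
before = [FAST_MS] * 80 + [SLOW_MS] * 20
# After: the cheap requests never reach the backend at all.
after = [SLOW_MS] * 20

mean_before, total_before = backend_stats(before)
mean_after, total_after = backend_stats(after)

print(f"mean latency: {mean_before} ms -> {mean_after} ms")
print(f"total backend time: {total_before} ms -> {total_after} ms")
```

In this toy setup the backend's mean latency rises because only the expensive requests are left in the sample, while the total backend work drops. Each individual request is no slower than before; only the mix being averaged has changed.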