The author of this one draws a line between their two interests of formal methods and resilience engineering, and I’m so here for it.
Lorin Hochstein
In this part of the Scaling Nextdoor’s Datastores blog series, we’ll explore how the Core-Services team at Nextdoor serializes database data for caching while ensuring forward and backward compatibility between the cache and application code.
Ronak Shah — Nextdoor
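For readers who want a concrete picture of forward and backward compatibility in a cache, here is a minimal sketch. It is a generic illustration and not Nextdoor's approach, which the article itself describes. The helper names `encode_user` and `decode_user`, the field names, and the choice of JSON are all assumptions made for this example.

```python
import json

# Hypothetical field defaults; names are illustrative, not from the article.
DEFAULTS = {"id": 0, "name": "", "locale": "en"}

def encode_user(user: dict) -> str:
    """Serialize a user record for the cache."""
    return json.dumps(user, sort_keys=True)

def decode_user(payload: str) -> dict:
    """Read a cached record written by older or newer application code."""
    raw = json.loads(payload)
    # Missing fields (older writer) get defaults;
    # unknown fields (newer writer) are dropped.
    return {key: raw.get(key, default) for key, default in DEFAULTS.items()}
```

In this sketch, a reader can decode an entry written before `locale` existed, and can also decode an entry from a newer writer that has added fields it does not know about.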
MySQL’s ALTER TABLE INPLACE has limitations and downsides, and INSTANT does too, as explained in this article.
Shlomi Noach — PlanetScale
If you have multiple different types of work in your system, a queue per type of work may be a good choice.
Bonus(?): includes a bathroom-based analogy.
Marc Brooker
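To make the idea concrete, here is a toy comparison of one shared queue against one queue per type of work. It is only a sketch under assumptions that are not from the article: a single worker, made-up job names and costs, and simple turn-taking between the per-type queues.

```python
from collections import deque

def drain_single_queue(jobs):
    """Serve all jobs from one FIFO queue; return each job's finish time."""
    clock, finished = 0, {}
    for name, _kind, cost in jobs:
        clock += cost
        finished[name] = clock
    return finished

def drain_queue_per_type(jobs):
    """Serve jobs from one FIFO queue per type, taking turns between queues."""
    queues = {}
    for job in jobs:
        queues.setdefault(job[1], deque()).append(job)
    clock, finished = 0, {}
    while any(queues.values()):
        for queue in queues.values():
            if queue:
                name, _kind, cost = queue.popleft()
                clock += cost
                finished[name] = clock
    return finished

# Hypothetical workload: (name, type, cost in arbitrary time units).
jobs = [("report-1", "slow", 10), ("report-2", "slow", 10), ("ping-1", "fast", 1)]
```

With this workload, the fast job finishes at time 21 in the shared queue and at time 11 with a queue per type, because it no longer waits behind every slow job that arrived first.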
One Lambda function per URL path? Or a monolithic function that handles multiple paths? There are benefits and drawbacks to each.
Yan Cui
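For reference, the monolithic option looks roughly like the sketch below: one handler that dispatches on method and path. It assumes an API Gateway REST proxy event carrying `httpMethod` and `path`; the routes and the helper functions `list_orders` and `get_health` are hypothetical.

```python
import json

def list_orders(event):
    return {"statusCode": 200, "body": json.dumps({"orders": []})}

def get_health(event):
    return {"statusCode": 200, "body": json.dumps({"status": "ok"})}

# Hypothetical routes; a function-per-path design would instead deploy
# each of these handlers as its own Lambda function.
ROUTES = {
    ("GET", "/orders"): list_orders,
    ("GET", "/health"): get_health,
}

def handler(event, context):
    """Single Lambda entry point that dispatches on HTTP method and path."""
    route = ROUTES.get((event.get("httpMethod"), event.get("path")))
    if route is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return route(event)
```

The routing table lives in application code here, whereas the function-per-path design moves that mapping into the API Gateway configuration.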
Published on April 1.
The truth is, many incidents move faster when there’s executive oversight — a sense of urgency, pressure, and someone repeatedly asking, “What’s the ETA?”
Chris Evans — incident.io
This article is published by my sponsor, incident.io, but their sponsorship did not influence its inclusion in this issue.
I’m seeing a lot of echoes of Bainbridge’s Ironies of Automation in this article about AIOps and AI tooling. If AI handles most coding and incidents, how will humans handle the outliers?
Hamed Silatani — Uptime Labs
I wasn’t able to make it, so I really appreciate this recap. Sounds like SRECon was, unsurprisingly, heavily focused on AI this time around.
Niall Murphy