SRE Weekly Issue #363

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒.

Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?


A super in-depth look at on-call compensation strategies. Includes a sampling of companies and how much they pay (if anything).

  Gergely Orosz — The Pragmatic Engineer

Husky uses a nifty sharding strategy where a customer’s shard allocation changes over time automatically based on load.

  Daniel Intskirveli — Datadog

This analogy goes far enough to even include rules. Anyone up for a round?

  Robert Ross

[…] in order to be truly great at being an SRE you will constantly need to understand how to work with people in the organization, how to set expectations and how to move the needle on people’s understanding of reliability.

  Ross Brodbeck

MongoDB -> Cassandra -> ScyllaDB. Storing a ton of stuff is hard.

  Bo Ingram — Discord

When designing complex technical systems, you should ask yourself, “how does the human operator fit into the picture?”

  Cursed Quail

It sounds like it was a great conference!

  Paige Cruz — Chronosphere

[…] complex systems don’t yield to analysis. We have to add another skill: sense-making.

  Jessica Kerr — Honeycomb
  Full disclosure: Honeycomb is my employer.

Updated: March 12, 2023 — 8:46 pm
A production of Tinker Tinker Tinker, LLC