This article relates to Donella H. Meadows’s book, Thinking in Systems.
What follows is Meadows’ list of leverage points outfitted with those my ideas of where or how they can be applied to software development and web operations.
I know its past an hour but… we got ~600 Nagios emails a day. Boss forbade us from muting any of them. In weekly status meeting, he’d often quiz on-call on a random alert. If oncall didnt know about it, boss would often scream at us…
Jason Antman (@j_antman)
Find out how the Couchbase folks use Jepsen to test their database offering.
A supportive on-call environment is critical to ensuring reliability and resiliency.
Deirdre Mahon — Honeycomb
This is a follow-on to an article I linked to awhile back.
It’s really simpler to call it Tech Risk.
I love the idea of tracking the decisions an organization makes and the risks they entail.
- Google App Engine
- Yahoo Mail
- AOL Mail
- Tesla App
- Some Tesla owners were locked out of their cars when the app stopped working.
- Amazon’s Elastic Block Store (EBS) in us-east-1
- Amazon experienced an outage that resulted in the total loss of a small percentage of EBS volumes.
- Heroku Incident Followups
- Heroku Incident #1896
- Also this one: #1897.