Articles
The Data Reliability Engineering team is here to monitor, automate, and manage pipelines so that our partner USDE teams have the peace of mind to tackle projects that help Mercari move forward.
LameyerDaniel and OhshimaTakako - Mercari
Hiring in the Site Reliability Engineering (SRE) space is notoriously difficult. So it makes sense to figure out how to expand the hiring pool beyond existing SREs.
Ash Patel - SREpath
SREs end up writing a lot of YAML. I mean, a lot. Fortunately it’s a really simple language with no hidden gotchas, right? Right?!
Ruud van Asseldonk
Two Terraform changes that were developed and tested individually went out to production simultaneously, with unexpected results.
Jan David Nose - Rust
Code search is a different beast from normal English-language searching. Regexes, punctuation, no word stemming, and GitHub’s scale made this a challenging design.
Timothy Clem - GitHub
This article argues that folks outside of engineering are doing incident response, whether they call it that or not.
incident.io
In incidents, we’re concentrating on resolving impact as quickly as possible, and this can impair our ability to gather the information we need after the fact in order to actually figure out what happened.
Jake Cohen - PagerDuty