From 91c64532e1058f5deec17e3a35e7cf9e5ba7e939 Mon Sep 17 00:00:00 2001
From: Stephen Gutekanst
Date: Mon, 17 Aug 2020 12:14:50 -0700
Subject: [PATCH] oncall: call out the ops incidents log more prominently

I have linked several people to the ops incidents log only to find "Wow, I had
no idea that existed!" on several occasions. This pulls it front-and-center so
that hopefully more people will discover it.
---
 handbook/engineering/on_call/index.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/handbook/engineering/on_call/index.md b/handbook/engineering/on_call/index.md
index 4f75e1dbe10..7144dabe35f 100644
--- a/handbook/engineering/on_call/index.md
+++ b/handbook/engineering/on_call/index.md
@@ -16,6 +16,12 @@ We have an ops on-call rotation managed through [OpsGenie](https://opsgenie.com)
 1. File issues for any followup work that needs to happen.
 1. If alerts are too noisy and/or inactionable, take actions to fix or disable alerts.
 
+## Ops incidents log
+
+All significant incidents that occur on Sourcegraph.com are recorded in the [ops incidents log](https://docs.google.com/document/d/1dtrOHs5STJYKvyjigL1kMm6u-W0mlyRSyVxPfKIOfEw/edit?usp=sharing). This helps us keep track of what has happened historically, discuss follow-up work, and gives insight into what types of incidents we see.
+
+False incidents (flaky alerts, etc.) should be tracked directly in GitHub issues and do not need a log entry.
+
 ## Slack channels
 
 You'll want to be in #dev-ops, #buildkite, and #opsgenie on [Slack](../../communication/team_chat.md) in particular. Most of the work you do as the on-call engineer should be discussed in #dev-ops.