Our Incident Management series discussed so far the importance of monitoring and a solid escalation policy in the swift detection of production issues. Both of them depend on a third capability that we will go over today: alerting.

Alerting notifies the engineering team to appropriately and timely respond to problems in production based on their severity.

They tend to fall into three categories:

  • Page: meets the definition of an emergency and requires...