This article continues the discussion of how your team can learn from failure after a production incident. Write-ups are important for capturing and documenting what took place, but the real value comes from an open, deliberate conversation in which the team identifies the lessons learned and the improvements needed to make the system more reliable. That conversation is the post-mortem.
Post-mortems are the primary mechanism for teams to learn from failure....