
Webinar: Lean SRE


When we think about Site Reliability Engineering (SRE), we tend to associate it with large tech companies that have the budget to build entire departments dedicated to improving production. Because of that misconception, I think many smaller organizations and startups sadly avoid adopting these practices.

I argue that SRE can be implemented by much smaller companies and yield significant benefits in reduced operational costs and time savings, freeing them to build a more compelling product.

Nothing is stopping a startup from defining SLOs, performing blameless post-mortems, or automating away sources of toil. However, how a startup establishes those practices will differ from how a larger company does, and won’t line up completely with what’s in Google’s SRE book.
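To give a sense of how small a first SLO can be, here is a minimal sketch in Python. It assumes you can already count total and failed requests over some window; the `Slo` class, the `error_budget_remaining` function, and the `checkout-availability` name are hypothetical, made up for illustration rather than taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A hypothetical, minimal SLO: a name plus a success-rate target."""
    name: str
    target: float  # fraction of requests that must succeed, e.g. 0.99


def error_budget_remaining(slo: Slo, total: int, failed: int) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means no budget has been used; a negative value means the
    budget for this window is overspent.
    """
    if total == 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_failures = (1.0 - slo.target) * total
    if allowed_failures == 0:
        # A 100% target leaves no budget at all.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


if __name__ == "__main__":
    slo = Slo(name="checkout-availability", target=0.99)
    remaining = error_budget_remaining(slo, total=10_000, failed=25)
    print(f"{slo.name}: {remaining:.0%} of error budget remaining")
```

With a 99% target and 10,000 requests, 100 failures are allowed, so 25 failures leaves roughly three quarters of the budget. A team of any size could review a number like this weekly and decide whether to keep shipping features or spend time on reliability.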

Similarly, a larger company can bootstrap an SRE program by starting with just one software engineering team.

I call this approach ‘Lean SRE’, and will be running a webinar on April 11th at 2PM ET to explore this subject in greater detail.

To provide some value up front and foreshadow what’s to come, consider the Plan-Do-Check-Act cycle from Deming.

PDCA is by now a well-established idea in the DevOps and Agile communities for bringing continuous improvement (kaizen) to a business process. The question ‘Lean SRE’ attempts to answer is:

“What are the minimum SRE practices a team needs to continually improve the state of production, regardless of size or budget?”

EDIT: you can access the recording of the webinar here.