
How to Show Your Value In DevOps/SRE

Best Practices

Since you are reading this post, I am sure that you can relate to the classic plight of the IT, sysadmin, or Operations team: they are invisible until things go wrong.

For practitioners of DevOps and Site Reliability Engineering, that can also be true, especially for teams where the low-hanging fruit has already been addressed.

When the big outage happens, it’s all too common for management to react with knee-jerk questions like...


Kanban Quickstart

Best Practices

This article introduces Kanban (看板), a very effective process for organizing your team’s work and driving improvements, especially if you are on an interrupt-driven team such as Site Reliability Engineering, Operations, IT, or Customer Support.

The essential part of the process is the kanban board, which consists of cards, each representing a work item. Cards move between columns, usually from left to right, with each column representing the state that the work item is in, such as:

  • Backlog
  • ...

Cross-Functional Collaboration

Best Practices

The most valuable and impactful work is done through others, not through the strivings of just one person. In the tech industry, creating customer value is a complicated process that involves the efforts of different people, teams, and perspectives.

Consider a SaaS company: in order for it to be successful, groups like Engineering, Customer Success, Sales, Marketing, and Finance all need to exist and work in tandem to create a product that...


Running Successful Engagements

Best Practices

Previously, we discussed several types of engagement models that SRE can use when collaborating with software engineering teams, as well as their tradeoffs. Let’s go over some ways in which SRE managers or team leads can successfully start and run an engagement!

To refresh, an SRE engagement can take the form of: taking on operational ownership of a service from an engineering team, embedding SREs on an engineering team, or providing a set of...


SRE Engagement Models

Best Practices

Last time, we went over the basics of what it means to run an SRE team based on the original ideas that came from Google. Let’s talk about the ‘engagement model’, which describes the way that an individual SRE or team works with software engineering organizations to help them achieve their goals.

The SRE Workbook describes the various types of activities at length. In my experience, individual SRE engagements tend to fall into...


SRE Essentials

Best Practices

Interested in launching a Site Reliability Engineering (SRE) team? They have been gaining in popularity at tech companies for the past decade, and for good reason! They drive higher levels of operational maturity, remove sources of toil and incidents that slow the pace of feature delivery, and help make services more reliable (hence the name).

However, commissioning a team of engineers with the job title doesn’t guarantee you’ll reap the rewards! How do you...