management

How to Show Your Value In DevOps/SRE

Best Practices

14

Dec

07:00

Since you are reading this post, I am sure that you can relate to the classic plight of the IT, sysadmin, or Operations team: They are invisible until things go wrong.

For practitioners of DevOps and Site Reliability Engineering, that can also be true, especially for teams where the low-hanging fruit has already been addressed.

When the big outage happens, it’s all too common for management to have the kneejerk reaction to ask questions like...

Kanban Quickstart

Best Practices

28

Jul

15:00

This article introduces Kanban(看板), a very effective process for organizing your team’s work and driving improvements, especially if you are on an interrupt-driven team such as Site Reliability Engineering, Operations, IT, or Customer Support.

The essential part of the process is the kanban board, which consists of cards representing each work item. Cards are moved between columns representing the state that the work item is in usually from left to right, such as:

Backlog

Cross-Functional Collaboration

Best Practices

09

Jun

16:00

The most valuable and impactful work is done through others and not through the strivings of just one person. In the tech industry, creating customer value is a really complicated process and involves the efforts of different people, teams, and perspectives.

Consider a SaaS company: in order for it to be successful, groups like Engineering, Customer Success, Sales, Marketing, and Finance all need to exist and work together in tandem to create a product that...

Running Successful Engagements

Best Practices

27

Mar

20:00

Previously we discussed several types of engagement models that SRE can use when collaborating with software engineering teams, as well as their tradeoffs. Let’s go over some ways in which SRE managers or team leads can successfully start and run an engagement!

To refresh, an SRE engagement can take the form of: taking on operational ownership of a service from an engineering team, embedding SREs on an engineering team, or providing a set of...

SRE Engagement Models

Best Practices

20

Mar

09:00

Last time we went over the basics of what it means to run an SRE team based on the original ideas that came from Google. Let’s talk about the ‘engagement model’, which describes the way that an individual SRE or team works with software engineering organizations to help them achieve their goals.

The SRE Workbook describes the various types of activities at length— in my experience individual SRE engagements tend to fall into...

SRE Essentials

Best Practices

06

Mar

11:00

Interested in launching a Site Reliability Engineering(SRE) team? They have been gaining in popularity at tech companies for the past decade— and for good reason! They drive higher levels of operational maturity, remove sources of toil and incidents that slow the pace of feature delivery, and help make services more reliable(hence the name).

However, just because you commission a team of engineers with the job title, doesn’t guarantee you’ll reap the rewards! How do you...