This article explains why practices like DevOps and Site Reliability Engineering are essential for a successful technology business. Sure, they are touted as a way to change company culture and improve collaboration between teams, but what specific business value should you expect from investing in these capabilities?
Let’s start by remembering the goal of every business:
to make money by increasing throughput while simultaneously reducing inventory and operational expense.
In software development, let’s clarify the following terms:
- Throughput: number of customer subscriptions sold (in the case of a SaaS product). This can also be articulated as the rate of new features made available to the customer, which enables more sales (and renewals).
- Inventory: code not yet released to production and therefore not creating business value.
- Operational Expense: cost of infrastructure and cost of engineering time spent performing operational tasks.
Therefore, the goal of every software engineering business is:
to make money by increasing the rate of subscriptions sold while shipping new code as quickly as possible and minimizing the cost of infrastructure and operations.
How do DevOps and SRE help meet this goal?
Improve Code Delivery
In order to sell, a product needs a set of features compelling enough that a potential customer is willing to purchase a subscription. The engineering team must build and ship those features, iterate based on feedback from their customers, and do so faster than their competitors.
This requires the ability to deploy new code often. The higher the deployment frequency, the more opportunities the engineering team has to meet the market’s needs, which will enable more sales.
Similarly, in order to swiftly respond to customer feedback, short lead times between code being committed to the repository and being deployed to production are also essential. After all, customers want bugs to be fixed in a matter of hours, not days (or even weeks!).
Both of these capabilities are enabled by a mature CI/CD pipeline consisting of automation and robust test suites that allow for a rapid and uninterrupted flow of new features. CI/CD is a central technology of the DevOps toolchain as it allows software engineers to quickly experiment and iterate on what will delight the customer the most.
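The two delivery metrics described above, deployment frequency and commit-to-production lead time, could be tracked with something like the following Python sketch. The Deployment record and its field names are hypothetical; in practice the timestamps would come from your own version control and deployment tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    """Hypothetical record of one change reaching production."""
    committed_at: datetime  # when the change was committed to the repository
    deployed_at: datetime   # when the change was deployed to production


def deployment_frequency(deployments: list[Deployment], window_days: int) -> float:
    """Average number of production deployments per day over the window."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return len(deployments) / window_days


def median_lead_time(deployments: list[Deployment]) -> timedelta:
    """Median time between commit and production deployment."""
    if not deployments:
        raise ValueError("need at least one deployment")
    return median(d.deployed_at - d.committed_at for d in deployments)


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    week = [
        Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),
        Deployment(datetime(2024, 1, 2, 10), datetime(2024, 1, 2, 16)),
        Deployment(datetime(2024, 1, 3, 8), datetime(2024, 1, 4, 8)),
    ]
    print("deployments per day:", deployment_frequency(week, window_days=7))
    print("median lead time:", median_lead_time(week))
```

The median is used rather than the mean so that one unusually slow release does not hide an otherwise fast pipeline; that is a design choice, not a requirement.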
Reduce Infrastructure Costs
The more it costs to run infrastructure, the lower the product’s profit margin. At the same time, you can’t just skimp on hardware without risking reliability.
What can be done? Solutions can be found in performance engineering techniques commonly used in DevOps and SRE.
The first technique is capacity planning, which is the practice of using past metrics to forecast future hardware requirements. This ensures that you don’t overspend on hardware while still having enough for peak load. This is a useful skill even in cloud environments, as techniques like horizontal scaling are not perfect solutions and aren’t easily applicable to all resource types (e.g., databases).
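As a rough illustration of forecasting from past metrics, the sketch below fits a straight line to historical peak load and converts the projected load into a server count. The linear trend, the function names, and the headroom parameter are all assumptions made for the example; real workloads often show seasonality or step changes that a straight line will miss.

```python
import math


def fit_linear_trend(values: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for values observed at t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    variance = sum((t - mean_t) ** 2 for t in range(n))
    slope = covariance / variance
    return slope, mean_v - slope * mean_t


def servers_needed(
    peak_load_history: list[float],
    periods_ahead: int,
    capacity_per_server: float,
    headroom: float = 0.25,
) -> int:
    """Servers required for the forecast peak load plus a safety margin.

    headroom is a hypothetical safety factor (0.25 means 25% spare capacity).
    """
    slope, intercept = fit_linear_trend(peak_load_history)
    future_t = len(peak_load_history) - 1 + periods_ahead
    forecast = intercept + slope * future_t
    return math.ceil(forecast * (1 + headroom) / capacity_per_server)


if __name__ == "__main__":
    # Illustrative monthly peak requests per second, not real data.
    history = [100, 120, 140, 160]
    print(servers_needed(history, periods_ahead=3, capacity_per_server=50))
```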
Also, skills like load testing, workload characterization, and profiling allow for the identification of resource bottlenecks and their causes. By systematically identifying constrained resources and inefficient code, changes can be made over time to make the application perform more efficiently, requiring less hardware.
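For a sense of what profiling looks like in practice, here is a minimal sketch using Python's standard-library cProfile and pstats modules. The handle_request and slow_sum_of_squares functions are hypothetical stand-ins for application code; the point is only that the report ranks functions by time consumed, which is where optimization effort should start.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Deliberately naive loop standing in for an inefficient code path."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request(n: int) -> int:
    """Hypothetical request handler that calls the slow code path."""
    return slow_sum_of_squares(n)


def profile_call(func, *args) -> str:
    """Run func under the profiler and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(handle_request, 200_000))
```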
Reduce Operational Costs
A cornerstone of Site Reliability Engineering is the reduction of toil. Google’s Site Reliability Engineering book defines toil as “work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows”. Examples of toil include:
- Incident response activities;
- Code releases;
- Service turnup and turndown;
- Capacity planning.
The more toil that an engineering team has, the less time they have to develop new features.
Therefore, automating away these tasks over time rather than continuing to perform them manually is key to keeping your engineering team productive and efficient, especially as the number of customers increases.
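The arithmetic behind that argument can be sketched as follows. Because manual toil grows linearly with the customer base while automation is mostly a one-time cost, the payback period shrinks as the business grows. Every figure and parameter name here is a hypothetical input chosen for illustration, not a measured value.

```python
import math


def manual_toil_hours(customers: int, minutes_per_customer_per_week: float) -> float:
    """Weekly hours of manual work; grows linearly with the number of customers."""
    return customers * minutes_per_customer_per_week / 60


def weeks_to_break_even(
    customers: int,
    minutes_per_customer_per_week: float,
    automation_build_hours: float,
    automation_upkeep_hours_per_week: float = 0.0,
) -> float:
    """Weeks until the time spent building automation is repaid by toil saved."""
    saved_per_week = (
        manual_toil_hours(customers, minutes_per_customer_per_week)
        - automation_upkeep_hours_per_week
    )
    if saved_per_week <= 0:
        return math.inf  # automation never pays for itself at this scale
    return automation_build_hours / saved_per_week


if __name__ == "__main__":
    # Same task and same automation effort at two hypothetical customer counts.
    for customers in (200, 400):
        weeks = weeks_to_break_even(
            customers,
            minutes_per_customer_per_week=3,
            automation_build_hours=80,
            automation_upkeep_hours_per_week=2,
        )
        print(customers, "customers:", round(weeks, 1), "weeks to break even")
```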
DevOps and SRE enable tech companies to succeed by increasing the flow of features to production, reducing unnecessary hardware spend, and allowing engineers to focus on writing code, all of which directly improve the bottom line.
I’ve used these methodologies for years to help transform engineering teams and entire R&D departments. Interested in running a more profitable tech business? Let’s schedule an introduction!
(Image Credit: Image Hunter)