Detailed metrics and explanations for KPIs used in our Targeted Technical Solutions.
To measure our success, we'll use KPIs that capture both the speed and efficiency of our work, along with its impact on the business. This is not an exhaustive list, but rather a guide to the types of KPIs we might utilize, recognizing that the most relevant metrics will be determined on a project-by-project basis.
The DORA metrics focus on the speed and stability of software delivery. They are strong indicators of how efficient your development and operations processes are.
Deployment Frequency (DF) measures how often software is deployed to production. A higher deployment frequency indicates a streamlined, automated delivery pipeline.
Higher DF directly contributes to faster time-to-value (a core KPI for this solution). It also suggests a mature CI/CD pipeline and the ability to iterate quickly and respond to customer needs, both of which are crucial for "rapid wins."
Well-defined stream-aligned teams with clear responsibilities and streamlined enabling team support can optimize deployment processes, increasing DF.
Elite performers deploy multiple times per day.
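As a rough illustration of how DF might be computed, the sketch below divides the number of production deployments by the length of the observation window. The function name `deployment_frequency` and the sample dates are hypothetical; in practice the dates would come from whatever deployment log your CI/CD tooling keeps.

```python
from datetime import date

def deployment_frequency(deploy_dates, period_days):
    """Average number of production deployments per day over the window."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deploy_dates) / period_days

# Hypothetical sample: 6 deployments recorded over a 3-day window.
deploys = (
    [date(2024, 1, 1)] * 3
    + [date(2024, 1, 2)] * 2
    + [date(2024, 1, 3)]
)
print(deployment_frequency(deploys, period_days=3))  # 2.0
```

A value above 1.0 deployments per day corresponds to the "multiple times per day" benchmark.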
Lead Time for Changes (LT) measures the time it takes for a code commit to reach production. Shorter lead times signify efficient development, testing, and deployment processes.
Reduced LT directly translates to faster time-to-value. It means you can quickly deliver features and bug fixes, impacting user adoption rates and customer satisfaction.
Reduced interactions/hand-offs between teams (e.g., through platform teams providing self-service capabilities) lead to reduced LT.
Elite performers have a lead time of less than one hour.
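One way LT could be tracked is as the median time between commit and production deploy. The sketch below assumes each change is available as a hypothetical `(committed_at, deployed_at)` timestamp pair. The median is used here (an assumption, not a requirement of the metric) so that one slow change does not dominate the result.

```python
from datetime import datetime, timedelta
from statistics import median

def median_lead_time(changes):
    """Median time from commit to production deploy.

    `changes` is a list of (committed_at, deployed_at) datetime pairs.
    """
    if not changes:
        raise ValueError("at least one change is required")
    seconds = [
        (deployed - committed).total_seconds()
        for committed, deployed in changes
    ]
    return timedelta(seconds=median(seconds))

# Hypothetical sample: changes taking 30, 45, and 90 minutes to reach production.
start = datetime(2024, 1, 1, 9, 0)
changes = [(start, start + timedelta(minutes=m)) for m in (30, 45, 90)]
print(median_lead_time(changes))  # 0:45:00
```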
Change Failure Rate (CFR) measures the percentage of deployments that result in production failures. A lower CFR indicates higher-quality software and robust testing practices.
Lower CFR means fewer rollbacks and disruptions, improving service reliability and contributing to a positive customer experience. It reflects the stability aspect of "rapid wins": we want both speed and reliability.
Enabling teams providing tooling or guidance for automated testing and quality checks, and platform teams building resilient infrastructure help minimize CFR.
Elite performers have a change failure rate between 0% and 15%.
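CFR is a simple ratio, sketched below under the assumption that you can count total and failed deployments for a period. The function name and the sample counts are illustrative only; what counts as a "failure" needs to be agreed per project.

```python
def change_failure_rate(total_deployments, failed_deployments):
    """Percentage of deployments that caused a production failure."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total")
    return 100.0 * failed_deployments / total_deployments

# Hypothetical sample: 3 of 40 deployments needed a rollback or hotfix.
print(change_failure_rate(40, 3))  # 7.5
```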
Mean Time to Restore (MTTR) measures the time it takes to restore service after an incident. A shorter MTTR reflects effective incident response and recovery procedures.
Lower MTTR minimizes downtime and its impact on users, directly impacting customer satisfaction and operational efficiency. It's about minimizing the negative impact when things inevitably go wrong.
Clear ownership of services by stream-aligned teams, supported by platform team tooling for incident management, monitoring, and observability, can reduce MTTR. Complicated-subsystem teams must also have clear ownership of their components.
Elite performers have an MTTR of less than one hour.
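A minimal sketch of an MTTR calculation follows, assuming each incident is recorded as a hypothetical `(detected_at, restored_at)` pair. Whether the clock starts at detection or at the start of the outage is a choice each team has to make; detection time is assumed here.

```python
from datetime import datetime, timedelta

def mean_time_to_restore(incidents):
    """Mean duration between incident detection and service restoration."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [restored - detected for detected, restored in incidents]
    return sum(durations, timedelta()) / len(durations)

# Hypothetical sample: incidents resolved in 20, 40, and 60 minutes.
t0 = datetime(2024, 1, 1, 12, 0)
incidents = [(t0, t0 + timedelta(minutes=m)) for m in (20, 40, 60)]
print(mean_time_to_restore(incidents))  # 0:40:00
```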
These business impact metrics are directly tied to the specific business goals that the technology implementation aims to achieve.
Time to Value (TTV) measures the time it takes for a new feature, product, or service to deliver tangible value to the business or its customers. This could be measured from project initiation, from first deployment, or from another relevant milestone.
This is a core KPI for this solution. It directly reflects the "rapid wins" promise. It's heavily influenced by DORA metrics (especially DF and LT).
Well-structured teams with clear goals and minimal dependencies can significantly reduce TTV by enabling parallel workstreams and focused delivery.
For a new customer-facing feature, TTV could be the time from project start to the first 1000 users actively using the feature and generating revenue.
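The example above can be sketched as a search for the first day on which a usage threshold is met. The function name, the `daily_active_users` mapping, and the sample figures are all hypothetical. The sketch also ignores the "generating revenue" condition, which would need its own data source.

```python
from datetime import date
from typing import Dict, Optional

def time_to_value_days(
    project_start: date,
    daily_active_users: Dict[date, int],
    threshold: int = 1000,
) -> Optional[int]:
    """Days from project start until active users first reach the threshold.

    Returns None if the threshold has not been reached yet.
    """
    for day in sorted(daily_active_users):
        if daily_active_users[day] >= threshold:
            return (day - project_start).days
    return None

# Hypothetical usage snapshots for a feature whose project started on 1 Jan 2024.
usage = {
    date(2024, 3, 1): 400,
    date(2024, 3, 15): 950,
    date(2024, 4, 1): 1200,
}
print(time_to_value_days(date(2024, 1, 1), usage))  # 91
```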
Tracks the reduction in operational costs achieved through the technology implementation.
Directly tied to the "optimizing operations" use case in the "Ideal for" section. Cloud-native solutions and AI can automate tasks, optimize resource utilization, and improve efficiency, leading to cost reductions.
Platform teams can help drive down costs through economies of scale in providing underlying infrastructure and services, and by simplifying adoption of cost-saving measures through self-service tooling.
Examples include a reduction in infrastructure costs following a cloud migration, or a reduction in customer support costs after introducing AI-powered chatbots.
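Cost reduction is usually expressed relative to a baseline period. The sketch below assumes comparable monthly cost figures before and after the change; the function name and amounts are illustrative only.

```python
def cost_reduction_pct(baseline_cost, current_cost):
    """Percentage reduction in cost relative to the baseline period."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (baseline_cost - current_cost) / baseline_cost

# Hypothetical sample: monthly infrastructure spend before and after migration.
print(cost_reduction_pct(50_000, 42_000))  # 16.0
```

A negative result would indicate that costs went up, which is worth surfacing rather than clamping to zero.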
User Adoption Rate measures how quickly and extensively users are adopting the new feature, product, or service.
Directly linked to the "improving customer experience" and "launching a new product or service" use cases. High adoption indicates the solution is meeting user needs and delivering value.
Stream-aligned teams focused on specific user journeys can improve adoption by delivering features that directly address user pain points and needs.
Examples include the percentage of customers using a new self-service portal, or the number of new sign-ups for a new product.
Customer Satisfaction Score (CSAT) or Net Promoter Score (NPS) measures customer satisfaction with the specific area impacted by the technology solution.
Directly linked to "improving customer experience." It reflects the impact of the "rapid wins" on the end-users.
Teams aligned to customer journeys and empowered to act on customer feedback will have a larger positive impact on CSAT/NPS.
Examples include the change in CSAT score for the customer support process after implementing AI-powered chatbots, or the change in NPS after launching a new product feature.
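A before/after comparison of NPS could be sketched as follows. NPS is the percentage of promoters (scores 9 to 10 on a 0 to 10 scale) minus the percentage of detractors (scores 0 to 6). The survey responses below are made-up sample data.

```python
def net_promoter_score(scores):
    """NPS: % promoters (9-10) minus % detractors (0-6), on a 0-10 scale."""
    if not scores:
        raise ValueError("at least one survey response is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)

# Hypothetical survey responses before and after a feature launch.
before = [9, 10, 6, 7, 8, 3, 9, 10, 5, 8]
after = [9, 10, 9, 7, 8, 6, 9, 10, 8, 10]

change = net_promoter_score(after) - net_promoter_score(before)
print(net_promoter_score(before), net_promoter_score(after), change)
# 10.0 50.0 40.0
```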
These metrics provide insights into the health, performance, and efficiency of your cloud-native infrastructure and applications.
Infrastructure as Code (IaC) adoption is the percentage of infrastructure provisioned and managed through code.
Higher IaC adoption improves consistency, reduces manual errors, and speeds up deployment frequency (DF) and lead time for changes (LT).
Platform teams typically champion and facilitate IaC adoption across the organization.
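IaC adoption could be estimated from a resource inventory, as in the sketch below. It assumes each resource carries a hypothetical `managed_by` tag with the value `"iac"` or `"manual"`; real inventories will use whatever tagging convention your platform team defines.

```python
def iac_coverage_pct(resources):
    """Percentage of resources tagged as managed through code."""
    if not resources:
        raise ValueError("inventory must not be empty")
    managed = sum(1 for r in resources if r.get("managed_by") == "iac")
    return 100.0 * managed / len(resources)

# Hypothetical inventory; the ids and the `managed_by` tag are illustrative.
inventory = [
    {"id": "vpc-main", "managed_by": "iac"},
    {"id": "db-orders", "managed_by": "iac"},
    {"id": "queue-events", "managed_by": "iac"},
    {"id": "vm-legacy", "managed_by": "manual"},
]
print(iac_coverage_pct(inventory))  # 75.0
```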
Automation rate is the percentage of tasks (e.g., testing, deployment, scaling) that are automated.
Higher automation improves efficiency, reduces manual errors, and positively impacts DORA metrics.
Stream-aligned and platform teams collaborate to identify and implement automation opportunities.
Resource utilization measures how efficiently cloud resources (CPU, memory, storage) are being used.
Optimized resource utilization leads to cost savings, a key business impact metric.
Platform teams are crucial in providing monitoring and tooling to optimize resource utilization across the organization.
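A simple view of utilization is the share of provisioned capacity that is actually in use. The sketch below assumes average used and provisioned figures are already available per resource type; the names and numbers are illustrative, not taken from any particular monitoring tool.

```python
def utilization_pct(used, provisioned):
    """Share of provisioned capacity that is actually in use, as a percentage."""
    if provisioned <= 0:
        raise ValueError("provisioned must be positive")
    return 100.0 * used / provisioned

# Hypothetical averages: (used, provisioned) per resource type.
usage = {"cpu_cores": (12, 32), "memory_gib": (48, 64)}
report = {name: utilization_pct(u, p) for name, (u, p) in usage.items()}
print(report)  # {'cpu_cores': 37.5, 'memory_gib': 75.0}
```

Low figures, such as the CPU value here, can point to over-provisioned capacity that is a candidate for rightsizing.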
Service availability is the percentage of time a service is operational and accessible to users. It is often measured in "nines" (e.g., 99.9%, 99.99%).
High availability is crucial for customer satisfaction and operational efficiency.
Availability is a shared responsibility between the stream-aligned teams building the services and the platform teams providing resilient infrastructure.
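To make the "nines" concrete, the sketch below converts recorded downtime into an availability percentage, and an availability target into a downtime budget. The function names and the 30-day period are assumptions for illustration.

```python
def availability_pct(period_minutes, downtime_minutes):
    """Percentage of the period during which the service was available."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes

def downtime_budget_minutes(target_pct, period_minutes):
    """Maximum downtime allowed in the period while still meeting the target."""
    return period_minutes * (100.0 - target_pct) / 100.0

MONTH_MINUTES = 30 * 24 * 60  # a 30-day month, 43,200 minutes

# Hypothetical month with 54 minutes of recorded downtime.
print(availability_pct(MONTH_MINUTES, 54))  # 99.875

# "Three nines" over 30 days allows roughly 43 minutes of downtime.
print(round(downtime_budget_minutes(99.9, MONTH_MINUTES), 1))  # 43.2
```

In this sample the service misses a 99.9% target, because 54 minutes of downtime exceeds the budget of about 43 minutes.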
Security metrics track the effectiveness of security practices.
Security is paramount in cloud-native environments and impacts service availability and customer trust.
A dedicated security team (which could be a specialized platform team or an enabling team) would drive these metrics and work with stream-aligned and other platform teams to enhance security.
These KPIs are interconnected and create a holistic picture:
By tracking and improving some of these linked KPIs, we demonstrate the effectiveness of the "Targeted Technical Solution," ensuring that the "rapid wins" are not just fast but also stable, secure, cost-effective, and genuinely valuable to your business.