Tuesday, June 29, 2021

Software Productivity Metrics

"When a measure becomes a target, 
it ceases to be a good measure." 
Goodhart's Law 

Traditional research presentations such as Software Productivity Decoded by Thomas Zimmermann (co-editor of the book Rethinking Productivity in Software Engineering, discussed in the previous blog entry) have focused on productivity measures for the development of software delivered on premises, for example:

  • Modification requests and added lines of code per year
  • Tasks per month
  • Function points per month
  • Source lines of code per hour
  • Lines of code per person month of coding effort
  • Amount of work completed per reported hour of effort for each technology
  • Ratio of produced logical code lines and spent effort
  • Average number of logical source statements output per month over the product development cycle
  • Total equivalent lines of code per person-month
  • Resolution time defined as the time, in days, it took to resolve a particular modification request
  • Number of editing events to number of selection and navigation events needed to find where to edit code


The Accelerate: State of DevOps Report 2019 identified four metrics, commonly referred to as DevOps Research and Assessment (DORA) Metrics, that capture the effectiveness of the development and delivery process in terms of throughput and stability. Their research has consistently shown that speed and stability are outcomes that enable each other.

They measure the throughput of the software delivery process using the lead time of code changes from check-in to release, along with deployment frequency. Stability is measured using time to restore (the time it takes from detecting a user-impacting incident to remediating it) and change fail rate, a measure of the quality of the release process.

In addition to speed and stability, availability is important for operational performance. Availability is about ensuring a product or service is available to and can be accessed by your end users.

  • Deployment Frequency
    How often an organization successfully releases to production.
  • Lead Time for Changes
    The amount of time it takes a code commit to get into production.
  • Change Failure Rate
    The percentage of deployments causing a failure in production.
  • Time to Restore Service
    How long it takes an organization to recover from a failure in production.
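To make the four definitions concrete, here is a minimal sketch of how they could be computed from a deployment log. The record structure, field names, and observation window are illustrative assumptions, not part of any DORA tooling:

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment records:
# (commit_time, deploy_time, caused_failure, hours_to_restore)
deployments = [
    (datetime(2021, 6, 1, 9),  datetime(2021, 6, 1, 15), False, None),
    (datetime(2021, 6, 2, 10), datetime(2021, 6, 3, 11), True,  2.5),
    (datetime(2021, 6, 4, 8),  datetime(2021, 6, 4, 12), False, None),
    (datetime(2021, 6, 7, 9),  datetime(2021, 6, 8, 9),  True,  4.0),
]
days_observed = 30  # length of the observation window

# Deployment Frequency: successful releases per day over the window
deploy_frequency = sum(1 for _, _, failed, _ in deployments if not failed) / days_observed

# Lead Time for Changes: median hours from code commit to production deploy
lead_times = [(deploy - commit).total_seconds() / 3600
              for commit, deploy, _, _ in deployments]
median_lead_time_h = median(lead_times)

# Change Failure Rate: share of deployments causing a production failure
change_failure_rate = sum(1 for _, _, failed, _ in deployments if failed) / len(deployments)

# Time to Restore Service: median hours from detection to remediation
median_restore_h = median(r for _, _, failed, r in deployments if failed)
```

Real tooling would pull these records from the CI/CD pipeline and incident tracker rather than a hand-written list; the point is only that each of the four metrics reduces to simple arithmetic once those events are captured.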

I found the webinar by Jez Humble, CTO of DORA / Google Cloud Developer Advocate, to provide a great overview of the content of the DevOps Report 2019:

  • Performance metrics
  • Improving performance
  • Improving productivity 
  • Culture 
  • [Update 06/23/2022: See my new blog entry How to Misuse and Abuse DORA DevOps Metrics to avoid common pitfalls when applying DORA metrics in practice.]

    Julian Colina endorses these process metrics and warns against flawed output metrics in his summary of the Top 5 Commonly Misused Metrics:

    1. Lines of Code
    2. Commit Frequency
    3. Pull Request Count
    4. Velocity or Story Points
    5. "Impact"

    Dan Lines even makes the point that Velocity is the Most Dangerous Metric for Dev Teams:

    Velocity is a measure of predictability, not productivity. Never use velocity to measure performance and never share velocity outside of individual teams.

    He proposes the following alternative measures:

    • If speed to value is your main goal, consider Cycle Time.
    • If predictability is your main goal, look at Iteration Churn.
    • If quality is your priority, Change Failure Rate and Mean Time to Restore are good.

    Be aware of the kind of culture you want to create by applying these measures as measuring the wrong things for the wrong reasons can backfire. 

    Patrick Anderson acknowledges that DORA Metrics help teams deliver more quickly but notes their limited focus on product development and delivery, leaving out the product discovery phase. He advocates end-to-end Flow Metrics as part of Value Stream Management to deliver the right things more quickly, at the right quality and cost, and with the necessary team engagement.

    • Flow Time measures the whole system from ideation to production—starting from when work is accepted by the value stream and ending when the value is delivered to the customer. 
    • Flow Velocity—How much customer value is delivered over time.
    • Flow Efficiency—What are the delays and wait times slowing you down?
    • Flow Load—Are demand vs. capacity balanced to ensure future productivity?
    • Flow Distribution—What are the trade-offs between value creation and protection work?
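    Flow Efficiency in particular has a simple arithmetic core: the share of total flow time spent actively working rather than waiting. A minimal sketch, assuming a hypothetical work item whose time in each state has been recorded (state names and hours are illustrative):

```python
# Hours a single work item spent in each state, from acceptance
# into the value stream until delivery to the customer.
work_item_states = [
    ("in_progress",    8),   # actively worked
    ("waiting_review", 30),  # queued, waiting
    ("in_review",      4),   # actively worked
    ("waiting_deploy", 20),  # queued, waiting
    ("deploying",      2),   # actively worked
]

ACTIVE_STATES = {"in_progress", "in_review", "deploying"}

# Flow Time: total elapsed time across all states
flow_time_h = sum(hours for _, hours in work_item_states)

# Flow Efficiency: active time as a fraction of total flow time
active_h = sum(hours for state, hours in work_item_states if state in ACTIVE_STATES)
flow_efficiency = active_h / flow_time_h
```

    In this made-up example the item is actively worked for only 14 of 64 hours, i.e. a flow efficiency of about 22%, which is exactly the kind of wait-time signal the question "What are the delays and wait times slowing you down?" is after.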

    Check out the Linearb.io White Paper 17 Metrics for Modern Dev Leaders if you are looking for even more metrics clustered into three categories of KPIs (work quality, delivery pipeline, investment profile) across two dimensions (iterations, teams).

    In a somewhat similar vein, Gitential has broken up its Value Drivers & Objective Metrics to Improve Your Software Development into four buckets (speed, quality, efficiency, collaboration).

    The SPACE Framework (The SPACE of Developer Productivity) looks at productivity along five different dimensions, hence the acronym SPACE:

    • Satisfaction is how fulfilled developers feel with their work, team, tools, or culture; Well-being is how healthy and happy they are, and how their work impacts it.
    • Performance is the outcome of a system or process.
    • Activity is a count of actions or outputs completed in the course of performing work.
    • Communication and collaboration capture how people and teams communicate and work together.
    • Efficiency and flow capture the ability to complete work or make progress on it with minimal interruptions or delays, whether individually or through a system.

    To measure developer productivity, teams and leaders (and even individuals) should capture several metrics across multiple dimensions of the framework—at least three are recommended.

    Another recommendation is that at least one of the metrics include perceptual measures such as survey data. […]. Many times, perceptual data may provide more accurate and complete information than what can be observed from instrumenting system behavior alone.

    Including metrics from multiple dimensions and types of measurements often creates metrics in tension; this is by design, because a balanced view provides a truer picture of what is happening in your work and systems. 
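    These two recommendations (cover at least three dimensions, include at least one perceptual measure) can be expressed as a simple check on a proposed metric set. A sketch under stated assumptions: the metric names and their mapping to SPACE dimensions are hypothetical, not taken from the framework itself:

```python
# Illustrative mapping of candidate metrics to SPACE dimensions.
SPACE_DIMENSION = {
    "developer_satisfaction_survey": "Satisfaction",
    "incident_count":                "Performance",
    "pull_requests_merged":          "Activity",
    "review_turnaround_hours":       "Communication",
    "uninterrupted_focus_hours":     "Efficiency",
}

# Metrics gathered by asking people rather than instrumenting systems.
PERCEPTUAL = {"developer_satisfaction_survey"}

def balanced_metric_set(metrics):
    """True if the set spans >= 3 SPACE dimensions and
    includes at least one perceptual (survey-based) measure."""
    dimensions = {SPACE_DIMENSION[m] for m in metrics}
    has_perceptual = bool(set(metrics) & PERCEPTUAL)
    return len(dimensions) >= 3 and has_perceptual

balanced_metric_set(
    ["developer_satisfaction_survey", "incident_count", "pull_requests_merged"]
)  # three dimensions plus a survey measure -> True
balanced_metric_set(
    ["incident_count", "pull_requests_merged"]
)  # only two dimensions, no perceptual data -> False
```

    The deliberate tension mentioned above falls out of such a mix automatically: a set that satisfies the check necessarily combines, say, an activity count with a perceptual measure that can contradict it.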

    This leads to an important point about metrics and their effect on teams and organizations: 
    They signal what is important. 

    One way to see indirectly what is important in an organization is to see what is measured, because that often communicates what is valued and influences the way people behave and react. […]. As a corollary, adding to or removing metrics can nudge behavior, because that also communicates what is important.

    Listen to the Tech Lead Journal podcast #43 - The SPACE of Developer Productivity and New Future of Work - Dr. Jenna Butler (Microsoft) for an overview of the framework and other research initiatives related to the "New Future of Work". See link above for transcript, mentions and noteworthy links to related research articles.


    In this context you may want to take note of Steven A. Lowe's six heuristics for effective use of metrics:

    1. Metrics cannot tell you the story; only the team can do that.
    2. Comparing snowflakes is waste.
    3. You can measure almost anything, but you can't pay attention to everything.
    4. Business success metrics drive software improvements, not the other way round.
    5. Every feature adds value; either measure it or don't do it.
    6. Measure only what matters now.

    If you want to read more about the topic, this blog provides further references.

    Monday, June 28, 2021

    Productivity in Software Engineering

    Here are my key takeaways from the 2019 book Rethinking Productivity in Software Engineering (edited by Caitlin Sadowski and Thomas Zimmermann). 


    This open access book collects the wisdom of the 2017 "Dagstuhl" seminar on productivity in software engineering, a meeting of community leaders, who came together with the goal of rethinking traditional definitions and measures of productivity.

    The results of their work include chapters covering definitions and core concepts related to productivity, guidelines for measuring productivity in specific contexts, best practices and pitfalls, and theories and open questions on productivity.

    Key Findings

    • There is no single metric to measure software development productivity
      and attempting to find one is counterproductive
    • Many productivity factors (technical, social, cultural) need to be considered (> Context)
      • Relationship between developer job satisfaction and productivity
    • Different stakeholders may have varied goals and interpretations of any sort of productivity measurement (> Alignment)
    • Individual developers, teams, organizations and market have different perceptions of productivity (> Level)
      • Productivity goals may be in tension across these different groups
      • Developers do not like metrics focused on identifying the productivity of individual engineers
    • Productivity perceptions vary greatly according to the period of time that is considered (> Time)
    What to do instead: 
    • Design a set of metrics tailored for answering a specific goal
    • Invest in finding and growing managers who can observe productivity 


    BTW, the myth of the "10x programmer" also gets debunked.

    Most of the key points outlined above are covered in Chapter 2 No Single Metric Captures Productivity (Ciera Jaspan, Caitlin Sadowski; Google) which quotes Bill Gates:

    “Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”


    A variety of alternative metrics is discussed in my next blog article.