Tuesday, June 29, 2021

Software Productivity Metrics

"When a measure becomes a target, 
it ceases to be a good measure." 
Goodhart's Law 

Traditional research presentations such as Software Productivity Decoded by Thomas Zimmermann (co-editor of the book Rethinking Productivity in Software Engineering, discussed in the previous blog entry) have focused on productivity measures for the development of software delivered on premises, for example:

  • Modification requests and added lines of code per year
  • Tasks per month
  • Function points per month
  • Source lines of code per hour
  • Lines of code per person month of coding effort
  • Amount of work completed per reported hour of effort for each technology
  • Ratio of produced logical code lines and spent effort
  • Average number of logical source statements output per month over the product development cycle
  • Total equivalent lines of code per person-month
  • Resolution time defined as the time, in days, it took to resolve a particular modification request
  • Number of editing events to number of selection and navigation events needed to find where to edit code


The Accelerate: State of DevOps Report 2019 identified four metrics, commonly referred to as DevOps Research and Assessment (DORA) Metrics, that capture the effectiveness of the development and delivery process in terms of throughput and stability. Their research has consistently shown that speed and stability are outcomes that enable each other.

They measure the throughput of the software delivery process using the lead time of code changes from check-in to release, along with deployment frequency. Stability is measured using time to restore (the time it takes from detecting a user-impacting incident to remediating it) and change fail rate, a measure of the quality of the release process.

In addition to speed and stability, availability is important for operational performance. Availability is about ensuring a product or service is available to and can be accessed by your end users.

  • Deployment Frequency
    How often an organization successfully releases to production.
  • Lead Time for Changes
    The amount of time it takes a code commit to get into production.
  • Change Failure Rate
    The percentage of deployments causing a failure in production.
  • Time to Restore Service
    How long it takes an organization to recover from a failure in production.
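To make the four definitions concrete, here is a minimal sketch of how they could be computed from a deployment log. The record structure, field names, and observation window are illustrative assumptions, not part of any DORA tooling:

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment records:
# (commit_time, deploy_time, caused_failure, hours_to_restore)
deployments = [
    (datetime(2021, 6, 1, 9),  datetime(2021, 6, 1, 15), False, None),
    (datetime(2021, 6, 2, 10), datetime(2021, 6, 3, 11), True,  2.5),
    (datetime(2021, 6, 4, 8),  datetime(2021, 6, 4, 12), False, None),
    (datetime(2021, 6, 7, 9),  datetime(2021, 6, 8, 9),  True,  4.0),
]
days_observed = 30  # length of the observation window

# Deployment Frequency: successful releases per day over the window
deploy_frequency = sum(1 for _, _, failed, _ in deployments if not failed) / days_observed

# Lead Time for Changes: median hours from code commit to production deploy
lead_times = [(deploy - commit).total_seconds() / 3600
              for commit, deploy, _, _ in deployments]
median_lead_time_h = median(lead_times)

# Change Failure Rate: share of deployments causing a production failure
change_failure_rate = sum(1 for _, _, failed, _ in deployments if failed) / len(deployments)

# Time to Restore Service: median hours from detection to remediation
median_restore_h = median(r for _, _, failed, r in deployments if failed)
```

Real tooling would pull these records from the CI/CD pipeline and incident tracker rather than a hand-written list; the point is only that each of the four metrics reduces to simple arithmetic once those events are captured.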

I found the webinar by Jez Humble, CTO of DORA / Google Cloud Developer Advocate, to provide a great overview of the content of the DevOps Report 2019:

  • Performance metrics
  • Improving performance
  • Improving productivity 
  • Culture 
  • [Update 06/23/2022: See my new blog entry How to Misuse and Abuse DORA DevOps Metrics to avoid common pitfalls when applying DORA metrics in practice.]

    Julian Colina endorses these process metrics and warns against flawed output metrics in his summary of the Top 5 Commonly Misused Metrics:

    1. Lines of Code
    2. Commit Frequency
    3. Pull Request Count
    4. Velocity or Story Points
    5. "Impact"

    Dan Lines even makes the point that Velocity is the Most Dangerous Metric for Dev Teams:

    Velocity is a measure of predictability, not productivity. Never use velocity to measure performance and never share velocity outside of individual teams.

    He proposes the following alternative measures:

    • If speed to value is your main goal, consider Cycle Time.
    • If predictability is your main goal, look at Iteration Churn.
    • If quality is your priority, Change Failure Rate and Mean Time to Restore are good.

    Be aware of the kind of culture you want to create by applying these measures as measuring the wrong things for the wrong reasons can backfire. 

    Patrick Anderson acknowledges that DORA Metrics help teams deliver more quickly but notes their limited focus on product development and delivery, leaving out the product discovery phase. He advocates end-to-end Flow Metrics as part of Value Stream Management to deliver the right things more quickly, at the right quality and cost, and with the necessary team engagement.

    • Flow Time measures the whole system from ideation to production—starting from when work is accepted by the value stream and ending when the value is delivered to the customer. 
    • Flow Velocity—How much customer value is delivered over time.
    • Flow Efficiency—What are the delays and wait times slowing you down?
    • Flow Load—Are demand vs. capacity balanced to ensure future productivity?
    • Flow Distribution—What are the trade-offs between value creation and protection work?
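    Flow Efficiency in particular has a simple arithmetic core: the share of total flow time spent actively working rather than waiting. A minimal sketch, assuming a hypothetical work item whose time in each state has been recorded (state names and hours are illustrative):

```python
# Hours a single work item spent in each state, from acceptance
# into the value stream until delivery to the customer.
work_item_states = [
    ("in_progress",    8),   # actively worked
    ("waiting_review", 30),  # queued, waiting
    ("in_review",      4),   # actively worked
    ("waiting_deploy", 20),  # queued, waiting
    ("deploying",      2),   # actively worked
]

ACTIVE_STATES = {"in_progress", "in_review", "deploying"}

# Flow Time: total elapsed time across all states
flow_time_h = sum(hours for _, hours in work_item_states)

# Flow Efficiency: active time as a fraction of total flow time
active_h = sum(hours for state, hours in work_item_states if state in ACTIVE_STATES)
flow_efficiency = active_h / flow_time_h
```

    In this made-up example the item is actively worked for only 14 of 64 hours, i.e. a flow efficiency of about 22%, which is exactly the kind of wait-time signal the question "What are the delays and wait times slowing you down?" is after.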

    Check out the Linearb.io White Paper 17 Metrics for Modern Dev Leaders if you are looking for even more metrics clustered into three categories of KPIs (work quality, delivery pipeline, investment profile) across two dimensions (iterations, teams).

    In a somewhat similar vein, Gitential has broken up its Value Drivers & Objective Metrics to Improve Your Software Development into four buckets (speed, quality, efficiency, collaboration).

    The SPACE Framework (The SPACE of Developer Productivity) looks at productivity along five different dimensions, hence the acronym SPACE:

    • Satisfaction is how fulfilled developers feel with their work, team, tools, or culture; Well-being is how healthy and happy they are, and how their work impacts it.
    • Performance is the outcome of a system or process.
    • Activity is a count of actions or outputs completed in the course of performing work.
    • Communication and collaboration capture how people and teams communicate and work together.
    • Efficiency and flow capture the ability to complete work or make progress on it with minimal interruptions or delays, whether individually or through a system.

    To measure developer productivity, teams and leaders (and even individuals) should capture several metrics across multiple dimensions of the framework—at least three are recommended.

    Another recommendation is that at least one of the metrics include perceptual measures such as survey data. […]. Many times, perceptual data may provide more accurate and complete information than what can be observed from instrumenting system behavior alone.

    Including metrics from multiple dimensions and types of measurements often creates metrics in tension; this is by design, because a balanced view provides a truer picture of what is happening in your work and systems. 
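    These two recommendations (cover at least three dimensions, include at least one perceptual measure) can be expressed as a simple check on a proposed metric set. A sketch under stated assumptions: the metric names and their mapping to SPACE dimensions are hypothetical, not taken from the framework itself:

```python
# Illustrative mapping of candidate metrics to SPACE dimensions.
SPACE_DIMENSION = {
    "developer_satisfaction_survey": "Satisfaction",
    "incident_count":                "Performance",
    "pull_requests_merged":          "Activity",
    "review_turnaround_hours":       "Communication",
    "uninterrupted_focus_hours":     "Efficiency",
}

# Metrics gathered by asking people rather than instrumenting systems.
PERCEPTUAL = {"developer_satisfaction_survey"}

def balanced_metric_set(metrics):
    """True if the set spans >= 3 SPACE dimensions and
    includes at least one perceptual (survey-based) measure."""
    dimensions = {SPACE_DIMENSION[m] for m in metrics}
    has_perceptual = bool(set(metrics) & PERCEPTUAL)
    return len(dimensions) >= 3 and has_perceptual

balanced_metric_set(
    ["developer_satisfaction_survey", "incident_count", "pull_requests_merged"]
)  # three dimensions plus a survey measure -> True
balanced_metric_set(
    ["incident_count", "pull_requests_merged"]
)  # only two dimensions, no perceptual data -> False
```

    The deliberate tension mentioned above falls out of such a mix automatically: a set that satisfies the check necessarily combines, say, an activity count with a perceptual measure that can contradict it.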

    This leads to an important point about metrics and their effect on teams and organizations: 
    They signal what is important. 

    One way to see indirectly what is important in an organization is to see what is measured, because that often communicates what is valued and influences the way people behave and react. […]. As a corollary, adding to or removing metrics can nudge behavior, because that also communicates what is important.

    Listen to the Tech Lead Journal podcast #43 - The SPACE of Developer Productivity and New Future of Work - Dr. Jenna Butler (Microsoft) for an overview of the framework and other research initiatives related to the "New Future of Work". See link above for transcript, mentions and noteworthy links to related research articles.


    In this context you may want to take note of Steven A. Lowe's six heuristics for effective use of metrics:

    1. Metrics cannot tell you the story; only the team can do that.
    2. Comparing snowflakes is waste.
    3. You can measure almost anything, but you can't pay attention to everything.
    4. Business success metrics drive software improvements, not the other way round.
    5. Every feature adds value; either measure it or don't do it.
    6. Measure only what matters now.

    If you want to read more about the topic, this blog provides further references.

    Monday, June 28, 2021

    Productivity in Software Engineering

    Here are my key takeaways from the 2019 book Rethinking Productivity in Software Engineering (edited by Caitlin Sadowski and Thomas Zimmermann). 


    This open access book collects the wisdom of the 2017 "Dagstuhl" seminar on productivity in software engineering, a meeting of community leaders, who came together with the goal of rethinking traditional definitions and measures of productivity.

    The results of their work include chapters covering definitions and core concepts related to productivity, guidelines for measuring productivity in specific contexts, best practices and pitfalls, and theories and open questions on productivity.

    Key Findings

    • There is no single metric to measure software development productivity
      and attempting to find one is counterproductive
    • Many productivity factors (technical, social, cultural) need to be considered (> Context)
      • Relationship between developer job satisfaction and productivity
    • Different stakeholders may have varied goals and interpretations of any sort of productivity measurement (> Alignment)
    • Individual developers, teams, organizations and market have different perceptions of productivity (> Level)
      • Productivity goals may be in tension across these different groups
      • Developers do not like metrics focused on identifying the productivity of individual engineers
    • Productivity perceptions vary greatly according to the period of time that is considered (> Time)
    What to do instead: 
    • Design a set of metrics tailored for answering a specific goal
    • Invest in finding and growing managers who can observe productivity 


    BTW, the myth of the "10x programmer" also gets debunked.

    Most of the key points outlined above are covered in Chapter 2 No Single Metric Captures Productivity (Ciera Jaspan, Caitlin Sadowski; Google) which quotes Bill Gates:

    “Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”


    A variety of alternative metrics is discussed in my next blog article.