A Lexicon of Software Development Debt

David Cottingham · Cloudy Musings · Apr 6, 2017


(Image from Morguefile, by andyk)

Technical debt is a concept familiar to anyone who has worked on a software product that is maturing. Shortcuts taken at the beginning return to haunt developers in the form of hard-to-fix customer issues, or make adding new functionality akin to piling another Jenga block upon an already teetering tower. But what is included in the term is nebulous: perhaps the only good characterisation is to use its antithesis, namely, delivering features, which seems a poor definition.

Much (virtual) ink has been spent on discussing technical debt. In this post, my aim is to discuss other forms of “debt” when developing software, borrowings that must somehow be measured, and balanced against each other, in order to properly prioritise work.

A caution: many kinds of delayed or incomplete work do not constitute debt. Be wary of classifying everything as such!

Technical Debt

Firstly, let us consider a narrower definition of “Technical Debt” than the archetypal one. As the reader will know, software that is built up over time may start out with a simple design, and clean, even elegant, code, but gradually grows various unsightly appendages. These occur as new functionality is “bolted on” to an existing implementation that had never considered that new use case, or as bug fixes to address the issues found by a myriad of “special case” customers over time. Shortcuts are sometimes taken because a given ship date must be met, or code is duplicated because it is seen as less risky than modifying the original method. The result can be code that is fragile, hard to understand, and difficult to add to.

Some of these aspects are open to measurement: for example, cyclomatic complexity measures the number of linearly independent paths through a particular code “block”. Above a certain threshold (20 is often quoted) code is reckoned to be difficult and risky to maintain (see How to Calculate Technical Debt and Express it Clearly for more on that). How one assigns a monetary cost to this is not an exact science, but it would not be unreasonable to measure the length of time an engineer takes to fix a bug in code with high cyclomatic complexity versus a bug in less complex code. Average that out over multiple bugs, and it ought to be possible to derive an approximate relationship between person-hours and cyclomatic complexity, and thus make the case for improving the code.
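As an illustration, here is a minimal sketch of the measurement side, using the radon library for Python (an assumption on my part: any complexity tool for your own language will do, and the threshold of 20 simply follows the figure quoted above):

```python
# Flag functions whose cyclomatic complexity exceeds a threshold.
# A minimal sketch using radon (pip install radon); adjust THRESHOLD
# to whatever your team considers maintainable.
import sys
from radon.complexity import cc_visit

THRESHOLD = 20

def flag_complex_blocks(source, filename):
    """Return (location, complexity) pairs for blocks above THRESHOLD."""
    return [
        (f"{filename}:{block.lineno} {block.name}", block.complexity)
        for block in cc_visit(source)
        if block.complexity > THRESHOLD
    ]

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path) as f:
            for location, cc in flag_complex_blocks(f.read(), path):
                print(f"{location}: cyclomatic complexity {cc}")
```

Correlate the output with bug-fix times from your issue tracker, and the person-hours-to-complexity relationship described above starts to emerge.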

Code duplication is another sin that is clearly open to measurement. Again, how one then assesses the costs is open to interpretation; measuring the effort involved in adding new functionality where multiple (very similar) code blocks must be updated (not forgetting the potential for errors to creep in as a very-similar-but-not-identical task is performed repeatedly) should give some idea of the benefit of tackling the problem.
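To show that duplication really is mechanically measurable, here is a toy detector (a sketch only; production tools such as PMD’s CPD are far more robust). It hashes every window of a few consecutive, whitespace-normalised lines and reports windows that occur more than once:

```python
# Toy duplicate-code detector: record each window of WINDOW consecutive
# (whitespace-normalised) lines and report windows appearing more than once.
from collections import defaultdict

WINDOW = 6  # minimum run of matching lines to count as duplication

def find_duplicates(files):
    """files: {filename: source}. Returns {window: [(file, line), ...]}."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = [line.strip() for line in source.splitlines()]
        for i in range(len(lines) - WINDOW + 1):
            window = tuple(lines[i:i + WINDOW])
            if any(window):  # ignore runs of blank lines
                seen[window].append((name, i + 1))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}
```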

In any case, these issues can be tackled by refactoring. Note that this is not the same as re-designing (see Architectural Debt), nor is it re-writing from scratch (which can introduce its own issues); it is a process that ideally happens constantly, to avoid the code base becoming so contorted as to require a major overhaul. It should be factored into the cost of development of new features: refactoring does not (or rather, should not) need dedicated allocations of time; rather, costings should assume that any addition to existing code will require some refactoring to ensure it remains easily maintainable.

Aside: evidently there will be times where refactoring has been neglected for a long period, i.e. Technical Debt has built up. In such cases, a clear definition of exactly what needs to be refactored, coupled with clear metrics and goals (e.g. a reduction in code duplication by n%) and a phased, costed implementation plan, is the way forward.

Having taken this more constrained view of Technical Debt, let us explore what other types of debt exist. We’ll consider six classes, beginning with what many people would consider to be a subset of Technical Debt, but which I would suggest is useful to separate out.

Architectural Debt

Architectural Debt concerns issues that have been caused by decisions at the design phase, rather than the implementation phase, of a project. As is often pointed out, if an error is spotted at design time, the cost of rectifying it is significantly lower than doing so after implementation. Viewed in reverse, an error made at design time is likely to require significantly more effort to address than an error made during implementation. This, in my view, is the distinction between Architectural (design-time errors) and Technical (implementation-time errors) Debt.

Aside: note that Architectural Debt may well not be due to deliberate recklessness: Martin Fowler has a handy quadrant diagram that illustrates nicely how some errors are caused inadvertently, even when a prudent approach is being taken. It also points out that one can incur debt in a prudent and deliberate way, too (i.e. a conscious choice for good business reasons).

In my (limited!) experience, an issue of Architectural Debt is a little like a leaking dam: customers will find issue after issue related to said debt, and the product team will valiantly attempt to stem the “leak in the dam” over and over. However, no quantity of patching can ultimately remedy the issue of (e.g.) the dam being made of porous rock. At some point the dam needs to be re-designed and re-implemented: a costly decision whose investment may take years to recoup.

Paying back architectural debt is likely to be an activity that an engineering team is keen to do: after all, they are suffering the stress (and resource drain) of constantly trying to implement work-arounds. Clearly, though, one cannot choose to pay back all debt (of any kind) at once. Thus the question becomes: what to pay back, and when?

Firstly, we should carefully quantify the resources required to address each item of Architectural Debt. Whilst it is tempting to want to “re-write the world”, this is an example of the Second System Effect that Fred Brooks wrote about:

An architect’s first work is apt to be spare and clean. He knows he doesn’t know what he’s doing, so he does it carefully and with great restraint.

As he designs the first work, frill after frill and embellishment after embellishment occur to him. These get stored away to be used “next time.” Sooner or later the first system is finished, and the architect, with firm confidence and a demonstrated mastery of that class of systems, is ready to build a second system.

This second is the most dangerous system a man ever designs. When he does his third and later ones, his prior experiences will confirm each other as to the general characteristics of such systems, and their differences will identify those parts of his experience that are particular and not generalizable.

The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one. The result, as Ovid says, is a ‘big pile.’

Joel Spolsky has written similarly of the dangers of re-writing from scratch, over at Things You Should Never Do, wisely pointing out that

The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed.

It is therefore crucial that the scope and goals of the Architectural Debt payback work be abundantly clear. At that point, a design and “t-shirt size” costing can be worked out, which ultimately yields a monetary value.

Meanwhile, the resources being spent “plugging the dam” should also be quantified. Hopefully this is relatively easy, in that support cases that have required work-arounds can be identified, and the time spent on them totalled up to arrive at a yearly cost. Note, though, that there is also a tax on product team morale caused by constant dam-plugging/fire-fighting: whilst hard to quantify, it should at least be borne in mind.
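A back-of-envelope model makes the comparison concrete. Every figure below is an illustrative assumption, to be replaced with your own support-case totals and t-shirt-sized estimates:

```python
# Compare the recurring cost of "plugging the dam" with a one-off redesign,
# and compute the break-even point. All figures are illustrative assumptions.
HOURLY_RATE = 75.0                 # fully loaded engineer cost, GBP/hour
workaround_hours_per_year = 600    # totalled from support cases
redesign_person_weeks = 40         # from the t-shirt-sized design

yearly_plugging_cost = workaround_hours_per_year * HOURLY_RATE
redesign_cost = redesign_person_weeks * 40 * HOURLY_RATE  # 40 h/week

print(f"Plugging the dam: £{yearly_plugging_cost:,.0f}/year")
print(f"Redesign: £{redesign_cost:,.0f} one-off; "
      f"break-even after {redesign_cost / yearly_plugging_cost:.1f} years")
```

With these (made-up) numbers, the redesign pays for itself in under three years; the point is that the trade-off becomes arguable in pounds rather than in adjectives.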

Not only is there an impact on morale, there may also be a direct impact on value creation. Travis Kimmel points out in his article The Value of Quantifying Technical Debt that unless we carefully consider how to address debt, we find it consuming our most senior engineers. I would argue that it is paying back Architectural Debt where this risk is greatest: it is unlikely that a junior engineer will have the necessary historical knowledge to successfully understand and then re-design the architecture. Thus, the most experienced engineers are given the task, which in turn leads to a reduced rate of innovation, and a likely increase in Value Debt (see later).

Defect Debt

Every software project has a bug backlog. Normally, critical issues are addressed before each release ships, but minor issues are in many cases deferred, and gradually form a morass of tickets that are bulk-moved (postponed) repeatedly. Release notes may contain the same issues over multiple versions of the software, and customers may grumble (but not sufficiently to be a problem) about “rough edges”.

Note: I define Defect Debt to be confined to usability issues that customers experience. It does not include “to do”/refactoring items that are internal to the software, which I would categorise as Technical Debt.

Quantifying the effort required to fix minor issues is probably relatively easy. What is non-trivial is understanding the return on investment of fixing them. One potential approach is to search online fora and social media for instances of people complaining about such problems, but attempting to work out the costs of these (in terms of lost sales due to reputational impact) is likely to be hard.

However, analysing cases that arrive at front-line support can yield very concrete cost data. If there are many problems that are in fact configuration issues, or addressed using simple work-arounds, the pay-off from fixing the underlying bug is the saving of no longer answering those support calls. Reducing the number of support calls that are about trivial issues will mean staff can focus on helping customers with real product issues, and thus improve customers’ experience.
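The arithmetic is simple enough to sketch. The figures below are assumptions standing in for real support-case data:

```python
# ROI of fixing a trivial-but-noisy defect: support-call savings versus
# the engineering cost of the fix. All inputs are illustrative assumptions.
calls_per_year = 120        # support cases traced to this defect
minutes_per_call = 45       # average handling time
support_rate = 40.0         # fully loaded support cost, GBP/hour
fix_cost = 2_000.0          # engineering estimate for the underlying fix

yearly_saving = calls_per_year * (minutes_per_call / 60) * support_rate
print(f"Saving £{yearly_saving:,.0f}/year; "
      f"fix pays back in {12 * fix_cost / yearly_saving:.1f} months")
```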

(Aside: I had not previously thought about Defect Debt, until reading Patroklos Papapetrou’s article on Technical Debt 101. Well worth highlighting.)

Documentation Debt

Very much linked to Defect Debt, product documentation (yes, some products still need this!) and API documentation can be the cause of support calls or wasted developer time, respectively. Like source code, documentation can need refactoring over time, as features are added, screenshots need updating, or best practices change. It is certainly not the case that documentation is somehow static.

One could argue that like code, metrics concerning duplication across documentation sets (e.g. “details of feature x are spread over three different guides”), and something akin to cyclomatic complexity (“which section should I now turn to having read this one?”) are useful in measuring Documentation Debt. User studies, to determine how long it takes for someone to perform a particular action using the software, given a particular documentation set, may be helpful. Support call data may, however, provide the most easily-quantified cost of hard to understand documentation.

Test Debt

Most treatises on technical debt will at some point extol the virtues of test automation, and highlight that one cannot refactor safely without good unit (and indeed system) tests. If an organisation has not yet automated its testing, there’s clear debt there.

Assuming there is automation in place, in my view it is crucial that said automation continues to be invested in. Measures such as code (test) coverage are helpful, but they do not tell the whole story. How long does an organisation spend triaging the results of each test run? What level of resource commitment is required to do so? How long is the time period between a developer committing code, and receiving the test results for that commit? How much negative testing takes place?

Test Debt can have real impact on velocity. Tracking metrics such as those suggested above will provide information on the staff costs (e.g. resources spent on triaging results), but can also highlight why a bug mountain piles up if development continues whilst the team waits significant amounts of time for test results to arrive.
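One of those metrics, commit-to-result latency, is straightforward to compute if your CI system can export timestamps. The record format below is an assumption; adapt it to whatever your tooling actually provides:

```python
# Median and worst-case latency between a commit and its test results,
# given (committed, results_ready) timestamp pairs exported from CI.
from datetime import datetime
from statistics import median

def feedback_latency_hours(runs):
    """runs: [(committed, results_ready), ...] as datetime pairs."""
    latencies = [
        (done - committed).total_seconds() / 3600
        for committed, done in runs
    ]
    return {"median_h": median(latencies), "worst_h": max(latencies)}

runs = [
    (datetime(2017, 4, 3, 9, 0), datetime(2017, 4, 3, 17, 30)),
    (datetime(2017, 4, 4, 11, 0), datetime(2017, 4, 5, 2, 15)),
]
print(feedback_latency_hours(runs))
```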

In addition, it is worth noting that automated tests and their frameworks can be complex software in themselves, and hence subject to Technical, Architectural, and Documentation Debt. Ensuring that refactoring tests is included in the costs of new features is therefore also very important.

Feature Debt

Every product owner strives to give their product a smorgasbord of unique features, thus (hopefully) ensuring that customers have many reasons to choose them over the competition. However, there is also the concept of Minimum Viable Product (MVP), another term with many definitions.

MVP for an established product is in my view the set of features without which a customer will not give a salesperson the time of day. For a product with no competitors, MVP translates into satisfying at least one use case that a customer has that no other product can address. In a more crowded market, MVP includes all the features that are considered “essential” by customers (and probably analysts).

Of course, not meeting MVP does not mean there will be zero sales: some customers will have their own definitions of MVP which are subsets of the “average customer’s” MVP. But it is likely that the product will be counting itself out of a non-trivial percentage of the market. This can be a valid choice: some use cases may be important today, but in a year’s time will be regarded as legacy, and thus it is not worth investing in them. Bear in mind, however, that this also applies to features that are present in the product (see the section on Value Debt, below). Additionally, features that are regarded as “new” today can become part of the definition of MVP in fairly short order, as all of your competitors implement them, and they effectively become commoditised.

Thus, it is important to keep a watching brief on how a product’s feature set compares to today’s (and tomorrow’s predicted) MVP. The delta between the two I term Feature Debt.

Evidently one cannot use telemetry to provide an indication of the number of customers who are not using the product because they regard the absence of said feature as a blocker to purchase: they are, by definition, not using it. Thus, two methods come to mind. Firstly, analysing win/loss data to understand the value of deals lost due to not having a particular feature will provide an idea of the revenue being lost (and an idea of the premium customers are prepared to pay a competitor for the feature). Secondly, the price of any ecosystem products that customers purchase to fill the perceived gap will provide an indication of the value of said functionality, were it to be present in your product.
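The win/loss analysis lends itself to simple aggregation. The records below are hypothetical, standing in for whatever your CRM exports:

```python
# Sum the value of lost deals by the missing feature cited, to rank
# Feature Debt items by revenue at risk. All deal data is illustrative.
from collections import defaultdict

lost_deals = [
    {"value_gbp": 50_000, "missing_feature": "SAML SSO"},
    {"value_gbp": 120_000, "missing_feature": "SAML SSO"},
    {"value_gbp": 30_000, "missing_feature": "audit logging"},
]

revenue_lost = defaultdict(float)
for deal in lost_deals:
    revenue_lost[deal["missing_feature"]] += deal["value_gbp"]

for feature, value in sorted(revenue_lost.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: £{value:,.0f} in lost deals")
```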

Note that Feature Debt is not the same as having a backlog of good ideas for new features: implementing any of those will attract customers to you as compared to your competitors. Feature Debt is instead the set of features without which many constituents of your target market will simply not consider you. Thus, paying back Feature Debt is somewhat thankless: it buys a ticket to the party, but only differentiation will mean you stand out to the guests.

Paying back Feature Debt is not an activity that must be done before all else. As noted above, some customers will have a more restricted definition of MVP than others (their “ticket to the party” is cheaper), and differentiating features will mean they purchase from you. The key is (a) to understand what the MVP consists of for most customers, and (b) to have a roadmap to provide those features over time.

Security Debt

Any mature software product will probably have needed to evolve cipher suites, key lengths, and encryption algorithms over time. New minimum standards emerge, protocol specifications are found to be irreparably broken, and brute force attacks become cheaper. Thus, security not only concerns ensuring that there are no “holes” in the product, but must also encompass keeping pace with these standards. Of course, the danger is that product owners allow standards to drop; after all, the product using sub-par security technologies will probably still achieve customers’ main use cases. Only when a security audit or scan is performed does not being up-to-date become a problem. However, it is a problem that can almost certainly not be quickly fixed!

If quantifying the costs of Security Debt is easy, I would argue that the situation is dire, because lost sales (due to non-compliance) are how one would most easily measure it. Under normal circumstances, the reputational impact were a customer to be compromised due to such debt is hard to estimate. Picking a large number from thin air (perhaps by examining the costs of data breaches that have taken place at enterprises), then multiplying it by a rough probability of the debt actually being exploited, may be all that can be done.
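That back-of-envelope estimate can at least be written down explicitly, so that the assumptions are visible and arguable:

```python
# Expected yearly cost of carrying a security debt item: a published
# breach cost times a (subjective) yearly probability of exploitation.
# Both inputs are rough assumptions, which is exactly the point above.
breach_cost = 3_000_000.0     # e.g. from published breach-cost studies
p_exploit_per_year = 0.02     # subjective estimate

print(f"Expected cost of carrying this debt: "
      f"£{breach_cost * p_exploit_per_year:,.0f}/year")
```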

Despite the difficulty in estimating its monetary value, Security Debt deserves not to be forgotten!

Value Debt

Ultimately, customers care little about the types of debt outlined in the preceding sections, other than perhaps (a) Architectural Debt, as a proxy for the length of time a customer issue might take to address, or (b) Feature Debt, an excess of which results in the product becoming unusable. It therefore follows that we “owe” to our customers an ever-increasing number of reasons to purchase the product, which I term “Value Debt”.

Aside (i): it is important to clarify why the number increases, and what such an increase implies. Evidently business landscapes change rapidly, in terms of both customers’ use cases/processes and competitors’ feature sets. Those changes mean that the “goal posts” of what is expected of a product move, and thus the value ascribed to it is likely to decrease over time if the feature set remains static. By default, therefore, the product’s feature set monotonically increases, which not only increases the support burden but also risks increasing Technical and Architectural Debt. Judiciously retiring features (preceded by deprecation) is evidently helpful, though it can alienate customers who have used the product for long periods of time.

Aside (ii): it could be argued that Value Debt is not really debt, but just a fact of life that should cause us to keep innovating. It is left to the reader to decide whether this is “true” debt, but I hope that the need to balance it against all other forms of debt is not contentious.

Of all the types of debt, Feature Debt and Value Debt are the only ones that will increase over time even if there is zero change made to the software itself. They are intrinsically linked to the world outside of the development team, and their sizes are unpredictable and can be subject to significant step changes, such as when a new version of a competing product emerges and causes one of your flagship features to become a commodity one. Suddenly the premium customers are willing to pay for the product is reduced.

Such unpredictability can cause friction between a development team and a product owner: in an ideal world, a roadmap would be crafted at the beginning of the year, and would not change until it had been delivered in its entirety. Product owners should always strive to avoid cataclysmic changes to the backlog, and good planning and the reservation of capacity to cater for the unexpected are both important. However, there will still be times when Value Debt suddenly increases: quantifying it and clearly communicating the increase are vital.

Quantifying Value Debt is unlikely to be an exact science. If the incremental customer value of each existing feature in the product is known, the debt incurred when that feature is commoditised or rendered obsolete is easily calculated. For example, if the price of a widget is £10, and feature x has a value to the average customer of £2, we can say that we will have a 20% Value Debt when (not “if”) that feature is surpassed by another product’s equivalent (or by a change in customer business processes).

Given that it is unlikely that we can truly quantify the value of each feature (particularly as it is time-variant), a reasonable approximation might be to examine the fraction of customers using each feature (ideally via telemetry), and ascribe value based on that. If n% of customers make use of a feature, it follows that a similar fraction of revenue is contingent upon it. It is worth noting that customer adoption/value is not always proportional to the amount of engineering effort invested!
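A sketch of that approximation, with illustrative usage figures (note the caveat in the comments: customers use multiple features, so these shares overlap rather than partition revenue):

```python
# Approximate per-feature value from telemetry: n% of customers using a
# feature implies roughly n% of revenue is contingent on it. Figures are
# illustrative; feature usage overlaps, so shares are upper bounds.
yearly_revenue = 1_000_000.0
usage_fraction = {"feature_x": 0.20, "feature_y": 0.55, "feature_z": 0.05}

for feature, frac in usage_fraction.items():
    print(f"Commoditisation of {feature} would incur Value Debt "
          f"of roughly £{yearly_revenue * frac:,.0f}/year")
```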

In practice, this is only half the story, as we must also consider how to “pay back” any Value Debt that we incur. New product (and hence feature) pricing is perhaps an even more inexact science than understanding the value of an existing feature. Nonetheless, making an attempt in terms of customer time saved, total addressable market, and so on is worth doing, in order to communicate why the feature holds a particular position on the backlog.

The Bottom Line

Clearly there are multiple types of debt that one can incur during the production of software. I believe that there are three crucial points concerning them:

  1. Every project/release is an exercise in balancing all the different types of debt.
  2. Quantifying each type of debt (value, and redemption cost) is the best way to communicate to all involved what the trade-offs are. Statistics are much less contentious than assertions.
  3. Debt in itself is neither good nor bad: there are valid reasons for incurring and paying back each type of debt at different points during the software lifecycle. As with financial debt, incurring a manageable quantity of software debt can be temporarily helpful (e.g. buying a house on a mortgage might be akin to shipping a major feature with some debt to gain first mover advantage), but allowing any type of debt to spiral out of control is what must be avoided.

“Debt” can be an emotive term, and in my view there is a tendency for people to use the term “Technical Debt” for too wide a purpose. By providing a more specific categorisation, I hope we can use different measurement techniques to arrive at the monetary costs of each type, and thus communicate business trade-offs more easily.

End-notes: if you’re now convinced you should measure absolutely everything, take five minutes to read The Risks of Measuring Technical Debt. If you want a good introduction to how to manage, measure, and justify technical debt, Steve McConnell’s Construx slides on Managing Technical Debt are well worth a read.
