Originally published on DevOps.com
Many ALM tools and Value Stream Management Platforms graph lead time, but few really tell you what to do about it, how to use it, and why you shouldn’t use the average. In fact, so much of what is written on this subject is rather vague and “hand wavy.” Part of the reason is that Value Stream Management is a big job, and it isn’t something a tool can provide. Rather, it’s a human endeavor. Another driver for the hand waving may be that the writers really don’t know much about lead time because they’ve never actually used it.
Improving the value stream is a big topic but there are definitely things we can cover in this space: how to use lead time, how to get it, prerequisites to using it well, techniques or approaches, and references for further study. I’ll start with the simplest, most mechanical topic, and move to the harder topics last: brief thoughts on how to improve and the prerequisites.
What is Lead Time?
Lead time is how long it takes a work item to get through a system, such as your IT development value stream. If your system is under control, recently measured lead times can be a good predictor for lead time in the future. People want to know when stuff will be done so they can make plans.
Lead time is also a lagging indicator for many of the actions used to improve, well, lead time and flow. It’s what some people call an output metric — you make a bunch of improvements to the system and improved lead time is what you get out of your efforts.
Where to Start
Measure lead time separately for the few types of work where it really matters, such as production defects and epics or initiatives. For every work item, record the start and end dates. But first you have to define what you consider the start and end. If you are using the measure to communicate with a customer, use dates from the customer’s perspective. If you are trying to improve the development part of the pipeline, post planning, then start the clock when it’s scheduled for dev. In that way, some people distinguish “flow time” from “lead time.” “Cycle time” is sometimes used for an even smaller slice of the value stream, usually just one step. (I’m not going to debate here whether that’s a proper or official use of that term.)
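As a rough sketch of that bookkeeping, the snippet below records a start and end date per work item and keeps the lead times separate by type of work. The work types, dates and function name are made up for illustration; the start and end dates should follow whatever definition you settled on.

```python
from collections import defaultdict
from datetime import date

# Hypothetical work items: (type of work, start date, end date).
ITEMS = [
    ("production defect", date(2024, 1, 2), date(2024, 1, 9)),
    ("production defect", date(2024, 1, 5), date(2024, 1, 8)),
    ("epic", date(2024, 1, 2), date(2024, 3, 1)),
]

def lead_times_by_type(items):
    """Return {type of work: [lead time in calendar days, ...]}."""
    result = defaultdict(list)
    for work_type, start, end in items:
        result[work_type].append((end - start).days)
    return dict(result)

lead_times = lead_times_by_type(ITEMS)
print(lead_times)
```

Keeping one list per type of work means defects and epics never end up in the same distribution.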
Whatever you do, for heaven’s sake, don’t use the average lead time. Instead, use a percentile. I use the 80th percentile. For illustration, the 25th percentile is the value below which 25% of the observations fall. The 80th percentile of your lead time observations (measures) is the value below which 80% of your historical lead times fall. The 50th percentile is the median; some calculation methods give a slightly different number, but for the sake of this article the difference doesn’t matter.
The average, however, could be materially off. It’s worse to use the average than the median because outliers throw off the average more than they do the median or the 80th percentile. If you are good enough with statistics to correctly identify and remove outliers, that’s great, but few people do that at all, much less with statistical precision, and it’s really not necessary for most IT work.
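To make the outlier effect concrete, here is a small sketch using Python's standard `statistics` module. The lead times are invented, and the interpolating percentile method used here is just one of several reasonable choices, not necessarily the one your tooling uses.

```python
import statistics

def summarize(lead_times):
    """Return (mean, median, 80th percentile) for lead times in days."""
    # quantiles(n=5) returns the 20th, 40th, 60th and 80th percentile cut points;
    # method="inclusive" interpolates between neighboring observations.
    p80 = statistics.quantiles(lead_times, n=5, method="inclusive")[3]
    return statistics.mean(lead_times), statistics.median(lead_times), p80

typical = [3, 4, 5, 5, 6, 7, 8, 9, 10]   # made-up lead times, in days
with_outlier = typical + [90]            # the same data plus one item that got stuck

mean_a, median_a, p80_a = summarize(typical)
mean_b, median_b, p80_b = summarize(with_outlier)

print(f"typical:      mean={mean_a:.1f} median={median_a} p80={p80_a:.1f}")
print(f"with outlier: mean={mean_b:.1f} median={median_b} p80={p80_b:.1f}")
```

With these numbers, one stuck item moves the mean from about 6.3 days to 14.7, while the median moves from 6 to 6.5 and the 80th percentile from 8.4 to 9.2.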
The only thing I’d use the median for is to monitor the trend, to see if it is improving. But using it for forecasting or expectation setting would be bad. You see, when using the median, 50% of observations took less time, but 50% took longer. You wouldn’t want to tell customers that they have a 50/50 chance of getting a fix in two weeks. Telling them they have an 80% probability of getting a fix in three weeks, in my experience, is more palatable. You want to be able to tell your customers (or marketing, management, support org or program management) that “80% of the time we resolve this kind of issue in n weeks.” Most people are happy with those odds. Anything higher takes in so much of the “long tail” of the distribution that it makes forecasts not terribly useful for planning.
How to Get the Graph
Use the lead time graph in your existing tooling if you can; get the data into a data warehouse and use a BI tool if you must; or use Excel only as a proof of concept or if your organization is small. More on this below. Compute your 80th percentile lead time expectation every sprint or month and graph it in a bar chart so you have a new bar every period.
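If you do end up rolling your own, the per-period computation might look like the sketch below. It assumes each bar uses only the items completed in that calendar month; recomputing over a trailing window each period is an equally plausible reading. The observations are invented, and the text "bars" only stand in for a real chart.

```python
import statistics
from collections import defaultdict
from datetime import date

# Hypothetical (completion date, lead time in days) observations.
OBSERVATIONS = [
    (date(2024, 1, 10), 4), (date(2024, 1, 17), 6),
    (date(2024, 1, 24), 9), (date(2024, 1, 30), 12),
    (date(2024, 2, 6), 3), (date(2024, 2, 13), 5),
    (date(2024, 2, 20), 7), (date(2024, 2, 27), 8),
]

def p80_by_month(observations):
    """Return {(year, month): 80th percentile lead time}, keyed by completion month."""
    buckets = defaultdict(list)
    for finished, days in observations:
        buckets[(finished.year, finished.month)].append(days)
    return {
        # statistics.quantiles() needs at least two observations per period.
        period: statistics.quantiles(values, n=5, method="inclusive")[3]
        for period, values in sorted(buckets.items())
    }

bars = p80_by_month(OBSERVATIONS)
for (year, month), p80 in bars.items():
    print(f"{year}-{month:02d}  {'#' * round(p80)}  {p80:.1f} days")
```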
How to Get the Data
It’s easy to find the 80th percentile in your dataset. If your tooling can give you your choice of percentiles for lead time, that’s fantastic. If not, you’ll need to gather all of your lead time observations. You can do it in Excel, but then you’ll be forever updating Excel with fresh data or reimporting the data. That’s workable if you have a small number of teams and observations to input each sprint. Otherwise, you’ll want to get it into a database and make a BI dashboard.
How Much Data to Get
You don’t need a whole lot of data. “Data beyond 24 weeks is most likely out of date,” says agile data and probabilistic forecasting guru, Troy Magennis.
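A minimal sketch of limiting the data to that window, assuming observations are stored as (completion date, lead time) pairs. The function name and sample history are hypothetical.

```python
from datetime import date, timedelta

def recent(observations, as_of, weeks=24):
    """Keep (completion date, lead time) pairs finished within the last `weeks` weeks."""
    cutoff = as_of - timedelta(weeks=weeks)
    return [(finished, days) for finished, days in observations
            if cutoff <= finished <= as_of]

# Hypothetical history: (completion date, lead time in days).
history = [
    (date(2023, 6, 1), 30),   # finished more than 24 weeks before as_of, so dropped
    (date(2024, 1, 15), 8),
    (date(2024, 3, 1), 5),
]
window = recent(history, as_of=date(2024, 3, 31))
print(window)
```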
How to Compute It
In Excel, use the PERCENTILE.INC function. To compute it by hand, order your lead time data from smallest to largest, multiply your number of observations by 0.8, round up to an integer, then take the observation at that position. Technically, you should average the 2 observations at that point if you didn’t have to round, but I’ve not found that materially significant in any of my data for these purposes so I don’t do it.
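The by-hand procedure (sort, multiply the count by 0.8, round up, pick that observation) can be sketched as follows. The function name and sample data are illustrative. Note that Excel's PERCENTILE.INC interpolates between observations, so its answer can differ slightly from this simpler method.

```python
import math

def percentile_nearest_rank(observations, percent=80):
    """Value at 1-based position ceil(percent/100 * n) in the sorted data."""
    if not observations:
        raise ValueError("need at least one observation")
    ordered = sorted(observations)
    rank = math.ceil(percent * len(ordered) / 100)   # e.g. 80 * 12 / 100 = 9.6 -> 10
    return ordered[max(rank, 1) - 1]                 # convert the 1-based rank to an index

# Twelve made-up lead times, in days.
lead_times = [2, 3, 3, 4, 5, 6, 8, 9, 11, 14, 15, 21]
p80 = percentile_nearest_rank(lead_times)
print(f"80% of the time we finish within {p80} days")
```

Like the procedure in the text, this skips the averaging step when no rounding was needed.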
What to Do with It
I publish my “80% lead time expectation.” I talk about it with the people who are anxiously waiting for the delivery. I talk about it with my engineering team. I talk about it with my management team, PMO, project managers and program managers. I talk about it with my lean and agile coaches and consultants and Scrum Masters. I talk about it with my team leads. I want everyone in the loop and on board with the improvement goal. I use it to explain how certain behaviors, such as expedites and high work-in-progress (WIP), work against improving the lead time expectation.
Look at your bar chart showing the changes in your lead time expectation over time. See if it’s moving in the right direction. Use A3s, Toyota Kata, lean principles, and systems thinking to improve the system. Engage your upstream and downstream neighbors in the improvement process and in making process policies explicit.
What If the Data Isn’t in One Tool?
Change your process before you change your tooling, and don’t tool up a bad process. A common problem is when an organization doesn’t have the beginning and ending lifecycle states of a given class of work in a single tool. (I didn’t say value stream here because you don’t need the whole value stream mapped for every class of work in order to get lead time.) You can combine data from multiple tools in order to get the end-to-end, but the simplest thing that could possibly work is to have your top-level item reflect or approximate that end-to-end flow. Before deciding you need a new tool, there are a couple of easier approaches to try first:
- Add the starting or ending date to the items of interest, whichever of those is missing. Sometimes I see those planning an initiative only track the item until it’s approved and released to engineering. It may be easier to add another date to that item and track it to delivery to production than to change tooling.
- Add more steps to an existing kanban. Likewise, it may be easier to add additional columns to a kanban and keep the items on the board until they are in production. (Or get them on the board earlier in their lifecycle.)
- Add another kanban. It may be that you don’t have a kanban. Your initiatives or your production requests aren’t tracked well. Get them in a kanban. This is not the same as changing tooling. You probably already have a tool that can give you a kanban. Use it. Every organization should have an epic kanban.
You can use lead time as a lagging indicator of your process improvement efforts without any of these prerequisites. However, if you are using lead time to set completion date expectations, you should attend to these matters first: stable teams, predictable teams (stable throughput), a stable system (few process changes) and low WIP.
If you are trying to make predictions, being predictable is helpful. A leading indicator for predictability is velocity variance or throughput variance. I compute that as the standard deviation divided by the average (the coefficient of variation), and I want it to be 0.2 or lower. In my experience with many organizations’ data, 0.2 is the point at or below which the numbers become more reliable.
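Here is a small sketch of that calculation. It assumes the sample standard deviation (`statistics.stdev`), since the text doesn't say whether sample or population is meant. The throughput numbers are invented.

```python
import statistics

THRESHOLD = 0.2  # the 0.2 rule of thumb from the text

def throughput_variance(throughputs):
    """Standard deviation divided by the average (coefficient of variation)."""
    # Assumption: sample standard deviation; statistics.pstdev() would be the
    # population version.
    return statistics.stdev(throughputs) / statistics.mean(throughputs)

# Hypothetical items completed per sprint for two teams.
steady = [10, 12, 9, 11, 10, 8]
erratic = [4, 15, 7, 12, 2, 20]

cv_steady = throughput_variance(steady)
cv_erratic = throughput_variance(erratic)

for name, cv in (("steady", cv_steady), ("erratic", cv_erratic)):
    verdict = "predictable enough" if cv <= THRESHOLD else "too variable to forecast from"
    print(f"{name}: {cv:.2f} ({verdict})")
```

Both teams average 10 items per sprint, but only the first comes in under the threshold.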
You need to have stable teams. Your organization will not be predictable if your team makeup is changing all the time, or if you are moving people around between projects, or if you are allocating people to multiple projects.
You need to control your WIP at every level. Put all of these on your metrics dashboard:
- The number of items being worked on per individual. This should be one per individual, or fewer. It’s usually easier to gather this data at a team level. At a team level, the number of items being worked on should be less than the number of individuals. Encourage working together. If you are pair-programming most of the time, your WIP should be less than the number of pairs. With TDD, good test coverage, and good Continuous Integration practices, you should be able to get multiple pairs on one user story.
- The number of epics or initiatives each team is working on. On your metrics dashboard, list the teams with more than 1 epic in progress. But remember, it’s not the team’s fault. Fix the system. Don’t blame the team.
- The number of open sprints. This should be 1 per team. On your dashboard, list the teams with more than 1 open sprint.
- The number of epics or initiatives the organization is working on. Graph this measure each sprint on a bar chart so you can see the trend. Work to improve (lower) the trend.
- The number of releases being maintained (fixed, patched, level 3 support). Graph the trend and work to lower it.
- The number of releases being supported (help desk, service desk, level 1 and 2 support). Graph the trend and work to lower it.
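The team-level checks in the list above could feed a dashboard with something like the sketch below. The `TeamSnapshot` fields, team names and numbers are hypothetical, and the organization-level trend measures are left out.

```python
from dataclasses import dataclass

@dataclass
class TeamSnapshot:
    name: str
    members: int
    items_in_progress: int
    epics_in_progress: int
    open_sprints: int

def wip_flags(team):
    """Return the team-level WIP guidelines this team currently exceeds."""
    flags = []
    # Team-level rule from the text: items in progress should be less than headcount.
    if team.items_in_progress >= team.members:
        flags.append("items in progress not below headcount")
    if team.epics_in_progress > 1:
        flags.append("more than 1 epic in progress")
    if team.open_sprints > 1:
        flags.append("more than 1 open sprint")
    return flags

# Hypothetical snapshot of two teams.
TEAMS = [
    TeamSnapshot("Alpha", members=6, items_in_progress=4, epics_in_progress=1, open_sprints=1),
    TeamSnapshot("Bravo", members=5, items_in_progress=7, epics_in_progress=3, open_sprints=2),
]

report = {team.name: wip_flags(team) for team in TEAMS if wip_flags(team)}
for name, flags in report.items():
    print(f"{name}: {'; '.join(flags)}")
```

A list like this is a prompt to fix the system that overloads the team, not a scorecard for the team itself.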
Lead time, if tracked and used correctly, is a great predictor for how quickly work will be completed, allowing you to more accurately forecast delivery dates. You should measure it, talk about it, and review it often to see what improvements have been made, and can be made. And remember:
- Measure lead time separately for the few types of work where it really matters.
- Use percentiles rather than averages for more accurate forecasting and expectation setting.
- Adjust your process before you change your tools.
References
- Mike Rother, Toyota Kata
- Mary and Tom Poppendieck, The Lean Mindset and Implementing Lean Software Development
- Donella Meadows, Thinking in Systems
- David J. Anderson, Kanban: Successful Evolutionary Change for Your Technology Business
- Donald G. Reinertsen, The Principles of Product Development Flow
- Sam L. Savage, The Flaw of Averages
- Douglas W. Hubbard, The Failure of Risk Management
- Douglas W. Hubbard, How to Measure Anything
- Troy Magennis, Throughput Forecaster, Focused Objective
Other references not directly referred to: Eliyahu M. Goldratt, Theory of Constraints
Andrew Fuqua is the ConnectALL SVP of Products. He joined ConnectALL after a long-standing career as an Enterprise Transformation Consultant. Andrew has more than 30 years of varied experience and has held positions in consulting, management, product management, and development. Andrew is an active contributor to the Agile community, an established speaker, influencer and a published author.