Cycle Time Revisited

Cycle time has been a hot topic for me lately. A debate about it is going on on kanbandev.

Meanwhile, someone at a client site asked my advice on continuous delivery (of software) and whether “the cycle time” is a useful metric for it, based on an article they had found somewhere on the Internet and read over the weekend. The article was written purely at the practice level, without context; the author assumed that whatever cycle time meant to them was “the cycle time” and didn’t bother with any definitions. That was a lot to sort out.

There is no such thing as “the cycle time”

It is an overloaded term and its uses should always be qualified.

In the manufacturing domain, where there is a stable definition of cycle time and the term can be used without qualifiers, it means something very different from how the author used it. It means the (average) interval between successive deliveries. For example, if the cycle time of a car assembly line is 45 seconds, that means a new car rolls off the line every 45 seconds on average (actually, at nearly perfect 45-second intervals, because there is very little variability in the manufacturing process). A total of 60 * 60 / 45 = 80 cars are produced every hour. (Note that the lead time to manufacture a car is significantly longer than 45 seconds.)

If we’re to adopt this definition in the software development domain, cycle time means the reciprocal of the deployment frequency. For example, if a team demos their user stories every two weeks, but actually ships only after multiple increments are integrated into a six-month release plan, then their cycle time is 6 months. If they ship at the end of every sprint, their cycle time is 2 weeks. If they deploy 50 times a day on average, their average cycle time is approximately 29 minutes. The average cycle time of the software delivery process at Amazon is reportedly 11 seconds.
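
Since this definition makes cycle time simply the reciprocal of the deployment frequency, the arithmetic is easy to check. A minimal Python sketch (the function name is mine, purely for illustration):

    # Average cycle time as the reciprocal of deployment frequency.
    def average_cycle_time_minutes(deployments_per_day):
        return 24 * 60 / deployments_per_day

    print(average_cycle_time_minutes(50))  # 28.8, i.e. approximately 29 minutes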

The Software G-Forces

Kent Beck, one of the eXtreme Programming pioneers, proposed a model called the Software G-Forces. He showed the scale of deployment frequencies (cycle times), from yearly to quarterly to monthly to weekly to daily to hourly or less, and how the distribution of where software companies sit on this scale has changed over time. There are no best practices for delivery: whatever practices you’ve got are the right practices for delivering at the frequency you’ve got.

Speaking in terms of delivery frequency also avoids the terminological overload of “cycle time.”

Cycle time and continuous delivery

Beck further contends that if you want to double or triple your frequency, doing so will likely break your existing process. You have to figure out which new practices to add to your process and which existing practices to part with. The addition and subtraction of practices will always be contextual. For example:

  1. Delivering once in 6 months may involve little test automation, but if you want to deliver every 2-3 months, you will probably need to invest in test automation, because a full manual regression test may take too long.
  2. Going from once every few months to once every few weeks is likely to require automated provisioning of various testing, staging, certification and production environments. “Snowflake” servers may stand in the way.
  3. Delivering once per two-week sprint or more frequently generally requires a properly designed, unit-testable code base with a good amount of unit test coverage. Are all developers fluent in the SOLID principles? There will be no trivial bugs for testers to catch – are they trained in exploratory testing?
  4. Delivering more frequently than once a week tends to break processes based on time-boxed frameworks.
  5. To deliver more than once a day, you may need to remove the bottlenecks in your deployment pipeline.
  6. And so on – you can easily turn this into a very long checklist of practices.

What about local cycle times?

Other uses of “cycle time” are possible in the software domain, but they all mean the time through some local activity (or a continuous sequence of activities), and they all need to be qualified with where they start and end (something the author neglected to do). As it turned out upon careful examination of the article still influencing my client, its author didn’t mean any of the above definitions. He meant the time from code commit to seeing it in production – the local cycle time of the deployment pipeline. (In Kanban method terminology, his clock started at the second commitment point.)

By optimizing the local cycle time of the deployment pipeline, the author was effectively solving Problem #5 from the list above. This immediately raised the question of whether Problem #5 was contextually relevant to my client. (The answer, without giving anything away: for some teams, yes; for others, no.)

Conclusion

Back to the original question: is “the cycle time” a useful metric to inform progress towards continuous delivery? Generally speaking, no, unless we can all agree on its definition, which is clearly not happening.

Meanwhile, using the delivery frequency can be much more productive, provided you avoid using it to rank your teams. Team A delivers once a month; Team B, once a day. Is Team B better than Team A? No. Is Team B working on what it will take to deliver three times a day, or are they just cranking out user stories? If the latter, that’s not so good. Is Team A working on what it will take to deliver every two weeks? If yes, great!

If you’re still looking for a best practice in this memo, here it is. It’s OK to deliver only twice a year out of a legacy codebase with mostly manual testing. It’s not OK to say that when we rewrite this codebase in a few years, it will be more conducive to deployments at the end of each sprint. It is also not OK to say that we’re deploying every day and are far ahead of those dinosaurs with their twice-a-year deployments. Instead, it may be preferable for each team to have a goal of deploying 2-3 times more frequently than they do now. Leaders can communicate and set such expectations with their teams.


Introducing the STATIK Canvas

I’d like to introduce a simple new tool: the STATIK canvas. STATIK is an acronym for the Systems Thinking Approach to Introducing Kanban in Your Technology Business.

Kanban (in the knowledge work context) is an evolutionary improvement method. It uses virtual Kanban systems as a way to catalyze improvement and adaptive capabilities. A Kanban system is introduced into an environment that comprises the service delivery team and its customers and partners. This is a critical moment. Systems thinking is key!

It is not the goal of this post to explain STATIK; it is rather to introduce the canvas and let people download and try it. Therefore I’ll skip that explanation and encourage readers to explore the key STATIK resources:

Download

OK, so here is the file to download.

Instructions

The proposed STATIK canvas is roughly the size of a sheet of A3 paper. It is intended to be filled out in pencil and to capture only the most important stuff. The following are instructions by section.

1. Context for Change

This section captures the internal and external (from the customer’s perspective) sources of dissatisfaction and variability. Stories collected in this section often contain words that reveal work item types, hidden risk information, odd patterns of demand, unmet expectations (used in Section 2 – Demand Analysis), implicit classes of service (Section 3), and external and specialist dependencies (useful when sketching the board in Section 5).

2. Demand Analysis

This section contains the demand analysis template introduced by Dave White. The following information is collected for each work item type:

  1. Source – where the requests to deliver this type of work item arrive from
  2. Destination – where the results of the work are delivered to
  3. Arrival rate – this must be a number of requests per unit of time. (“We have 300 items in the backlog” is not good enough. If you get this answer, ask where the items come from and how.) See the sketch after this list.
  4. Nature of demand – note any patterns.
  5. Customer’s delivery expectations, even if unreasonable.
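
If the arrival rate isn’t tracked anywhere, one way to derive it is from the creation dates of past requests. A minimal Python sketch, with made-up dates for illustration:

    from datetime import date

    # Hypothetical creation dates of incoming requests for one work item type
    arrivals = [date(2015, 3, 2), date(2015, 3, 4), date(2015, 3, 4),
                date(2015, 3, 9), date(2015, 3, 13)]

    observed_weeks = (max(arrivals) - min(arrivals)).days / 7
    print(len(arrivals) / observed_weeks, "requests per week")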

3. Classes of Service

For each work item type, specify which class(es) of service are provided, along with their policies and delivery expectations.

4. Input and Output Cadences

Specify them for every work item type.

5. Kanban Board Visualization Design

This section is intended to be a simple sketch that helps the delivery team, manager and coach figure out the major outlines of the visual board. These may include swim lanes, a two-tier structure, the use of colour, etc. There should be no need to make this section a miniature replica of the actual board.

Learning

Here are a number of things I hope to learn by trying out this canvas. I expect its design to improve as a result.

  • Whether the canvas is helpful to capture the thinking process of introducing (or updating the design of) a Kanban system
  • Whether it is helpful to hang the canvas near the Kanban board to help remember why certain visual Kanban board elements are the way they are
  • The relative proportions of the sections
  • Level of detail or instructions needed in each section
  • Whether the “Roll Out” section belongs in the canvas
  • Any surprises, things I don’t expect to learn

The Best of 2014

I’m starting the new year with a short summary of what was the best on this blog in the past year.

I wrote two extended series of posts on two different topics. Each topic deserved more than the 300 to 1,000 words that fit into one typical blog post, so I wrote several posts, varying in depth, focus and appropriateness for different audiences.

  • The first series was about my knowledge-centric approach to visualizing processes in creative, intellectual fields of work. Look not for process steps and hand-offs of work, but for various ways people collaborate to create knowledge.
  • The second series was about Lead Time. I tried to cover several new insights into the probability distributions of delivery times in intellectual work and how we could use them practically.
    • Inside a Lead Time Distribution is the key post in the series. It goes over the key points on a typical distribution curve and shows how we can use them in practice. Interestingly, the ways we use the data differ significantly from the left side of the curve to the right. So I chose to paint those various points in rainbow colours so that the rich picture they reveal doesn’t look like a boring bar chart.
    • Two practical (and technical) posts remained popular during the year. How to Match to Weibull Distribution in Excel (using only spreadsheet software) first appeared in 2013. Last year, I updated the formulas in the attached spreadsheet to automate a few operations and make it even easier to use. Also last year, I added a twin post, How to Match to Weibull Distribution Without Excel. You can still do the math (the old way), but if you are willing to give up some precision, the technique becomes very simple – you can visually match against several known patterns.
    • The remaining posts in the series should come up if you search the site for “lead time.”

The knowledge-discovery process stuff is fairly complete and stable at this time. The lead time, probabilistic approach and forecasting topics will likely see new developments in 2015. If you’d like to learn more about how to apply this knowledge and practical experience in your company, please feel free to connect with me by email or Skype so that we can discuss it.

Besides the two post series, my popular post from the past, The Elusive 20% Time, was turned into a contribution to the new book More Agile Testing by Lisa Crispin and Janet Gregory. (The contribution was in the area of organizational practices that help achieve greater software quality.) I posted the condensed, cleaned-up, copyedited version of this article on this blog under the title The Still Elusive 20% Time. (There is a pending comment on it as I write this, which deserves a reply and probably a new blog post – stay tuned.)

Of all the other posts, the best one in terms of 2014 page views turned out to be this one, written in 2013: Scrum, Kanban and Unplanned Work. It contains one of my trademark phrases: switching from Scrum to Kanban, missing the most of both. It also rebuts some of the Kanban misconceptions I continued to hear from practitioners and their under-informed coaches throughout the year.

A quick review of my posts shows that some of them are due for an update, while others seem like old baggage and have lost their relevance. I will fix this in 2015 if time allows.


#BeyondVSM: Understanding and Mapping Your Process of Knowledge Discovery

This short post will serve as the “table of contents” for a series of six posts I wrote this year about mapping processes in creative industries by knowledge creation and information discovery, not by handoffs of work between people and teams.

  • Understanding Your Process as Collaborative Knowledge Discovery: the first post in the series explores the problem and the new ways of looking at it
  • Examples of using this approach to map processes of knowledge discovery in two different industries
  • Mapping Your Process as Collaborative Knowledge Discovery. I wrote about how to actually create such process maps with the real people who do the work, why to do so, and how to use these maps in Kanban system design. This post turned into three, each covering a different layer:
    • Recipes: how to actually do it and what not to do, without much explanation of why. Sorry, the post was already long enough. Of course, recipes are not enough, but that’s what the next two posts are for!
    • Observations: what actually happened as I tried this new approach. The experience informed various tips on how to do it.
    • Thinking

Lead Time and Iterative Software Development

I have introduced my forecasting cards and written about lead time distributions in my recent blog post series. Now I’d like to turn to how these concepts apply in iterative software development, particularly in the popular process framework Scrum.

Let’s consider one of the reference distribution shapes (Weibull k=1.5), which often occurs in product development, particularly software. I went through various points on this curve and replaced them with what they ought to mean in this specific context.

[Figure: lead time distributions and the timebox. The chart shows the mode, the median, the average, and the 75th percentile relative to the sprint duration.]

Scrum teams often complain that their user stories are not finished in the same sprint in which they were started. I have often observed in such situations that their stories are simply too large.

Even when typical stories were smaller than the duration of the sprint – say, 7-8 days in a 10-business-day, two-week sprint – that was not small enough. The teams, Scrum masters and product owners held, perhaps subconsciously, the notion that we can “keep the average and squeeze the variance”, that is, keep the 7-8-day average but limit variability – estimate, plan and task better – so that the right side of the distribution fits within the timebox. Recent lead time distribution research, examining many data sets from different companies (including those using iterative Agile methods), refutes this notion. One of the key properties of common lead time distributions is that the average and the standard deviation are not independent.

Another suggestion – keeping the average story to half the sprint duration, so that the ends of the bell curve give us zero in the best case and the sprint duration in the worst case – is another illusion. Lead time distributions are asymmetric!
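
Both points are easy to verify numerically: for a Weibull distribution, the ratio of the standard deviation to the average depends on the shape parameter alone, so the variance cannot be squeezed while keeping the average without changing the shape of the distribution itself, and the median sits well to the left of the average. A quick sketch using scipy (reference values for k=1.5, not data from any particular team):

    from scipy.stats import weibull_min

    k = 1.5                       # shape often seen in software development
    d = weibull_min(k)            # scale = 1; we only care about ratios
    print(d.std() / d.mean())     # ~0.68: the sd is tied to the average via k
    print(d.median() / d.mean())  # ~0.87: median < average, the curve is skewed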

[Figure: left-shifting diagram. As the lead time distribution curve shifts to the left, very few data points fail to fit into the timebox.]

The real strategy is to left-shift the whole distribution curve.

This Kanban-sourced knowledge led to many quick wins as the Scrum teams, Scrum masters and product owners I coached gave themselves the goal of systematically making their stories smaller. They simply asked: what can we do to double the count of delivered stories in the next few sprints, while covering roughly the same workload in each sprint? After the doubling, they would ask the same question again, until the stories were small enough.

How Small?

How small do user stories need to be? We can turn to our forecasting cards, which give control-limit-to-average ratios between 3.9 and 4.9 for the two most common distribution shapes (k=1.25 and k=1.5). In the extreme case, we have to assume the exponential distribution (I have observed quasi-exponential distributions in some cases in incremental software development), which gives us a ratio of 6.6. An average-lead-time-to-sprint-duration ratio in the range of 1:4 to 1:6 can be used as a guideline.

To make this rule of thumb a bit more practical, let’s take into account three practical considerations: (1) the lead time is likely to be measured in whole days, (2) the number of business days in a sprint is likely to be a multiple of five, and (3) the median (half longer, half shorter) is easier to use in feedback loops than the average.

The control-limit-to-median ratios for the same distribution shapes are (consulting the forecasting cards again) 4.5 to 6.1, and in the extreme case 9.5. Therefore, half of the stories done within one-fifth of the sprint duration can be used as a guideline. In the extreme cases, we may need one-tenth instead of one-fifth.
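
For anyone who wants to check these ratios without the printed cards, here is a short scipy sketch that recomputes them (99.865% being the upper control limit); the output matches the quoted ratios up to rounding:

    from scipy.stats import weibull_min

    for k in (1.0, 1.25, 1.5):   # k = 1.0 is the exponential case
        d = weibull_min(k)
        ucl = d.ppf(0.99865)     # the upper control limit
        print(k, round(ucl / d.mean(), 1), round(ucl / d.median(), 1))
    # prints 6.6/9.5 for k=1.0, 4.9/6.1 for k=1.25, 3.9/4.5 for k=1.5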

None of this is news to experienced Scrum practitioners, particularly those with eXtreme Programming backgrounds. The XP tribe has long appreciated the value of small stories, and has invented and evangelized techniques, such as Product Sashimi, to make them smaller.


Introducing Lead Time Forecasting Cards

I’m introducing a simple tool: lead time forecasting cards.

[Figure: the set of six (so far) forecasting cards.]

Each card displays a pre-calculated distribution shape, using the Weibull distribution with shape parameters 0.75, 1 (the exponential distribution), 1.25, 1.5, 2 (the Rayleigh distribution), and 3. (Since printing the first batch, I have realized I need to include k=0.5 in the collection.)

For each distribution, the following points are marked with rainbow colours:

  • mode
  • median
  • average
  • percentiles (63rd, 75th, 80th, 85th, 90th, 95th, 98th, and 99th)
  • the upper control limit (99.865%)

The scale of each card is such that the average lead time is 1. Your average is different, so multiply it by the numbers given in the table on each card.
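
For readers without a printed card, the marked points are straightforward to recompute. A minimal scipy sketch for one shape parameter, rescaled so that the average is 1, as on the cards:

    from scipy.stats import weibull_min

    k = 1.5                            # one card's shape parameter
    scale = 1 / weibull_min(k).mean()  # rescale so that the average is 1
    d = weibull_min(k, scale=scale)

    print("mode   ", scale * ((k - 1) / k) ** (1 / k) if k > 1 else 0.0)
    print("median ", d.median())
    print("average", d.mean())         # 1.0 by construction
    for p in (0.63, 0.75, 0.80, 0.85, 0.90, 0.95, 0.98, 0.99, 0.99865):
        print(p, round(d.ppf(p), 2))   # percentiles and the control limit

Multiply each printed number by your own average lead time, as described above.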

I will be bringing a small number of printed cards to upcoming conferences, training classes and consulting clients. The goal is, of course, to get feedback, refine the cards, and then make them more widely available.


Lead Time Distributions and Antifragility

This post continues the series about lead-time distributions and deals with the risks involved in matching real-world lead-time data sets to known distributions and estimating the distribution parameters.

[Figure: convex option payoff curve. Losses are limited on the left; gains are unlimited on the right.]

One of the key ideas of Nassim Nicholas Taleb’s book Antifragile is the notion of convexity, demonstrated by this option payoff curve. The horizontal axis shows the range of outcomes; the vertical axis shows the corresponding payoff. With this particular option, the payoff is asymmetric. Our losses in case of the negative outcomes on the left are limited to a small amount, but our gains on the right side (positive outcomes) are unlimited. Note that this is due to the payoff function’s convexity. If the function were a straight increasing line, both our losses and our gains would be unlimited.

A concave payoff function would achieve the opposite effect: limited gains and unlimited losses.

An antifragile system exposes itself to opportunities where benefits are convex and harm is concave and avoids exposure to the opposite.

In the book’s Appendix, Taleb considers a model that relies on the Gaussian (normal) distribution. Suppose the Gaussian bell curve is centered on 0 and the standard deviation (sigma) is 1.5. What is the probability of the rare event that the random variable exceeds 6? It’s a number, and a pretty small one, which anyone with a scientific calculator can compute.

[Figure: Gaussian distribution analysis. The probability of a rare event as a function of sigma is a convex function.]

Right? Wrong. We don’t really know that sigma is 1.5. We simply calculated it from a set of numbers collected by observing some phenomenon. The real sigma may be a little more or a little less. How does that change the probability of our rare event? There is a chart in the Appendix, but I rechecked the calculations, and here it is – a (very) convex function.

If we overestimate sigma a little bit (it’s really less than what we think it is – the left side of the chart), we overestimate the probability of our rare event a little bit. But if we underestimate sigma a little bit, we underestimate the probability of our rare event – a lot.
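
Here is that recheck in a few lines of Python (a sketch of the same calculation, not Taleb’s code). Equal steps in sigma produce wildly unequal steps in the tail probability, which is the convexity in question:

    from scipy.stats import norm

    # P(X > 6) for a Gaussian centered on 0, as sigma varies around 1.5;
    # sf() is the survival function, i.e. the tail probability
    for sigma in (1.3, 1.4, 1.5, 1.6, 1.7):
        print(sigma, f"{norm(0, sigma).sf(6):.1e}")
    # grows from ~2.0e-06 at sigma=1.3 to ~2.1e-04 at sigma=1.7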

Convexity Effects in Lead Time Distributions

[Figure: Weibull distribution analysis. The probability of exceeding the SLA as a function of the shape parameter.]

[Figure: Weibull distribution analysis. The probability of exceeding the SLA as a function of the scale parameter (shape parameter k=3).]

Let’s apply this convexity thinking to the lead time distributions of service delivery in knowledge work. Weibull distributions with various parameters are often found in this domain. Let’s say we have a shape parameter k=1.5 and a service delivery expectation: 95% of deliveries within 30 days. If we are spot-on with our model, the probability of failing this expectation is exactly 5%. How sensitive is this probability to the shape and scale parameters?

With respect to the shape parameter, the probabilities of exceeding the SLA are all convex decreasing functions (I added SLAs based on the 98th and 99th percentiles to the chart). If we underestimate the shape parameter a bit, we overstate the risk a bit; if we overestimate it a bit, we understate the risk – a lot.

For the other distribution shape types (k<1, k>2), it is the same story: the risk of underestimating versus overestimating the shape parameter is asymmetric.

What about the scale parameter? It turns out there is less sensitivity with respect to it. The convexity effect (it pays to overestimate the scale parameter) is present for k>2 (as shown by the chart), is weaker for 1<k<2, and the curves are essentially linear for k<1.
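
The shape-parameter sensitivity is easy to reproduce with the example expectation above (95% of deliveries within 30 days, baseline k=1.5). A sketch that calibrates the scale so the baseline model meets the expectation exactly, then varies the shape parameter while holding the scale fixed:

    from scipy.stats import weibull_min

    SLA_DAYS, K0 = 30, 1.5
    scale = SLA_DAYS / weibull_min(K0).ppf(0.95)  # calibrate: P(T > 30) = 5%

    # vary the shape parameter around 1.5, holding the scale fixed
    for k in (1.3, 1.4, 1.5, 1.6, 1.7):
        print(k, f"{weibull_min(k, scale=scale).sf(SLA_DAYS):.1%}")
    # roughly 7.5%, 6.2%, 5.0%, 4.0%, 3.1% -- a convex decreasing curve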

Conclusions

When analyzing the lead time distributions of service delivery in creative industries, it is important not to overestimate the shape parameter. The understatement or overstatement of risk due to an error in the shape parameter is asymmetrical.

Matching a given lead-time data set to a distribution doesn’t have to be a complicated mathematical exercise. Nor should we fool ourselves about the precision of this exercise, especially given our imperfect real-world data sets. Using several pre-calculated reference shapes should be sufficient for practical uses such as defining service level expectations, designing feedback loops and statistical process control. If our lead-time data set falls somewhere between two reference shapes, we should choose the smaller shape parameter.
