The 20% time has become popular in the software industry in recent years. Even though most programmers don’t work at companies that have 20% time, most have heard of, or know someone who works at, a place like Google, where programmers spend 80% of their hours working on what the company requires them to do and 20% on their own projects. Or so we have been told.
A shop across town is doing it and now we want to do it too. Many programmers have tried to introduce 20% time in their workplaces and found it very difficult. So, how can we do it? What are the dos and don’ts? Is there some theory behind this practice? I want to summarize the answers to these questions in this post and hope programmers find it useful.
The main reason for 20% time is to keep capacity utilization at 80% rather than at 100%.
You can think of a software development organization as a system that turns feature requests into developed features, and you can model its behaviour using queueing theory. Chapter 3 of Donald Reinertsen’s 2009 book, The Principles of Product Development Flow, explains thoroughly how the responsiveness of a software organization depends on its utilization. The same logic can also be found in the popular 2006 book by Mary and Tom Poppendieck, Implementing Lean Software Development: From Concept to Cash, which has an example of how Google achieves greater effectiveness by avoiding 100% utilization. I recall discussing that book with a colleague after reading it: Google’s effectiveness could also be due to the fact that all the Googlers we both knew seemed to spend all their waking hours inside Google. We were not 100% sure about the utilization argument. But I read Reinertsen’s book later and it became abundantly clear.
So, programmers thinking of establishing 20% time need to understand the theory behind it.
If requests arrive faster than the system can service them, they queue up. When arrivals are slower, the queue size decreases. Because the arrival and service processes are random, the queue size changes randomly with time. The mathematically inclined can ask about this randomness: there must be some probability distribution, so what will the queue size be on average? Math (queueing theory) has an answer to that: if both arrival and service processes are Markov (the standard single-server M/M/1 model), then the average number of requests waiting in the queue is:
$$L_q = \frac{\rho^2}{1 - \rho}$$
where the Greek letter rho ($\rho$) is the utilization coefficient, equal to the ratio of the arrival rate to the service rate. If the processes are non-Markov, the math is more complicated, but the conclusions don’t change.
If you plot this function, you can see that the average queue length remains low while utilization is up to 0.8, then rises sharply and goes to infinity. You can understand this intuitively by thinking about your computer’s CPU: when its utilization approaches 100%, the computer becomes unresponsive.
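To see the shape of that curve without plotting it, here is a minimal sketch that tabulates the average queue length at a few utilization levels. It assumes the textbook single-server M/M/1 result for the mean number of requests waiting, rho squared over one minus rho; the function name `mean_queue_length` is made up for this illustration.

```python
def mean_queue_length(rho: float) -> float:
    """Mean number of requests waiting in an M/M/1 queue at utilization rho.

    Assumption: Markov arrivals and service, a single server, and rho < 1.
    """
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return rho ** 2 / (1 - rho)


# Tabulate the queue length as utilization climbs towards 100%.
for rho in (0.5, 0.7, 0.8, 0.9, 0.95, 0.99):
    waiting = mean_queue_length(rho)
    print(f"utilization {rho:.0%}: about {waiting:.1f} requests waiting")
```

Under these assumptions, the queue holds about 3 requests on average at 80% utilization, about 8 at 90%, about 18 at 95%, and about 98 at 99%.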
The economics of software development is such that software companies incur big costs when their systems are in high-queue states. These costs include missed market opportunities, obsolete products, late projects, and waste caused by building features in anticipation of demand. The 20% time is thus the scientific answer to the problem of optimizing economic outcomes: avoid high-queue states by avoiding the utilization levels that cause them. It is essentially the slack that keeps the system responsive.
Several practical conclusions follow immediately:
- if you’re considering 20% time and doing cost accounting (developers’ time costs X, but/and the company can/cannot afford it), you’re doing it wrong. If a company can give its programmers 20% time on the basis of cost, it can afford to give them a 25% across-the-board raise. It may have some explaining to do as to why it has been underpaying them so much for so long.
- if you’re allocating 20% to a Friday every week, you’re doing it wrong
- if you’re setting up a 20% time project proposal submission/review/approval system, you’re doing it wrong
- if you’re filling out timesheets, you’re doing it wrong
- if you’re using innovation as a motivator for 20% time, you’re doing it wrong. While new products have come out of 20% projects, they were not the point. If your company cannot innovate during its core hours, that’s a problem!
- The 20% time is not about creativity. Don’t say you’ll unleash your creativity with 20% time, ask why you’re not creative enough already during your core hours.
Those Are All Don’ts, Where Are the Dos?
You may ask now: you’ve told us all these ways to do it wrong, so what about doing it right? Let me answer with the best question I’ve heard while discussing this subject: “If 20% of your capacity is mandated to be filled with non-queue items, then you’ve just shrunk your capacity to 80%, and 80% is your new 100%. Right?”
Yes, “80 is the new 100” highlights the main problem with attempts to adopt the 20% time without understanding the theory. You need to escape the utilization trap, not stay in the trap and allocate the 100% differently! You cannot mandate the 20% time, because you cannot choose your utilization percentage: it’s an output variable. It is a ratio of characteristics of two processes, so it is what it is because the processes are the way they are. So you can only do it the hard way, by changing the processes: the arrival process (demand) and the service process (capability). Balancing capability and demand means we’re basically talking about a lean transformation here, or building your company lean from the start. As your lean initiative progresses, slack emerges. But if you try to mandate 20% time, you end up in the same utilization trap with less capacity.
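The “80 is the new 100” effect can be sketched with a few lines of arithmetic. The rates below are hypothetical numbers chosen only for illustration, and the queue formula again assumes the single-server M/M/1 model; none of this comes from a real team’s data.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Utilization is an output: demand divided by capability."""
    return arrival_rate / service_rate


def mean_queue_length(rho: float) -> float:
    """M/M/1 mean number waiting; the queue grows without bound once rho >= 1."""
    return float("inf") if rho >= 1 else rho ** 2 / (1 - rho)


# Hypothetical team: 9 requests arrive per week, 10 can be served per week.
arrival_rate = 9.0
service_rate = 10.0

before = utilization(arrival_rate, service_rate)

# Mandating 20% time removes a fifth of capability; demand stays the same.
after_mandate = utilization(arrival_rate, 0.8 * service_rate)

# The hard way: change the processes, e.g. bring demand down to 8 per week.
rebalanced = utilization(8.0, service_rate)

for label, rho in [("before", before),
                   ("mandated 20% time", after_mandate),
                   ("rebalanced demand", rebalanced)]:
    print(f"{label}: utilization {rho:.3f}, mean queue {mean_queue_length(rho):.1f}")
```

In this sketch the mandate pushes utilization from 0.9 to above 1, so the queue never stops growing, while rebalancing demand against capability is what actually brings utilization down to 0.8.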