Measuring Variation

The software development process only seems random because it has not been studied systematically. It does not matter what is being measured: if it has a high variation, or a large standard deviation, the outcomes will seem random and unpredictable. This is true whether measuring software development or predicting the outcome of the roll of a couple of dice.

To really understand variation, you have to measure it. There are many ways to study variation, and the most common is to calculate the standard deviation. Ouch! The very term standard deviation makes some people’s skin crawl. The larger the standard deviation, the more difficult it is to predict the outcome of any single event. Every casino knows the standard deviation does not have to be very high to keep anyone from predicting the outcome of any single event.

Before I write too much about variation and standard deviations, I need to tell you about the game of craps. In the game of craps the shooter rolls two six-sided dice. The shooter tries to roll 7’s or 11’s. No matter how hard anyone tries, the outcome of a specific roll of the dice cannot be predicted. Now, I can predict the distribution of the next 100 rolls or so with great precision, and so can the casino. The same is true with software development. Many software organizations have a distribution of performance which is wider than the roll of a couple of dice. This means the outcome of a roll is easier to predict than the outcome of a specific project. Predicting the outcome of any single project is difficult because of the large variance in past performance, but predicting the distribution of outcomes for the next dozen or so projects is easily done. The future distribution of projects will match the past distribution of projects unless the environment is changed.
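
The point about dice can be checked with a few lines of code. The sketch below is only an illustration, written in Python: it lists the exact chance of each total for two fair dice and compares it with the totals seen over many simulated rolls. The function names, the fixed seed, and the number of rolls are my own choices for the example; a larger number of rolls than 100 is used so the match shows up clearly.

```python
import random
from collections import Counter
from fractions import Fraction

def exact_distribution():
    """Exact probability of each total (2 to 12) for two fair six-sided dice."""
    counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
    return {total: Fraction(n, 36) for total, n in sorted(counts.items())}

def simulate(rolls, seed=0):
    """Observed frequency of each total over `rolls` simulated throws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(rolls))
    return {total: counts[total] / rolls for total in range(2, 13)}

exact = exact_distribution()
observed = simulate(100_000)

# No single roll can be called in advance, but the overall shape can.
for total in range(2, 13):
    print(f"{total:2d}  expected {float(exact[total]):.3f}  observed {observed[total]:.3f}")
```

The expected column never changes, and the observed column settles close to it as the number of rolls grows. That is all the casino needs to know.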

Gamblers do not win at craps because they are good at predicting results. They are just lucky, and their luck always runs out. I have visited a lot of casinos around the world, and many of them are incredibly beautiful resorts. I often remind myself these buildings have not been built with winners’ money. Often, an estimate is correct because it was a lucky estimate. Unless an organization begins to standardize processes and complete projects in a consistent manner, projects may as well be estimated by wetting a finger and sticking it in the air. This is no different than trying to estimate the specific result of a roll of a couple of dice. It is a waste of time and effort. All kinds of estimating models can be built with a lot of statistics, but they are basically useless when it comes to estimating the results of a specific future project. You can’t win at a casino because the odds in every game favor the house. The statistics are crystal clear: if you play long enough at any casino game, you are going to lose. You may win sometimes, but the casino knows you are going to lose in the long run. It does not matter if it is the weather, behavior, or software development: anything with a high variation in results is hard to predict.

One of the problems with CMM implementations is that very little initial measurement takes place, so the value of the CMM initiative is not adequately quantified. An organization that undertakes any initiative needs to establish a baseline of performance first. Most of the productivity gains and performance improvements of CMM occur in the early stages of adoption. This is just like any self-improvement initiative. It is easy to lose weight during the early phases of a diet, but it becomes harder to lose weight later on. If weighing or measuring is delayed during the early stages of a diet, the success of the diet cannot be gauged. Actual measurement programs are not implemented until the later stages of CMM (after level 2 or level 3). Incremental improvement becomes more difficult as time goes on, so measurements taken in the later stages of CMM often understate organizational improvement. I discourage organizations from implementing a full measurement program until they stabilize their environments; instead, I encourage organizations to take a sample and develop ranges of productivity and variation.

The very first thing WeightWatchers did was put me on a scale. I also measured my waist size. The very first thing that needs to be done with any initiative is to establish a baseline of performance. This can be accomplished by sampling several past projects. Measuring consistency, by calculating the standard deviation, is just as important as measuring the average rate of productivity. Organizations that are chaotic have low productivity rates and high variation (a high standard deviation). A large standard deviation indicates a large variation in performance, and a small standard deviation indicates a small variation in performance. The more varied the performance, the more difficult it is to have a consistent outcome. As organizations improve and move up the CMM levels, productivity improves and variation in performance decreases.
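
A baseline like this takes very little code. The sketch below, in Python, is only an illustration: the two lists of hours per function point are made-up samples, not data from any real organization, and the function name is my own. Both samples have the same average so that the difference in variation stands out.

```python
import statistics

def baseline(hours_per_fp):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(hours_per_fp)
    stdev = statistics.stdev(hours_per_fp)  # sample standard deviation (n - 1)
    return mean, stdev, stdev / mean

# Hypothetical samples of past projects, in hours per function point.
chaotic = [10, 20, 30, 40]
stable = [22, 24, 26, 28]

for name, sample in (("chaotic", chaotic), ("stable", stable)):
    mean, stdev, cv = baseline(sample)
    print(f"{name:8s} mean {mean:.1f}  std dev {stdev:.2f}  variation {cv:.0%}")
```

Both samples average 25 hours per function point, but the standard deviation of the first is about five times that of the second. An estimate built on the second sample has a much better chance of holding up.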

It is common for a software organization to have wide confidence intervals. Let’s say an organization has a mean of 25 hours per function point, and the calculated confidence interval is 5 hours/fp to 45 hours/fp. Any future project can fall anywhere in the range from 5 hours/fp to 45 hours/fp. Where a project lands in that range depends on many different factors. Basically, the things that were done during the 5 hour/fp projects need to be repeated, and the way projects were done at 45 hours/fp needs to be avoided. If the software organization is not even aware of this difference, the pitfalls of past projects cannot be avoided.
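
To see how wide that range really is, it helps to turn it into hours. The sketch below, in Python, is only an illustration. It assumes the 5 to 45 hours/fp range is a 95% range on roughly normally distributed results, which the example above does not actually state, and it uses a hypothetical 500 function point project. The function names are my own.

```python
from statistics import NormalDist

def implied_std_dev(low, high, confidence=0.95):
    """Standard deviation implied by a symmetric range, assuming a normal distribution."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return (high - low) / (2 * z)

def effort_range(function_points, low_rate, high_rate):
    """Best-case and worst-case effort in hours for a project of the given size."""
    return function_points * low_rate, function_points * high_rate

sd = implied_std_dev(5, 45)
low_hours, high_hours = effort_range(500, 5, 45)

print(f"implied standard deviation: about {sd:.1f} hours/fp")
print(f"500 fp project: {low_hours} to {high_hours} hours")
```

Under those assumptions the standard deviation works out to roughly 10 hours/fp against a mean of 25, and the same 500 function point project could take anywhere from 2,500 to 22,500 hours. An estimate with a ninefold spread is not much of an estimate.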