Overcoming Failure Modes


Trygve Reenskaug cautioned me about discussing human failure modes. "If you give a dog a bad name," he reminded me of the old saying, "you might as well hang him." The hazard is that some people will use my naming of failure modes as excuses for poor work. Trygve reminded me that often what passes as a human failure has more to do with the working environment, as discussed in the last section and illustrated in this story he told me:

The Small-Goods Shop

There was a small-goods and buttons shop nearby that was always in terrible shape. The place was a mess, and the girls were either doing their nails or on the phone and didn't have much time for the customers.

That business closed, and another small-goods and buttons shop opened in its place. This place was wonderful! It was clean and tidy, and the girls were attentive to their customers. The only thing was . . . it was the same two girls!


C3 Culture Shifts

The Chrysler Comprehensive Compensation project experienced several shifts, as in this story (C3 1998).

The team initially valued "thinking ahead," "subtle but extensible designs," and "my code is private."

The team, largely under the impetus of Kent Beck, rebuilt itself with the core values "make it simple and clear," "you don't really need that subtle item," "all code is public," and "any pair of people sitting together may change anything." With these shifts, the same people also adopted a different and highly disciplined set of practices.


Those caveats having been placed, I do notice people having certain kinds of "failure modes." I regularly see methodologies and projects fail for not taking these human characteristics into account. We can build systems of people that are less likely to fail by explicitly taking these characteristics into account.

The five failure modes to take into account are people

  • Making mistakes

  • Preferring to fail conservatively

  • Inventing rather than researching

  • Being creatures of habit

  • Being inconsistent

Making Mistakes

That people make mistakes is, in principle, no surprise to us. Indeed, that is exactly why iterative and incremental development were invented.

Iterative refers to a scheduling and staging strategy that lets you rework pieces of the system.

Iterative development lets the team learn about the requirements and design of the system. Grady Booch calls this sort of learning "gestalt, round-trip design" (1994), a term that emphasizes the human characteristic of learning by completing.

Iterative schedules are difficult to plan, because it is hard to guess in advance how many major learnings will take place. To get past this difficulty, some planners simply fix the schedule to contain three iterations: draft design, major design, and tested design.

Incremental refers to a scheduling and staging strategy in which pieces of the system are developed at different rates or times and integrated as they are developed.

Incremental development lets the team learn about its own development process as well as about the system being designed. After a section of the system is built, the team members examine their working conventions to find out what should be improved. They might change the team structure, the techniques, or the deliverables.

Incremental is the simpler of the two methods to learn, because cutting the project into subprojects is not as tricky as deciding when to stop improving the product. Incremental development is a critical success factor for modern projects (Cockburn 1998).

The very reason for incremental and iterative strategies is to allow for people's inevitable mistakes to be discovered relatively early and repaired in a tidy manner.

That people make mistakes should really not be any surprise to us. And yet, some managers seem genuinely surprised when the development team announces a plan to work according to an incremental or iterative process. I have heard of managers saying things like

"What do you mean, you don't know how long it will take?"

or

"What do you mean, you plan to do it wrong the first time? I can go out and hire someone who will promise to do it right the first time."

In other words, the manager is saying that he expects the development team to make no major mistakes and to learn nothing new over the course of the project.

One can find people who promise to get things right the first time, but one is unlikely to find people who actually get things right the first time. People make mistakes in estimation, requirements, design, typing, proofreading, installing, testing . . . and everything else they do. There is no escape. We must accept that mistakes will be made and use processes that adjust to the fact of mistakes.

Given how obvious it is that people make mistakes, the really surprising thing is that managers still refuse to use incremental and iterative strategies. I will argue that this is not as surprising as it appears, because it is anchored in two failure modes of humans: preferring to fail conservatively rather than risk succeeding differently, and having difficulty changing working habits.

Preferring to Fail Conservatively

There is evidence that people generally are risk-averse when they have something in their hands that they might lose and risk-accepting if they are in the process of losing something and may have a chance to regain it (Piattelli-Palmarini 1996).

Piattelli-Palmarini describes a number of experiments involving risks and rewards. The interesting thing is that even when the outcomes are mathematically identical, the results are different depending on how the situation is presented.

Illusions of Choice

Piattelli-Palmarini cites a dual experiment. In the first, people are given $300 and then have to choose between a guaranteed $100 more or a 50/50 chance at $200 more.

People prefer to take the guaranteed $100.

In the second, people are given $500 and then have to choose between having $100 taken away from them or a 50/50 chance of having $200 taken away from them.

People prefer to risk having $200 taken from them.

(Piattelli-Palmarini, p. 58)


Mathematically, all outcomes are equal. What is interesting is the difference in the choices people make depending on how the problem is stated.
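
To see concretely that the two experiments are mathematically identical, here is the expected-value arithmetic sketched in a few lines of Python (the variable names are mine; the dollar amounts are those cited above):

    # Gain framing: start with $300.
    sure_gain  = 300 + 100          # guaranteed $100 more: $400
    risky_gain = 300 + 0.5 * 200    # 50/50 chance at $200 more: $400 expected

    # Loss framing: start with $500.
    sure_loss  = 500 - 100          # $100 taken away: $400
    risky_loss = 500 - 0.5 * 200    # 50/50 chance of losing $200: $400 expected

    assert sure_gain == risky_gain == sure_loss == risky_loss == 400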

Piattelli-Palmarini sums up the aspect relevant to project managers: We are risk-averse when we might gain.

Consider a manager faced with changing from waterfall to incremental or iterative scheduling. The waterfall strategy is accepted as a normal, conservative way of doing business, even though some people think it is faulty. The manager has used this strategy several times, with varying success. Now, one of his junior people comes to him with a radically different approach. He sees some significant dangers in the new approach. His reputation is riding on this next project. Does he use the normal, conservative strategy or try out the risky new strategy?

Odds are that he will use the normal, conservative strategy, a "guaranteed" standard outcome, rather than one that might work but might blow up in strange ways.

This characteristic, "preferring to fail conservatively rather than to risk succeeding differently," gets coupled with people's fear of rejection and the difficulty they have in building new work habits. The three together explain (to me) why managers continue to use the long-abused one-pass waterfall development process. Based on this line of thinking, I expect that people will continue to use the waterfall process even in the presence of mounting evidence against it and increasing evidence supporting incremental and iterative development. Use of the waterfall process is anchored in a failure mode.

In keeping with variations among people, some people have the opposite tendency. Often, though, the most adventuresome people are those who have little to lose personally if the project fails.

The good news is that there are opportunities for both sorts of people. The bad news is that these people probably find themselves on the same project.

Inventing Rather Than Researching

This behavioral mode may be peculiar to American and European software developers. (I don't have enough experience with Indian and Asian developers to comment on their habits.) It is the tendency to avoid researching previous solutions to a problem and just invent a new solution on the spot.

This tendency is usually described as a sickness, the Not-Invented-Here (NIH) Syndrome. I prefer not to view it as a disease but rather as a natural outgrowth of cultural pressures. One might instead call it the Invent-Here-Now Imperative. It grows in the following way:

From their earliest school days, students are instructed not to copy other people's work, not to help each other, and to be as original as possible in all but rote memory acts. They are given positive marks for originality and punished for using other people's solutions. (Recently, a fourth grade teacher told her students not to call each other to discuss homework problems, not even to ask which problems to do!)

Through the university level, assignments are designed to produce grades for individual work, not for teamwork. This reaches a culmination in the Ph.D. dissertation, where originality is a core requirement.

Somewhere in these years of schooling, some people join the profession of "programmer," a person whose job is to program and who advances in the profession by writing ever harder and more original programs.

Under these circumstances, it is hardly surprising that the people internalize the Invent-Here-Now Imperative.

Upon showing up at work, though, these same people are told by the business owners that they should not write new programs but should scavenge solutions created throughout the industry over the history of the field. They should use as many existing solutions as possible, without violating intellectual property rights.

The rewards offered for this behavior are meager. People continue to receive low evaluations for reusing code instead of writing new code. Promotion comes to those who do the most and the best programming, not those who successfully hook together existing components. Technical authors still refer to people who do such work as low-level "component assemblers."

Frakes and Fox did a survey and found that education and attitude (simply showing people that the culture values reuse over developing new solutions) showed the greatest correlation with increased reuse (Frakes 1995). Reward structures did not show a significant effect, nor did object-oriented technology, CASE tools, or a myriad of other factors.

Texas Instruments fought its "Not-Invented-Here" syndrome with an unusual award, the "Not Invented Here But I Did It Anyway" award (Dixon 2000). This NIHBIDIA award not only rewards people who make use of previous results, but it pokes fun at people caught up in the NIH syndrome at the same time. In this way, it creates a social effect of the type Frakes and Fox were referring to.

People who are professionals in some different field do practice effective reuse. These people, using the computer to accomplish some assignment of value in that other field, develop their sense of accomplishment from the program's effect in that other field, not from the cleverness of the programming. They are therefore motivated to put the software together to get on with their other work. They happily accept a less glamorous design if it can be put into use quickly.

Being Inconsistent Creatures of Habit

Asking a person to change his habits and asking him to act consistently are the two most difficult requests I can think of. We are creatures of habit who resist learning new behaviors, and at the same time we tend toward inconsistency.

This may seem like a harsh judgment, so I illustrate it with a conversation I heard among four people. Each was a senior manager or had a Ph.D., so these were people you would most expect to be able to handle changing habits and being consistent.

The Clean Desk Technique

One of the four said, "I'm suffering from the flood of paper that comes into my office. I can't think of how to manage it."

A second offered, "It's easy. Keep your desk entirely clean. Put four baskets on one side and a set of folders in the top drawer. When a new piece of paper shows up, deal with it directly, and put it into its correct filing place . . ."

He actually didn't get that far before the other three jumped in together:

"Keep my desk clean!? I can't do that!"


The second speaker never got to finish explaining his technique. The demand was that the people act with care at 100 percent consistency. A few people can accomplish this. Most people, though, vary from hour to hour, having a good hour followed by a bad one. Some people even celebrate being inconsistent and careless.

Worse than asking them to be consistent, the second speaker asked them to both change their habits and be consistent in that change.

This story tells me, as a methodologist, that if we ever do discover an optimal design process, people will resist using it and then use it sporadically or carelessly.

If only people could just act consistently . . .

Of course, if they could do that, they could keep their desks clean, avoid cavities, lose weight, give up smoking, play a musical instrument, and possibly even produce software on a regular and timely basis.

We already know of a number of good practices:

  • David Gries detailed how to derive correct programs in The Science of Programming (1981).

  • Beck and Cunningham (1989) and Wilkinson (1995) described using CRC cards in object-oriented design.

  • Beck (2000) and Jeffries (2001) described pair programming and test-first design in the context of Extreme Programming.

  • Careful design checking and statistical testing were detailed in the Cleanroom methodology (Becker 1996).

  • Humphrey (1997), in his Personal Software Process, provided detailed instructions about how programmers can become more effective through checking where errors are introduced.

Consistent application of any of these ideas would improve most of the projects I have visited. As Karl Wiegers quipped, "We are not short on practices; we are short on practice."

Countering with Discipline and Tolerance

Methodologists deal with people's common weaknesses with discipline or tolerance:

  • Create mechanisms in the methodology that hold strict behavioral standards in place.

  • Design the methodology to be tolerant of individual variations.

Most choose discipline.

Because consistency in action is a human weakness, high-discipline methodologies are fragile. Even when they contain good practices, people are unlikely to keep performing those practices over time. Performing a disciplined activity daily is just as hard in software development as keeping your desk clear in the clean-desk technique just mentioned.

To remain in practice, a high-discipline methodology must contain specific elements that keep the discipline in place.

Let's look briefly at three high-discipline methodologies: Cleanroom, Personal Software Process, and Extreme Programming.

In Cleanroom, production code is not allowed to be compiled before being checked in. Typing errors and syntax errors are considered part of the statistical process being controlled (new language features and system calls are learned on nonproduction code). The team doing the compiling can then detect the rate at which errors are introduced during program entry.

This is a high-discipline rule and requires explicit management support and checks.

In the Personal Software Process, the practitioner is to write down how long each activity took and to tabulate at what point errors were introduced. From these notes, the person can determine which activities are most error-prone and concentrate more carefully next time. The difficulty is, of course, that the logs take effort to maintain, requiring consistency of action over time. Not producing them properly invalidates PSP.
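
PSP prescribes its own forms for this bookkeeping. Purely to illustrate the kind of tabulation involved, here is a small Python sketch; the log entries and phase names are invented for the example:

    # Invented PSP-style defect log: (phase in which the defect was
    # injected, minutes spent finding and fixing it).
    defect_log = [
        ("design", 25), ("code", 5), ("code", 8),
        ("design", 40), ("test", 10), ("code", 6),
    ]

    counts, minutes = {}, {}
    for phase, fix_time in defect_log:
        counts[phase] = counts.get(phase, 0) + 1
        minutes[phase] = minutes.get(phase, 0) + fix_time

    # Show which phases inject the most defects and cost the most to fix.
    for phase in sorted(counts, key=counts.get, reverse=True):
        print(f"{phase:6} {counts[phase]} defects, {minutes[phase]} min")

The payoff is the feedback loop: seeing, for example, that design injects the costliest defects tells the practitioner where to concentrate next time.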

PSP contains no specific mechanisms to hold the high-discipline practices in place. It is, therefore, not terribly surprising to find the following experience report coming from even a highly disciplined development group. The following words about PSP were written by a military group that had been trained in PSP and had achieved the Software Engineering Institute's Capability Maturity Model Level 5 rating (Webb 1999):

PSP Report

"During the summer of 1996, TIS introduced the PSP to a small group of software engineers.

Although the training was generally well received, use of the PSP in TIS started to decline as soon as the classes were completed. Soon, none of the engineers who had been instructed in PSP techniques was using them on the job.

When asked why, the reason was almost unanimous: 'PSP is extremely rigorous, and if no one is asking for my data, it's easier to do it the old way.'"


Extreme Programming (XP) is the third methodology to call for high-discipline practices. It calls for programming in pairs (with pair rotation), extensive and automated unit tests completed prior to code check-in each day, adherence to the group's coding standards, and aggressive refactoring of the code base.
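
To give a flavor of the test-first element, here is a minimal Python sketch; the function and its behavior are invented for the example. The discipline is to write the test first, watch it fail, and only then write the code that satisfies it:

    # Step 1: write the test first. Run at this point, it fails,
    # because parse_money does not exist yet.
    def test_parse_money():
        assert parse_money("$1,250.75") == 1250.75

    # Step 2: write just enough code to make the test pass.
    def parse_money(text):
        return float(text.replace("$", "").replace(",", ""))

    test_parse_money()  # now passes, and runs again at every check-in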

Based on the discussion above, I expected adherence to the XP practices to be short-lived in most groups. My interview results were somewhat surprising, though.

People report programming in pairs to be enjoyable. They therefore program in pairs quite happily, after they adapt to each other's quirks. While programming in pairs, they find it easier to talk each other into writing the test cases and adhere to the coding standards.

The one high-discipline part of XP that the social pressure of pair programming does not hold in place is the code refactoring work. I still find that most people on the team do not refactor often, generally leaving that to the senior project person.
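
For readers who have not watched the practice, refactoring means improving the structure of the code without changing its behavior. A typical small step, invented for illustration, looks like this in Python:

    # Before: the discount rule is buried inline in the total calculation.
    def invoice_total(prices):
        total = sum(prices)
        if total > 100:
            total = total * 0.9   # 10 percent volume discount
        return total

    # After one small step: the rule now has a name and a single home.
    def apply_volume_discount(total):
        return total * 0.9 if total > 100 else total

    def invoice_total_refactored(prices):
        return apply_volume_discount(sum(prices))

    # Behavior is unchanged, which the existing tests should confirm.
    assert invoice_total([60, 70]) == invoice_total_refactored([60, 70])

The hard part is not any single step; it is taking such steps continually, every day, on working code.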

However, unlike PSP, Extreme Programming contains a specific mechanism to help with the discipline. It calls for one person to act as "coach" and keep the team members sensitive to the way in which they are using the practices.

It is interesting to note that all three of these methodologies were invented by people who were, themselves, consistent in the habits they required. So it is not as though high-discipline methods can't be used. They are just "fragile."

The alternative to requiring discipline is being tolerant of individual variation.

Adaptive Software Development (Highsmith 2000) and the Crystal methodology family described in this book are the only two methodologies I know of that are explicitly about being "variation-tolerant." Each methodology calls for the team members to form consensus on the minimum compliance needed in the work products and practices. Each suggests the use of standards but does not require that standards be enforced.

For "tolerant" methodologies to work, the people on the project must care about citizenship issues and generally take pride in their work. In such projects, the people develop a personal interest in seeing that their work is acceptable. Getting this to happen is no more guaranteed than getting people to follow standards, but I do see it accomplished regularly. It was also reported by Dixon (2000, p. 32).

Which is better: high-discipline or high-tolerance methodologies?

  • Strict adherence to strict (and effective) practices should be harder to attain but may be more productive in the end.

  • Tolerant practices should be easier to get adopted but may be less productive.

Part of the difficulty in choosing between them is that there currently is no consensus as to which practices are effective or ineffective under various circumstances. As a result, project leaders might enforce strict adherence to practices they consider effective and be surprised at the negative result they encounter.

The "Continuous Redocumentation" story in the last chapter gave one example of false adherence to discipline. The sponsors required that every change to any part of the system be immediately reflected in all documentation. They probably thought this would be an effective practice. In their context, though, it proved too costly, and the project was canceled.

In other words, although strict adherence to effective practices leads to an effective team, strict adherence to ineffective practices leads to an ineffective team.

If only we knew which was which.


