Discipline


We were teaching a class at a company whose development processes were relatively chaotic. The class went well until we got to the part on quality, and then a bit of discouragement set in. "You mean we have to do all that stuff? We thought lean meant going fast, but that quality stuff looks like a lot of work!" Indeed.

Shortly after that class, we gave a seminar to aerospace developers. When we got to the part on quality we heard: "Oh, we've been doing that for years." And indeed they had. The people in this group probably knew as much about automated testing as anyone we have met.

You can't go fast without building quality into the product, and that takes a lot of discipline. We find that such discipline is second nature to people working on life-critical systems, companies with Six Sigma programs, and organizations with high CMM assessments. But to a small and rapidly growing company that hasn't thought much about quality, becoming lean means becoming very disciplined about the way software is developed.

The Five S's

When we walk into a team room, we get an immediate feel for the level of discipline just by looking around. If the room is messy, the team is probably careless, and if the team is careless, you can be sure that the code base is messy. In an organization that goes fast, people know where to find what they need immediately because there is a place for everything and everything is in its place. In Mary's manufacturing plant, housekeeping inspections were held weekly, and they included the programming area as well as the production floor.

The five S's are a classic lean tool to organize a workspace so everything is at hand the moment it's needed, and there is no extra clutter. The S's stand for five Japanese words that start with an S: seiri, seiton, seiso, seiketsu, and shitsuke. They have been translated into five English words that also start with an S: sort, systematize, shine, standardize, and sustain.

We recently remodeled our kitchen, and as we moved into the new space we found ourselves using the five S's to reconfigure our workspace.

  1. Sort (Seiri): First we sorted through all our kitchen tools and set aside anything we had not used in the last year. Only the things we actually used went back into the kitchen.

  2. Systematize (Seiton): The big project was to find a place for everything that would make it easy to find and close at hand. We moved shelves and invested in drawer organizers and wall hooks. We rearranged things several times before we found the right place for each tool and appliance.

  3. Shine (Seiso): With everything finally put away, we cleaned up the kitchen and were ready to start cooking.

  4. Standardize (Seiketsu): Then we agreed on two (new!) policies: We would fill and run the dishwasher every night, and we would put everything away first thing every morning.

  5. Sustain (Shitsuke): Now we just have to keep up the discipline.

The software development workspace is not just the physical workroom, but also the desktop on a computer screen, the layout of the team server, and the code base everyone is working on. So after you apply the five S's to the team room, think about applying them to the logical workspace behind the screen. And since software is a reflection of the organization that developed it, take a good look at the code base as well.

  1. Sort (Seiri): Sort through the stuff on the team workstations and servers, and find the old versions of software and old files and reports that will never be used any more. Back them up if you must, then delete them.

  2. Systematize (Seiton): Desktop layouts and file structures are important. They should be crafted so that things are logically organized and easy to find. Any workspace that is used by more than one person should conform to a common team layout so people can find what they need every place they log in.

  3. Shine (Seiso): Whew, that was a lot of work. Time to throw out the pop cans and coffee cups, clean the fingerprints off the monitor screens, and pick up all that paper. Clean up the whiteboards after taking pictures of the important designs that are sketched there.

  4. Standardize (Seiketsu): Put some automation and standards in place to make sure that every workstation always has the latest version of the tools, backups occur regularly, and miscellaneous junk doesn't accumulate.

  5. Sustain (Shitsuke): Now you just have to keep up the discipline.

The 5 S's for Java

  1. Sort (Seiri): Reduce the size of the code base. Throw away all unneeded items immediately. Remove:

    • Dead code

    • Unused imports

    • Unused variables

    • Unused methods

    • Unused classes

    • Redundant code (refactor the duplication away)

  2. Systematize (Seiton): Organize the projects and packages. Have a place for everything and everything in its place.

    • Resolve package dependency cycles

    • Minimize dependencies

  3. Shine (Seiso): Clean up. Problems are more visible when everything is neat and clean.

    • Resolve unit test failures and errors (passed == 100%)

    • Improve unit test coverage (> 80%)

    • Improve unit test performance

    • Check AllTests performance

    • Resolve checkstyle warnings

    • Resolve PMD warnings

    • Resolve javadoc warnings

    • Resolve TODOs

  4. Standardize (Seiketsu): Once you get to a clean state, keep it that way. Reduce complexity over time to improve ease of maintenance.

  5. Sustain (Shitsuke): Use and follow standard procedures.

This is the way we do things.

Kent Schnaith[18]


[18] From private e-mail communication. Used with permission.

Standards

As we travel around the world, we have come to appreciate the importance of standards. We really appreciate the train system and metro transportation we find in almost every European city, but we are especially appreciative in the United Kingdom, because we try very hard not to drive a car there. We couldn't avoid driving in New Zealand recently, but even after two weeks we had to be vigilant at every turn or we would find ourselves on the wrong side of the road. Every electrical gadget we own can operate on any voltage, and we carry a bag full of plug converters (see Figure 8.7). We have learned to think of temperatures in Celsius and distances in kilometers, but we have not gotten used to electrical switches where up is off.

Figure 8.7. Standards?


Standards make it possible to operate reflexively and move information without conversion waste. A standardized infrastructure with a common architecture reduces complexity and thus lowers cost. Any organization that wants to move fast will have standards in place. Here are some of the standards that a software development organization should consider:

  1. Naming conventions

  2. Coding standards

  3. User interaction conventions

  4. File structures

  5. Configuration management practices

  6. Tools

  7. Error log standards

  8. Security standards

Of course, standards are useless on paper; they have to be used consistently to be valuable. In a lean environment, standards are viewed as the current best way to do a job, so they are always followed. The assumption, however, is that there is always a better way to do things, so everyone is actively encouraged to challenge every standard. Any time a better way can be found and proven to be more effective, it becomes the new standard. The discipline of following, and constantly changing, standards should be part of the fabric of an organization.

Appreciating Standards

I know of a case where there were two programs, both with the name "SYNC." They were stored in the same location on two different systems. On System 1, SYNC saved local files by merging them into the master file, while on System 2 SYNC reset the local files by replacing them with the master file. An operator who frequently worked on System 1 happened to be working on System 2. She found SYNC in its usual location, so she assumed it was the SYNC she knew. She ran it to merge a large number of recent transactions into the master file. To her horror, the master file was copied over her local files, completely wiping out everything she was trying to save. It was a very expensive "mistake," and the root cause was probably the absence of naming standards.

Mary Poppendieck


Code Reviews

"Are code reviews waste?" people often ask in our classes. Good question. We think that using code reviews to enforce standards or even to find defects is a waste. Code analyzers and IDE checks are the proper tools to enforce most standards, while automated testing practices are the proper way to avoid most defects. Code reviews should be focused on a different class of problems. For example, the code of inexperienced developers might be reviewed for simplicity, change tolerance, absence of repetition, and other good object-oriented practices. Code reviews can also be a good tool for raising awareness of issues such as complexity. One organization we know of computes the McCabe Cyclomatic Complexity Index[19] for newly committed code, and when it reaches a threshold of 10, a code review is triggered.

[19] The McCabe Cyclomatic Complexity Index is a measure of the number of execution paths through a program. See "A Complexity Measure," by Thomas J. McCabe, IEEE Transactions on Software Engineering, Vol. Se-2, No.4, December 1976.
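To make the complexity trigger concrete, here is a hypothetical Java sketch (all names invented for illustration): each `if` adds an execution path to the McCabe count, so a branch-heavy method can often be simplified by moving the decisions into a lookup table.

```java
import java.util.Map;

// Hypothetical example: the same discount rule written two ways.
public class DiscountRules {

    // Branch-heavy version: four decision points give a McCabe index of 5.
    static int discountBranchy(String tier) {
        if (tier.equals("gold")) {
            return 20;
        } else if (tier.equals("silver")) {
            return 10;
        } else if (tier.equals("bronze")) {
            return 5;
        } else if (tier.equals("trial")) {
            return 0;
        }
        return 0;
    }

    // Table-driven version: the branching disappears into a data lookup.
    private static final Map<String, Integer> DISCOUNTS =
        Map.of("gold", 20, "silver", 10, "bronze", 5, "trial", 0);

    static int discountTable(String tier) {
        return DISCOUNTS.getOrDefault(tier, 0);
    }

    public static void main(String[] args) {
        // Verify that the refactoring preserved behavior for every input.
        for (String tier : new String[] {"gold", "silver", "bronze", "trial", "other"}) {
            if (discountBranchy(tier) != discountTable(tier)) {
                throw new AssertionError("behavior changed for " + tier);
            }
        }
        System.out.println("both versions agree");
    }
}
```

A review triggered at a threshold of 10 would flag the first style long before it grew unmanageable; the second style keeps the index flat no matter how many tiers are added.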

Formal Code Review

In our organization, a group of Cobol developers was transitioning to Java. They had training in Java syntax, but object-oriented thinking is counterintuitive to people who have done procedural programming for years. We used code reviews to help show the developers how to use standard patterns while writing and refactoring object-oriented code. When a developer finished a section of code and requested a review, it was held as soon as possible and was open to anyone who wanted to attend. The developer showed the code to two technical leaders, who then discussed how it might be improved. The two experts did not always agree, and the ensuing discussion gave the developers thinking tools rather than answers. The atmosphere of the reviews was so open and educational that they became very popular. Soon more and more developers were sitting in on the reviews, because they learned so much. The collective capability of the new Java developers increased dramatically through this process.

Jill Aden[20]


[20] From a presentation at the Twin Cities OTUG meeting, on June 17, 2003, and later conversations. Used with permission.

Policies that require code reviews prior to check-in can create a large accumulation of partially done work waiting for review; in a lean environment this is an unacceptable situation that should be aggressively avoided. One way to provide code review without delay is to use some form of pairing while the code is being written.

Pairing

Pairing (also called pair programming) is the practice of having two people work side-by-side on the same task. This provides continuous code review in a flow rather than a batch. The judicious use of pairing can be very valuable. Developing code in pairs can enhance learning, teamwork, and problem-solving skills. Moreover, pairing often increases productivity because both the quality and the robustness of the code are enhanced when viewed through two sets of eyes. In addition, pairs tend to deflect interruptions and keep each other on task. Anyone who has ever programmed will tell you that developing software is a continuous exercise in solving problems, and pair problem solving is a pattern we find in many occupations. Pairing makes it easier to add new people to a team, because there is a built-in mentoring program.

Pairing is not for everyone nor for all situations. But pairing often creates synergy: Two people will frequently deliver more integrated, tested, defect-free code by working together than they could produce working separately. And pairing is one of the best ways to achieve the benefits of reviews without building up inventories of partially done work.

Open Source Reviews

In Open Source communities, "committers" review all code before it is committed to the code base. Committers are trusted developers who have demonstrated their competence and commitment to the project. Typically there is one committer for about every ten contributors, and code is always submitted in small batches. This allows submissions to be reviewed and committed very quickly. After submission, the code is subject to the review of the entire community, where scrutiny is immediate and advice is freely given. It's been said that if you really want to learn how to code, try your hand at Open Source, and you will get plenty of feedback on how to improve.

Mary Poppendieck


Mistake-Proofing

When you connect a projector to a laptop with a video cable, it's difficult to plug it in wrong, because it is mistake-proof. One side is wider than the other and has more pins. After a quick look, most people will get it right. A USB cable, on the other hand, is not so simple. How many times do you try it one way, push a bit, realize that it's backward, turn it over, and try again? Although it is virtually impossible to plug a USB cable in wrong, it is not always obvious which way it should be oriented. This cable is not quite mistake-proof enough for our tastes.

The worst offender, however, is no longer with us. The IDE cable appeared in 1984 on the IBM AT, where it was used to connect the hard drive to the system board. For many years, the cable had 40 holes that plugged into 40 pins on the disk drive. There was a red stripe down the side to let you know which side was for pin 1, but it was often difficult to tell where pin 1 was on the drive, or the stripe was hard to see, or you just weren't paying attention. As a result, quite often the cable got attached upside down or misaligned, missing a couple of pins on one end or the other. Many a dead hard drive or fried controller can be attributed to the fact that this cable was not mistake-proof. Various indents and tabs were added to the drive and connector, but these were not consistently located and most were not effective. It took years for the industry to remove pin 20 from each disk drive, fill in pin 20 on the connector, and finally make it impossible to make a mistake with an IDE cable (see Figure 8.8).

Figure 8.8. A keyed IDE connector


When we ask in a class if anyone has ever assembled a PC with an unkeyed IDE cable, we usually get a few groans as people who've been around a while raise their hands. Then we ask anyone who has never made a mistake with an IDE cable to lower their hands. Invariably, all hands stay up. We have also had our share of IDE cable misalignments, and we know two things: Everyone who ever plugged in unkeyed IDE cables considered themselves to be an expert, and everyone knew enough to be very careful. Yet almost every one of us has, at one time or another, made a mistake plugging in that cable.

Mistakes are not the fault of the person making them; they are the fault of the system that fails to mistake-proof the places where mistakes can occur. With software, anything that can go wrong eventually will go wrong; just ask the people in operations if you don't believe us. So don't waste time counting the number of times individuals are "responsible" for defects and pressuring them to be more careful. Every time a defect is detected, stop, find the root cause, and devise a way to prevent a similar defect from occurring in the future. Perhaps a new unit test is the proper countermeasure, but a development team should think broadly about effective ways to mistake-proof code. Get testing and operations involved in identifying potential failure points, and add mistake-proofing before defects occur.
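The same keying idea applies inside code. A minimal Java sketch, with hypothetical names: an API that takes a free-form string accepts any mistake at runtime, while an enum makes the invalid input impossible to express, the way the filled pin 20 makes the IDE cable impossible to misalign.

```java
// Hypothetical sketch: mistake-proofing an API the way pin 20 keys an IDE cable.
public class SyncTool {

    enum Direction { MERGE_LOCAL_INTO_MASTER, REPLACE_LOCAL_FROM_MASTER }

    // Before: sync("merge") would compile even when misspelled or misremembered.
    // After: the compiler accepts only the two legal directions.
    static String sync(Direction direction) {
        switch (direction) {
            case MERGE_LOCAL_INTO_MASTER:
                return "merged local changes into master";
            case REPLACE_LOCAL_FROM_MASTER:
                return "replaced local files from master";
            default:
                throw new IllegalStateException("unreachable");
        }
    }

    public static void main(String[] args) {
        System.out.println(sync(Direction.MERGE_LOCAL_INTO_MASTER));
        // sync("mrege");  // would not compile: the mistake is designed out
    }
}
```

Had the two SYNC programs in the sidebar above been forced to declare their direction this explicitly, the expensive "mistake" could not have happened.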

Automation

One of the most rewarding ways to mistake-proof development is to automate routine tasks.[21] Even small teams should automate everything they can, and even occasional tasks should be candidates for automation. Automation not only avoids the eventual mistakes that people will always make; it also shows respect for their intelligence. Repetitive tasks are not just error prone; they send the message that it's OK to treat people like robots. People should not be doing things by rote; they should be thinking about better ways of doing their jobs and solving problems.

[21] For ideas on what and how to automate, see Pragmatic Project Automation: How to Build, Deploy and Monitor Java Applications, by Mike Clark, Pragmatic Press, 2004. See also Pragmatic Project Automation for .NET by Ted Neward and Mike Clark, Pragmatic Press, forthcoming.

Examples of Automation

  1. One Click Build: Automating the build is the first step. After the build, fire off a set of automated tests to see if anything broke. Many teams use Ant.

  2. Scheduled Builds: Once the build can be done in one step, it's easy to schedule it. A build can be triggered by a clock or by a code check-in. Quite often, teams use CruiseControl.

  3. Build Result Notification: No sense doing a build if no one knows the results. It's easy to set up an e-mail notification if the build fails. It's fun to turn on a red semaphore or light up a red lava lamp to alert everyone to a failed build.

  4. One-Step Release: A release probably means creating a release branch and packaging the code into a file or set of files that can be downloaded or distributed on a CD. This really should be an automated process. There is no room for error.

  5. Bullet-Proof Installation: When the distribution files arrive at customer sites, the installation process should be automated, even if you send people on-site to help. The distribution media might include a few diagnostic tools to help when (not if) an installation fails.
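The notification step above (item 3) can be as small as a few lines of code. A minimal Java sketch, using invented names and an invented summary format; real teams would hook this into CruiseControl or their build tool rather than roll their own.

```java
// Hypothetical sketch of build-result notification: stay silent on success,
// shout on failure. The string it returns would feed an e-mail, a semaphore,
// or the red lava lamp.
public class BuildNotifier {

    // Returns a one-line alert, or null when the build is green.
    static String alertFor(int testsRun, int failures) {
        if (failures == 0) {
            return null; // nothing to report; keep the inbox quiet
        }
        return "BUILD FAILED: " + failures + " of " + testsRun + " tests failed";
    }

    public static void main(String[] args) {
        String alert = alertFor(120, 3);
        if (alert != null) {
            System.out.println(alert); // stand-in for e-mail or the lava lamp
        }
    }
}
```

Notifying only on failure is deliberate: an alert that fires on every build quickly gets ignored, which defeats the purpose.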


Test-Driven Development

As we mentioned in Chapter 2, Shigeo Shingo taught that there are two kinds of tests: tests that find defects after they occur and tests to prevent defects.[22] He regarded the first kind of tests as pure waste. The goal of lean software development is to prevent defects from getting into the code base in the first place, and the tool to do this is test-driven development.

[22] See Shigeo Shingo, Study of 'Toyota' Production System, Productivity Press, 1981, Chapter 2.3.

"We can't let testers write tests before developers write code," one testing manager told us. "If we did that, the developers would simply write the code to pass the tests!" Indeed, that's the point. Some people feel that a tester's job is to interpret the specification and translate it into tests. Instead, why not get testers involved with writing the specifications in the form of executable tests to begin with? This has the advantage of mistake-proofing the translation process, and if it's done with the right tools, it can also provide automatic traceability from specification to code. The more regulated your industry is, the more attractive executable specifications can be.
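A small Java sketch of what an executable specification can look like, assuming a hypothetical shipping-fee rule: the examples are exactly the kind of table a tester might write before the code exists, turned directly into a runnable check.

```java
// Hypothetical specification-by-example: the table of examples IS the spec,
// and running it tells you whether the code honors it.
public class ShippingFeeSpec {

    // The rule under specification: orders of $50.00 or more ship free,
    // everything else pays a flat $4.95 (amounts in cents).
    static int shippingFee(int orderTotalCents) {
        return orderTotalCents >= 5000 ? 0 : 495;
    }

    public static void main(String[] args) {
        // Each row: order total, expected fee -- written by testers up front.
        int[][] examples = { {4999, 495}, {5000, 0}, {5001, 0}, {0, 495} };
        for (int[] row : examples) {
            if (shippingFee(row[0]) != row[1]) {
                throw new AssertionError("spec violated for total " + row[0]);
            }
        }
        System.out.println("all examples pass");
    }
}
```

Note the boundary rows at 4999, 5000, and 5001: an example table written before coding naturally forces the conversation about exactly where the rule flips.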

If the job of testing is to prevent defects rather than to find them, then we should consider what this means for the various kinds of tests that we typically employ. Brian Marick proposes that we look at testing from the four perspectives we see in Figure 8.9. This figure shows that testing has two purposes: We want to support programmers as they do their job, and we also need to critique the overall product that the software supports. We can do this from a technical perspective or from a business perspective. This gives us four general test categories, which we will describe briefly.

Figure 8.9. Types of testing[23]


[23] From Brian Marick, "Agile Testing Directions," available at www.testing.com/cgibin/blog/2003/08/21-agile-testing-project-1. Used with permission.

Unit Tests (Also Called Programmer Tests)

Unit tests are written by developers to test that their design intent is actually carried out by the code. When developers write these tests first, they find that the code design evolves from the tests, and in fact the unit test suite becomes the software design specification. Writing tests first usually gives a simpler design because writing code that is testable leads to loose coupling of the code. Because of this virtuous circle, once developers start using test-first development, they are reluctant to develop software any other way.

A selection of unit tests are assembled into a test harness that runs at build time. The build test suite must be fast. A build and test should take less than 10 minutes, or else developers will avoid using it. Therefore the code tested with the build test suite is frequently separated from the database with mock objects to speed up execution.
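A minimal Java sketch of that separation, with hypothetical names; real teams would typically use JUnit plus a mocking library, but even a hand-rolled mock shows the idea: the production code talks to an interface, and the build-time test substitutes an in-memory implementation so no database connection is needed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: keeping build-time tests fast with a mock object.
public class AccountServiceTest {

    interface AccountStore {            // the seam: production code depends on
        int balanceFor(String id);      // this interface, not on the database
    }

    static class AccountService {       // the code under test
        private final AccountStore store;
        AccountService(AccountStore store) { this.store = store; }
        boolean canWithdraw(String id, int amount) {
            return store.balanceFor(id) >= amount;
        }
    }

    // In-memory stand-in: runs in microseconds, needs no server.
    static class MockStore implements AccountStore {
        private final Map<String, Integer> balances = new HashMap<>();
        MockStore with(String id, int balance) { balances.put(id, balance); return this; }
        public int balanceFor(String id) { return balances.getOrDefault(id, 0); }
    }

    public static void main(String[] args) {
        AccountService service = new AccountService(new MockStore().with("alice", 100));
        if (!service.canWithdraw("alice", 100)) throw new AssertionError();
        if (service.canWithdraw("alice", 101)) throw new AssertionError();
        System.out.println("unit tests pass without touching a database");
    }
}
```

Because the mock is swapped in through the constructor, the same `AccountService` runs unchanged against the real store in production.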

The reason unit tests are sometimes called programmer tests is that programmers use unit test tools to test more than small elements of code. Unit test tools can be used at any level: unit, feature, or system.

Story Tests (Also Called Acceptance Tests)

Story tests identify the business intent of the system. They are the tests that determine whether the software will correctly support customers in getting their job done. When story tests are written before the code, they help everyone think through what the customers' job really involves and how it will be supported by software. The team works through examples of what the system should do, and the story tests become a specification-by-example. If we are going to write a specification ahead of time, it may as well be executable to save the waste of writing tests later and then tracing code to specifications.

Automated story tests are not usually run through the user interface; they should generally be run underneath the user interface. In order for this to work, user interfaces should be a very thin presentation layer; all logic and policies should be at lower, separately testable layers. Usually the remaining user interaction layer can be thoroughly tested separately from the business logic.[24]

[24] Visually oriented interaction layers such as those found in graphical programs, games, and similar software are more likely to require manual inspection than transaction-oriented interfaces.

Story tests should be automated and run as often as practical. Generally they are not part of the build test suite because they usually require a server or database. However, a selection of story tests should be run every day, a more complete set should run every week, and every one should pass by the end of an iteration.
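A small Java sketch of that layering, with hypothetical names: all the logic lives in a separately testable service, so the story test exercises the customer's job without ever driving a screen.

```java
// Hypothetical sketch: a story test that runs beneath the user interface.
public class PlaceOrderStoryTest {

    static class OrderService {                 // separately testable layer
        private int itemsInCart = 0;
        void addItem() { itemsInCart++; }
        String checkout() {
            return itemsInCart > 0 ? "order placed" : "cart is empty";
        }
    }

    // The UI (not shown) would only format what the service returns --
    // a thin presentation layer with no business logic of its own.
    public static void main(String[] args) {
        OrderService service = new OrderService();
        if (!service.checkout().equals("cart is empty")) throw new AssertionError();
        service.addItem();
        if (!service.checkout().equals("order placed")) throw new AssertionError();
        System.out.println("story passes without touching the UI");
    }
}
```

Keeping policy like "an empty cart cannot check out" below the presentation layer is what makes the story automatable in the first place.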

Usability and Exploratory Testing

Usability and exploratory tests are, by definition, manual tests. When a system passes its automated tests, we know that it does what it is supposed to do, but we do not know how it will be perceived by users or what else it might do that we haven't thought to test for. During usability tests, actual users try out the system. During exploratory tests, skilled testing specialists find out what the system does at the boundaries or with unexpected inputs. When exploratory tests uncover an area of the code that is not robust, a test should be written to show developers what needs to be done, and the test should be added to the test harness.

Property Testing

Property testing includes the so-called nonfunctional requirements such as response time, scaling, robustness, and so on. Specialized tools exist to test systems under load, to check for security problems, and so on. There are also tools that generate combinatorial tests that test every possible configuration the system might encounter. If these tools are appropriate for your environment, invest in them and develop the skills to use them well. Start using them early in the development process. Apply them in an environment that is as close as you can get to actual operational conditions.

Configuration Management

Configuration management is a central discipline in any software development environment, and agile practices create significant demands on a configuration management system. Consider that:

  1. Any area of code can be worked on by several people at a time.

  2. Releases are small and frequent.

  3. As one set of features is released to production, new features are being developed.

  4. The entire code base constantly undergoes refactoring (continuous improvement).

The configuration management system is the librarian that checks out code lines to people who will change them, then files the new versions correctly when they are checked back in. It can be used to manage repositories of not only code, but also documentation and test results, a particularly useful feature in a regulated environment. Every development organization should have a configuration management system that supports the scenarios used in the organization, and it should establish policies governing how the system is used.

Here are some typical scenarios handled by configuration management systems:[25]

[25] For many additional scenarios and considerations on how to use configuration management systems, see Brad Appleton's articles at: www.cmwiki.com/AgileSCMArticles.

  1. Developers check out files into their private workspace. They make changes, then just before checking the code back in, they merge the changes with the current code base on their machine and do a private build and test. If everything works, they check their new code into the configuration management system, which triggers a public build and test.

  2. At the time of a release, a branch is created that includes all the code to be included in the release. Developers continue adding new features to the main code line. Any changes that are made in the release during testing or maintenance are merged back into the main code line as soon as possible. (When code is released as part of the iteration, or a branch is deployed without modification, this scenario is unnecessary.)

Continuous Integration

Whenever code is checked out to a private workspace or branched into a separate code line, incompatibilities will arise across the two code lines. The longer parallel code lines exist, the more incompatibilities there will be. The more frequently they are integrated, the easier it will be to spot incipient problems and determine their cause. This is so fundamental that you would expect continuous integration to be standard practice everywhere. But it's not so simple. First of all, it is counterintuitive for developers to check in their code frequently; their instincts are to make sure that an entire task is complete before making it public. Secondly, lengthy set-up times exacerbate the problem. Private builds take time. Checking in code takes time. Public builds and tests can take a lot of time. Stopping to fix problems discovered in the builds and tests can really slow things down. So there is strong incentive to accumulate large batches of code before integrating it into the code base.

Stretching out the time between integration is false economy. The lean approach is to attack the set-up time and drive it down to the point where continuous integration is fast and painless. This is a difficult discipline, partly because developing a rapid build and test capability requires time, skill, and ongoing attention, and partly because stopping to fix problems can be distracting. But a great epiphany occurs once a software development team finds ways to do rapid builds and continuous integration and everyone faithfully stops to fix problems as soon as they are detected. We have been told many times that this is a very difficult discipline to put in place, but once it is working, no one would ever think of going back to the old way of doing things. Invariably teams experience an accelerating ability to get work done, which makes believers out of the skeptics.

Nested Synchronization

As we mentioned earlier in this chapter, large systems should have a divisible architecture that allows teams to work concurrently and independently on subsystems. But that is not the whole story. What happens when the independent teams try to merge these subsystems? In theory, the interfaces have been well defined and are stable, and there are no surprises when the subsystems are eventually integrated. In practice, it hardly ever works that way. In fact, the failure of subsystem integration at the end of development programs has been one of the biggest causes of escalating schedules, escalating costs, and failed programs.

If you think of software as a tree and the subsystems as branches, you might get a picture of where the problem lies. Generally you would not grow branches separately from each other and then try to graft them onto the trunk of a tree. Similarly, we should not develop subsystems separately from each other and then try to graft them onto a system. We should grow subsystems from a live trunk; if the trunk doesn't exist, we should build it first. The subsystems will then be organically integrated with the system, and we won't have to worry about trying to graft everything together at the end.

The rule we stated for parallel code lines also holds for larger subsystems: The longer parallel development efforts exist, the more incompatibilities there will be. The more frequently they are integrated, the easier it will be to spot incipient problems and determine their cause. Instead of defining interfaces and expecting them to work, we should actually build the interfaces first, get them working, and then build the subsystems, synchronizing them with each other as frequently as practical.
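A minimal Java sketch of building the interface first, with hypothetical names: the trunk is the agreed interface plus a trivially correct "walking skeleton" implementation, checked in and running before either subsystem team starts, so integration happens from day one instead of at the end.

```java
// Hypothetical sketch: grow subsystems from a live trunk.
public class TrunkFirst {

    interface InventoryFeed {                   // the agreed, working interface
        int stockLevel(String sku);
    }

    // Walking skeleton: trivially correct, but live -- the ordering team
    // integrates against running code immediately, not a paper spec.
    static class SkeletonFeed implements InventoryFeed {
        public int stockLevel(String sku) { return 0; }
    }

    // A subsystem grown against the trunk; the real InventoryFeed
    // replaces the skeleton later without changing this code.
    static class OrderChecker {
        private final InventoryFeed feed;
        OrderChecker(InventoryFeed feed) { this.feed = feed; }
        boolean canFulfill(String sku, int qty) { return feed.stockLevel(sku) >= qty; }
    }

    public static void main(String[] args) {
        OrderChecker checker = new OrderChecker(new SkeletonFeed());
        if (checker.canFulfill("widget", 1)) throw new AssertionError();
        System.out.println("subsystems integrate from day one");
    }
}
```

Because both teams synchronize against the same live interface continuously, incompatibilities surface as small daily build failures rather than as an end-of-program grafting crisis.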

How does this work in practice? It starts with continuous integration. Several times a day developers check in code, which triggers an automatic build and test to verify that everything is still synchronized at the micro level. An acceptance test harness is run overnight. Various configurations and platforms may be tested over the weekend. Every iteration the entire code base is synchronized and brought to a "releasable" state. It may also be synchronized with other subsystems or perhaps with hardware. Every release the code is deployed to synchronize with customers.

With large systems and geographically dispersed teams, nested synchronization is even more critical, because it provides a basis for continual communication and regularly aggregates the collective knowledge of the teams in one place. Teams working on closely related subsystems should synchronize their work by working off of a common code base. Major subsystems should be brought together at the end of every iteration. On large projects, the entire system should be synchronized as frequently as practical. Using Polaris as an example, full-system synchronization points occurred when the first missile was launched, when the A1 was tested, when it deployed, and so on. These full-system synchronization points were never more than 18 months apart, and you can be sure that there were many more subsystem synchronization points. Although 18 months may seem like a long time, full-system synchronization occurred six times more frequently than called for in the original Polaris schedule.

Full-system synchronization means bringing the system into a "ready to release" state, as far as that is practical. If there are any live field tests that can be run, synchronization points are the time to run them. All participating teams and key customers should be involved in analysis and critical decision making at each synchronization point. Major synchronization points are not a one-day event. They are several days set aside to thoroughly analyze the system's readiness, absorb what has been learned, focus on key decisions that must be made at that point, make any necessary course corrections, plan the next stage in some detail, and celebrate the current success.




Implementing Lean Software Development: From Concept to Cash
ISBN: 0321437381
Year: 2006