Practice 4. Refactoring
Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written. [Fowler 1999]
Refactoring involves changing software in a disciplined way so that the structure of the code is changed but the code's behavior remains unchanged. Refactoring is distinguished from merely rewriting code (which developers do all the time) by the discipline required. Refactoring is performed in a disciplined, step-by-step manner with tests in place to catch problems as you proceed. Rewriting is performed in a more ad-hoc delete/replace manner. Refactoring will help keep your code clean; rewriting may not.
We Need the Vocabulary and Discipline of Refactoring
Over the last few years the term refactoring has entered the common vernacular of software developers and managers. However, I am concerned that the term is now too commonly used because I don't see enough people doing true disciplined refactoring. What has me concerned is that too many people are confusing generic code cleanups (rewriting) with refactoring. We are losing the value of the vocabulary and the discipline of refactoring.
Recently in a job interview, the person I was interviewing mentioned that he was being taught refactoring in a university course. When I asked which refactorings he used, he looked at me like I had two heads. After a bit of discussion, it turned out that the course did not expose the students to any of the refactoring terminology and patterns such as Extract Method, Extract Superclass, etc. Hence, the instructor was using the term refactoring in a generic way that had no meaning. And even worse, the students were "refactoring" without having any automated tests! To the students in this course, and to many others in the industry, refactor has become a synonym for rewrite.
But we need the vocabulary of refactoring! Having a developer tell me she refactored the code is as descriptive as her telling me she had an amazing lunch. However, if she tells me she extracted a superclass from a number of classes, I can immediately form an image in my head of what the new class structure is and what the common behavior must be like. I can then apply this knowledge when I look at the code later. Similarly, if she describes her lunch as an eggplant, yellow and red pepper, and portobello mushroom sandwich with a salad on the side, I can get hungry thinking about it.
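To make that mental image concrete, here is a minimal sketch of what the result of an Extract Superclass refactoring might look like. The class names (Party, Employee, Department) and their methods are hypothetical, chosen only for illustration. The sketch assumes that both subclasses originally carried their own copy of the name field and the annual_cost method.

```python
# Hypothetical example: before the refactoring, Employee and Department each
# had their own copy of the name field and of annual_cost().

class Party:
    """Extracted superclass holding what the two classes had in common."""

    def __init__(self, name):
        self.name = name

    def monthly_cost(self):
        # Each subclass supplies its own way of computing this.
        raise NotImplementedError

    def annual_cost(self):
        # Shared behavior, now written once instead of twice.
        return self.monthly_cost() * 12


class Employee(Party):
    def __init__(self, name, salary):
        super().__init__(name)
        self.salary = salary

    def monthly_cost(self):
        return self.salary


class Department(Party):
    def __init__(self, name, staff):
        super().__init__(name)
        self.staff = list(staff)

    def monthly_cost(self):
        return sum(member.monthly_cost() for member in self.staff)
```

Hearing only the words "extracted a superclass" is enough to predict this shape: one new parent class, the shared field and method moved up, and the subclasses left with only what differs.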
And we need the discipline of refactoring! Randomly rewriting code does not mean that the code is easier to understand or that the design that results is better. The result could be a mess, and it could lead to oscillation, as the same code is continually rewritten and "cleaned up." Refactoring, on the other hand, should lead to better design because of the steps involved, with automated tests in place to catch problems as the refactoring proceeds and to ensure that the behavior of the resulting changes is correct.
The vocabulary of refactoring provides the power to clearly describe what we mean. The discipline of refactoring helps us avoid random code rewrites that lead to bad code and incomprehensible designs. Let's not lose the value of the vocabulary and the discipline. Let's challenge each other to refactor, not rewrite!
The quality of communication is higher with refactoring than with rewriting. This is because with refactoring the person who made the changes can describe them by naming the type of refactoring that was performed. For example, a developer could say "In this change I did an 'extract method' refactoring, removing largely duplicated code from methods X, Y, and Z." The type of refactoring (extract method) immediately gives any other developer a rough feeling for the changes that were made and the complexity of the work required, without having to look at the code. Without the refactoring discipline, the only recourse for other developers who want to understand the complexity of the change is to actually look at the code before and after the changes were made.
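As an illustration of what "extract method" conveys, the following minimal sketch shows duplicated code in two functions and the same functions after the shared logic has been extracted. All names here are hypothetical, amounts are assumed to be integer cents, and the shipping fee is an invented value.

```python
SHIPPING_CENTS = 500  # hypothetical flat fee, for illustration only


# Before: the validation and summing loop is duplicated in both functions.
def invoice_total_before(prices):
    total = 0
    for price in prices:
        if price < 0:
            raise ValueError("negative price")
        total += price
    return total + SHIPPING_CENTS


def quote_total_before(prices):
    total = 0
    for price in prices:
        if price < 0:
            raise ValueError("negative price")
        total += price
    return total


# After: Extract Method moves the duplicated loop into one helper.
def checked_sum(prices):
    """Extracted method: validate and sum the prices."""
    total = 0
    for price in prices:
        if price < 0:
            raise ValueError("negative price")
        total += price
    return total


def invoice_total(prices):
    return checked_sum(prices) + SHIPPING_CENTS


def quote_total(prices):
    return checked_sum(prices)
```

The external behavior of invoice_total and quote_total is the same as that of their "before" versions; only the internal structure has changed, which is exactly what the name of the refactoring tells a reader to expect.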
The level of confidence is higher with refactoring than with rewriting. This is mostly because of the discipline required: with refactoring you are discouraged from making multiple changes at the same time and encouraged instead to make changes in small, safe steps that can each be verified through tests. When rewriting code, however, it is too easy to approach a problem with wild abandon, which often results in a worse state or no improvement.
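The small-steps idea can be sketched as follows. This is only an illustration under stated assumptions: the pricing function, its discount rule, and the test cases are all hypothetical, and amounts are integer cents. In real work there would be a single function that is edited in place, with the tests rerun after each step; the versions are kept side by side here only so that each step stays visible.

```python
# Step 0: the original function (hypothetical example).
def price_v0(quantity, item_price):
    base = quantity * item_price
    if base > 1000:
        return base * 95 // 100
    return base * 98 // 100


# Step 1: Extract Method on the base-price calculation. Nothing else changes.
def base_price(quantity, item_price):
    return quantity * item_price


def price_v1(quantity, item_price):
    base = base_price(quantity, item_price)
    if base > 1000:
        return base * 95 // 100
    return base * 98 // 100


# Step 2: Extract Method on the discount rule. Again, nothing else changes.
def discount_percent(base):
    return 95 if base > 1000 else 98


def price_v2(quantity, item_price):
    base = base_price(quantity, item_price)
    return base * discount_percent(base) // 100


# Tests written before the first step; they pin down the existing behavior,
# including the boundary at exactly 1000.
CASES = [((10, 50), 490), ((20, 50), 980), ((30, 50), 1425)]


def check(price):
    for args, expected in CASES:
        assert price(*args) == expected, (price.__name__, args)


# Run the same tests after every step. A failure points at the one small
# change that was just made.
for step in (price_v0, price_v1, price_v2):
    check(step)
```

Because each step changes only one thing, a failing test leaves very little code to inspect. A rewrite that made both changes at once, along with anything else that looked untidy, would offer no such guidance.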