Extract Composite | Refactoring to Patterns



Subclasses in a hierarchy implement the same Composite.

Extract a superclass that implements the Composite.

Motivation

In Extract Superclass [F], Martin Fowler explains that if you have two or more classes with similar features, it makes sense to move the common features to a superclass. This refactoring is similar: it addresses the case when the similar feature is a Composite [DP] that would be better off in a superclass.

Subclasses in hierarchies that store collections of children and have methods for reporting information about those children are common. When the children being collected happen to be classes in the same hierarchy, there's a good chance that much duplicate code can be removed by refactoring to Composite.

Removing such duplication can greatly simplify subclasses. On one project, I found that people were confused about how to add new behavior to the system, and much of the confusion stemmed from the complex, child-handling logic propagated in numerous subclasses. By applying Extract Composite, subclass code became simple, which made it easier for folks to understand how to write new subclasses. In addition, the very existence of a superclass named to express that it handled Composites communicated to developers that some rich functionality could be inherited via subclassing.

This refactoring and Extract Superclass [F] are essentially the same. I apply this refactoring when I'm only concerned with pulling up common child-handling logic to a superclass. Following that, if there is still more behavior that can be pulled up to a superclass but isn't related to the Composite, I apply the pull-up logic in Extract Superclass.

Benefits and Liabilities

+	Eliminates duplicated child-storage and child-handling logic.
+	Effectively communicates that child-handling logic may be inherited.

Mechanics

These mechanics are based on the mechanics from Extract Superclass [F].

1. Create a composite, a class that will become a Composite [DP] during this refactoring. Name this class to reflect what kind of children it will contain (e.g., CompositeTag).

Compile.

2. Make each child container (a class in the hierarchy that contains duplicate child-handling code) a subclass of your composite.

Compile.

3. In a child container, find a child-processing method that is purely duplicated or partially duplicated across the child containers. A purely duplicated method has the same method body with the same or different method names across child containers. A partially duplicated method has a method body with common and uncommon code and the same or different method names across child containers.

Whether you've found a purely duplicated or partially duplicated method, if its name isn't consistent across child containers, make it consistent by applying Rename Method [F].

For a purely duplicated method, move the child collection field referenced by the method to your composite by applying Pull Up Field [F]. Rename this field if its name doesn't make sense for all child containers. Now move the method to the composite by applying Pull Up Method [F]. If the pulled-up method relies on constructor code still residing in child containers, pull up that code to the composite's constructor.

For a partially duplicated method, see if the method body can be made consistent across all child containers by using Substitute Algorithm [F]. If so, refactor it as a purely duplicated method. Otherwise, extract the code that is common across all child-container implementations by using Extract Method [F] and pull it up to the composite by using Pull Up Method [F]. If the method body follows the same sequence of steps, some of which are implemented differently, see if you can apply Form Template Method (205).

Compile and test after each refactoring.

4. Repeat step 3 for child-processing methods in the child containers that contain purely duplicated or partially duplicated code.

5. Check each client of each child container to see if it can now communicate with the child container using the composite interface. If it can, make it do so.

Compile and test after each refactoring.
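The steps above can be sketched on a small, invented hierarchy. This is only an illustration under stated assumptions: Shape, Square, CompositeShape, Group, and Layer are hypothetical names that do not come from the text, and the template method in describe() is just one way to handle a partially duplicated method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hierarchy, invented for illustration only.
abstract class Shape {
    abstract double area();
}

class Square extends Shape {
    private final double side;
    Square(double side) { this.side = side; }
    double area() { return side * side; }
}

// Steps 1-2: the composite sits between the root and the child containers.
abstract class CompositeShape extends Shape {
    // Step 3: the child collection, pulled up with Pull Up Field.
    protected final List<Shape> children = new ArrayList<>();

    void add(Shape child) { children.add(child); }

    // Step 3: a purely duplicated method, pulled up with Pull Up Method.
    double area() {
        double total = 0;
        for (Shape child : children) total += child.area();
        return total;
    }

    // Step 3: a partially duplicated method turned into a template method;
    // only label() differs between the child containers.
    String describe() {
        return label() + " with " + children.size() + " children, area " + area();
    }

    protected abstract String label();
}

// The child containers keep only what is unique to them.
class Group extends CompositeShape {
    protected String label() { return "Group"; }
}

class Layer extends CompositeShape {
    protected String label() { return "Layer"; }
}

class MechanicsDemo {
    public static void main(String[] args) {
        // Step 5: the client talks to the composite interface.
        CompositeShape layer = new Layer();
        Group group = new Group();
        group.add(new Square(2));
        layer.add(group);
        layer.add(new Square(1));
        System.out.println(layer.describe());
    }
}
```

After the refactoring, Group and Layer contain no child-storage or child-iteration code at all, which is the duplication this refactoring targets.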

Example

This refactoring occurred on the open-source HTML Parser (see http://sourceforge.net/projects/htmlparser). When the parser parses a piece of HTML, it identifies and creates objects representing HTML tags and pieces of text. For example, here's some HTML:

 <HTML>
    <BODY>
       Hello, and welcome to my Web page! I work for
       <A HREF="http://industriallogic.com">
          <IMG src="http://industriallogic.com/images/logo141x145.gif">
       </A>
    </BODY>
 </HTML>

Given such HTML, the parser would create objects of the following types:

Tag (for the <BODY> tag)
StringNode (for the String, "Hello, and welcome . . .")
LinkTag (for the <A HREF="…"> tag)

Because the link tag (<A HREF="…">) contains an image tag (<IMG SRC="…">), you might wonder what the parser does with it. The parser represents the image tag as an ImageTag and treats it as a child of the LinkTag. When the parser notices that the link tag contains an image tag, it constructs one ImageTag object and gives it to the LinkTag object as a child.
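The resulting object structure can be sketched as follows. These classes are simplified stand-ins written for this illustration, not the real HTML Parser API: the constructors, addChild(), childCount(), childAt(), and the ParseResultDemo class are all assumptions, and the nodes are built by hand rather than by a parser.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the parser's node types; NOT the real HTML Parser API.
interface Node {
    String toPlainTextString();
}

class StringNode implements Node {
    private final String text;
    StringNode(String text) { this.text = text; }
    public String toPlainTextString() { return text; }
}

class Tag implements Node {
    protected final String name;
    Tag(String name) { this.name = name; }
    public String toPlainTextString() { return ""; } // a bare tag carries no text
}

class ImageTag extends Tag {
    final String src;
    ImageTag(String src) { super("IMG"); this.src = src; }
}

// Before the refactoring, each child container stores its own children.
class LinkTag extends Tag {
    final String href;
    private final List<Node> children = new ArrayList<>();
    LinkTag(String href) { super("A"); this.href = href; }
    void addChild(Node child) { children.add(child); }
    int childCount() { return children.size(); }
    Node childAt(int index) { return children.get(index); }
    public String toPlainTextString() {
        StringBuilder sb = new StringBuilder();
        for (Node child : children) sb.append(child.toPlainTextString());
        return sb.toString();
    }
}

class ParseResultDemo {
    // Hand-builds the nodes that the text lists for the sample HTML.
    static List<Node> sampleNodes() {
        LinkTag link = new LinkTag("http://industriallogic.com");
        link.addChild(new ImageTag("http://industriallogic.com/images/logo141x145.gif"));
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Tag("BODY"));
        nodes.add(new StringNode("Hello, and welcome to my Web page! I work for"));
        nodes.add(link);
        return nodes;
    }

    public static void main(String[] args) {
        for (Node node : sampleNodes()) {
            System.out.println(node.getClass().getSimpleName());
        }
    }
}
```

The ImageTag never appears in the top-level list; it is reachable only through the LinkTag that contains it.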

Additional tags in the parser, such as FormTag, TitleTag, and others, are also child containers. As I studied some of these classes, it didn't take long to spot duplicate code for storing and handling child nodes. For example, consider the following:

 public class LinkTag extends Tag...
    private Vector nodeVector;

    public String toPlainTextString() {
       StringBuffer sb = new StringBuffer();
       Node node;
       for (Enumeration e=linkData();e.hasMoreElements();) {
          node = (Node)e.nextElement();
          sb.append(node.toPlainTextString());
       }
       return sb.toString();
    }

 public class FormTag extends Tag...
    protected Vector allNodesVector;

    public String toPlainTextString() {
       StringBuffer stringRepresentation = new StringBuffer();
       Node node;
       for (Enumeration e=getAllNodesVector().elements();e.hasMoreElements();) {
          node = (Node)e.nextElement();
          stringRepresentation.append(node.toPlainTextString());
       }
       return stringRepresentation.toString();
    }

Because FormTag and LinkTag both contain children, they both have a Vector for storing children, though it goes by a different name in each class. Both classes need to support the toPlainTextString() operation, which outputs the non-HTML-formatted text of the tag's children, so both classes contain logic to iterate over their children and produce plain text. Yet the code to do this operation is nearly identical in these classes! In fact, there are several nearly identical methods in the child-container classes, all of which reek of duplication. So follow along as I apply Extract Composite to this code.

1. I must first create an abstract class that will become the superclass of the child-container classes. Because the child-container classes, like LinkTag and FormTag, are already subclasses of Tag, I create the following class:

 public abstract class CompositeTag extends Tag {
    public CompositeTag(
       int tagBegin,
       int tagEnd,
       String tagContents,
       String tagLine) {
       super(tagBegin, tagEnd, tagContents, tagLine);
    }
 }

2. Now I make the child containers subclasses of CompositeTag:

 public class LinkTag extends CompositeTag
 public class FormTag extends CompositeTag
 // and so on...

Note that for the remainder of this refactoring, I'll show code from only two child containers, LinkTag and FormTag, even though there are others in the code base.

3. I look for a purely duplicated method across all child containers and find toPlainTextString(). Because this method has the same name in each child container, I don't have to change its name anywhere. My first step is to pull up the Vector that stores children. I do this using the LinkTag class:

 public abstract class CompositeTag extends Tag...
    protected Vector nodeVector;  // pulled-up field

 public class LinkTag extends CompositeTag...
    // private Vector nodeVector;  (deleted: now inherited from CompositeTag)

I want FormTag to use the same newly pulled-up Vector, nodeVector (yes, it's an awful name, I'll change it soon), so I rename its local child Vector to be nodeVector:

 public class FormTag extends CompositeTag...
    // protected Vector allNodesVector;  (old name)
    protected Vector nodeVector;  // renamed from allNodesVector
 ...

Then I delete this local field (because FormTag inherits it):

 public class FormTag extends CompositeTag...
    // protected Vector nodeVector;  (deleted: inherited from CompositeTag)

Now I can rename nodeVector in the composite:

 public abstract class CompositeTag extends Tag...
    // protected Vector nodeVector;  (old name)
    protected Vector children;  // renamed from nodeVector

I'm now ready to pull up the toPlainTextString() method to CompositeTag. My first attempt at doing this with an automated refactoring tool fails because the two methods aren't identical in LinkTag and FormTag. The trouble is that LinkTag gets an iterator on its children by calling linkData(), while FormTag gets one by calling getAllNodesVector().elements():

 public class LinkTag extends CompositeTag...
    public Enumeration linkData() {
       return children.elements();
    }

    public String toPlainTextString()...
       for (Enumeration e= linkData();e.hasMoreElements();)
          ...

 public class FormTag extends CompositeTag...
    public Vector getAllNodesVector() {
       return children;
    }

    public String toPlainTextString()...
       for (Enumeration e= getAllNodesVector().elements();e.hasMoreElements();)
          ...

To fix this problem, I must create a consistent method for getting access to a CompositeTag's children. I do this by making LinkTag and FormTag implement an identical method, called children(), which I pull up to CompositeTag:

 public abstract class CompositeTag extends Tag...
    public Enumeration children() {
       return children.elements();
    }

The automated refactoring in my IDE now lets me easily pull up toPlainTextString() to CompositeTag. I run my tests and everything works fine.
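At this point the composite might look roughly like the sketch below. It is a simplified reconstruction, not the parser's actual source: the Tag constructor arguments are omitted, generics are added for readability, and addChild() and the StepThreeDemo class are hypothetical additions that make the sketch runnable.

```java
import java.util.Enumeration;
import java.util.Vector;

// Simplified stand-ins; the real parser classes carry more state.
interface Node {
    String toPlainTextString();
}

class StringNode implements Node {
    private final String text;
    StringNode(String text) { this.text = text; }
    public String toPlainTextString() { return text; }
}

abstract class Tag implements Node { }

abstract class CompositeTag extends Tag {
    // Pulled up from LinkTag and renamed from nodeVector.
    protected Vector<Node> children = new Vector<>();

    // Hypothetical helper for populating the composite; not shown in the text.
    public void addChild(Node child) { children.add(child); }

    // The consistent accessor that replaced linkData() and getAllNodesVector().
    public Enumeration<Node> children() {
        return children.elements();
    }

    // Pulled up once both subclasses iterated via children().
    public String toPlainTextString() {
        StringBuffer sb = new StringBuffer();
        for (Enumeration<Node> e = children(); e.hasMoreElements();) {
            sb.append(e.nextElement().toPlainTextString());
        }
        return sb.toString();
    }
}

// Neither child container declares a child Vector or toPlainTextString() anymore.
class LinkTag extends CompositeTag { }

class FormTag extends CompositeTag { }

class StepThreeDemo {
    public static void main(String[] args) {
        LinkTag link = new LinkTag();
        link.addChild(new StringNode("Industrial Logic"));
        FormTag form = new FormTag();
        form.addChild(new StringNode("Visit "));
        form.addChild(link);
        System.out.println(form.toPlainTextString());
    }
}
```

Because a child can itself be a CompositeTag, the single pulled-up method handles nested containers without any per-subclass code.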

4. In this step I repeat step 3 for additional methods that may be pulled up from the child containers to the composite. There happen to be several of these methods. I'll show you one that involves a method called toHTML(). This method outputs the HTML of a given node. Both LinkTag and FormTag have their own implementations for this method. To implement step 3, I must first decide whether toHTML() is purely duplicated or partially duplicated.

Here's a look at how LinkTag implements the method:

 public class LinkTag extends CompositeTag...
    public String toHTML() {
       StringBuffer sb = new StringBuffer();
       putLinkStartTagInto(sb);
       Node node;
       for (Enumeration e = children();e.hasMoreElements();) {
          node = (Node)e.nextElement();
          sb.append(node.toHTML());
       }
       sb.append("</A>");
       return sb.toString();
    }

    public void putLinkStartTagInto(StringBuffer sb) {
       sb.append("<A ");
       String key,value;
       int i = 0;
       for (Enumeration e = parsed.keys();e.hasMoreElements();) {
          key = (String)e.nextElement();
          i++;
          if (key!=TAGNAME) {
             value = getParameter(key);
             sb.append(key+"=\""+value+"\"");
             if (i<parsed.size()-1) sb.append(" ");
          }
       }
       sb.append(">");
    }

After creating a buffer, putLinkStartTagInto(…) deals with getting the contents of the start tag into the buffer, along with any attributes it may have. The start tag would be something like <A HREF="…"> or <A NAME="…">, where HREF and NAME represent attributes of the tag. The tag could have children, such as a StringNode, as in <A HREF="…">I'm a string node</A>, or child ImageTag instances. Finally, there is the end tag, </A>, which must be added to the result buffer before the HTML representation of the tag is returned.

Let's now see how FormTag implements the toHTML() method:

 public class FormTag extends CompositeTag...
    public String toHTML() {
       StringBuffer rawBuffer = new StringBuffer();
       Node node,prevNode=null;
       rawBuffer.append("<FORM METHOD=\""+formMethod+"\" ACTION=\""+formURL+"\"");
       if (formName!=null && formName.length()>0)
          rawBuffer.append(" NAME=\""+formName+"\"");
       Enumeration e = children.elements();
       node = (Node)e.nextElement();
       Tag tag = (Tag)node;
       Hashtable table = tag.getParsed();
       String key,value;
       for (Enumeration en = table.keys();en.hasMoreElements();) {
          key=(String)en.nextElement();
          if (!(key.equals("METHOD")
             || key.equals("ACTION")
             || key.equals("NAME")
             || key.equals(Tag.TAGNAME))) {
             value = (String)table.get(key);
             rawBuffer.append(" "+key+"="+"\""+value+"\"");
          }
       }
       rawBuffer.append(">");
       rawBuffer.append(lineSeparator);
       for (;e.hasMoreElements();) {
          node = (Node)e.nextElement();
          if (prevNode!=null) {
             if (prevNode.elementEnd()>node.elementBegin()) {
                // It's a new line
                rawBuffer.append(lineSeparator);
             }
          }
          rawBuffer.append(node.toHTML());
          prevNode=node;
       }
       return rawBuffer.toString();
    }

This implementation has some similarities and differences compared with the LinkTag implementation. Therefore, according to the definition presented earlier in the Mechanics section, toHTML() should be treated as a partially duplicated child-container method. That means that my next step is to see if I can make one implementation of this method by applying the refactoring Substitute Algorithm [F].

It turns out I can. It is easier than it looks because both versions of toHTML() essentially do the same three things: output the start tag along with any attributes, output any child tags, and output the close tag. Knowing that, I arrive at a common method for dealing with the start tag, which I pull up to CompositeTag:

 public abstract class CompositeTag extends Tag...
    public void putStartTagInto(StringBuffer sb) {
       sb.append("<" + getTagName() + " ");
       String key,value;
       int i = 0;
       for (Enumeration e = parsed.keys();e.hasMoreElements();) {
          key = (String)e.nextElement();
          i++;
          if (key!=TAGNAME) {
             value = getParameter(key);
             sb.append(key+"=\""+value+"\"");
             if (i<parsed.size()) sb.append(" ");
          }
       }
       sb.append(">");
    }

 public class LinkTag extends CompositeTag...
    public String toHTML() {
       StringBuffer sb = new StringBuffer();
       putStartTagInto(sb);
       ...

 public class FormTag extends CompositeTag...
    public String toHTML() {
       StringBuffer rawBuffer = new StringBuffer();
       putStartTagInto(rawBuffer);
       ...

I perform similar operations to make a consistent way of obtaining HTML from child nodes and from an end tag. All of that work enables me to pull up one generic toHTML() method to the composite:

 public abstract class CompositeTag extends Tag...
    public String toHTML() {
       StringBuffer htmlContents = new StringBuffer();
       putStartTagInto(htmlContents);
       putChildrenTagsInto(htmlContents);
       putEndTagInto(htmlContents);
       return htmlContents.toString();
    }
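The text does not show the bodies of putChildrenTagsInto(…) and putEndTagInto(…). The sketch below is one plausible shape for them, offered as an assumption rather than the parser's actual code. It also simplifies putStartTagInto(…) by keeping attributes in an insertion-ordered map, and it ignores the line-separator handling that FormTag's original toHTML() performed. The Tag constructor, setAttribute(), addChild(), and ToHtmlDemo are hypothetical.

```java
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Vector;

interface Node {
    String toHTML();
}

class StringNode implements Node {
    private final String text;
    StringNode(String text) { this.text = text; }
    public String toHTML() { return text; }
}

// Simplified stand-in: attributes live in an insertion-ordered map (an assumption).
abstract class Tag implements Node {
    private final String tagName;
    protected final Map<String, String> attributes = new LinkedHashMap<>();
    Tag(String tagName) { this.tagName = tagName; }
    public String getTagName() { return tagName; }
    public void setAttribute(String key, String value) { attributes.put(key, value); }
}

abstract class CompositeTag extends Tag {
    protected Vector<Node> children = new Vector<>();
    CompositeTag(String tagName) { super(tagName); }
    public void addChild(Node child) { children.add(child); }
    public Enumeration<Node> children() { return children.elements(); }

    // The generic method from the text: start tag, children, end tag.
    public String toHTML() {
        StringBuffer htmlContents = new StringBuffer();
        putStartTagInto(htmlContents);
        putChildrenTagsInto(htmlContents);
        putEndTagInto(htmlContents);
        return htmlContents.toString();
    }

    // Simplified relative to the version shown in the text.
    public void putStartTagInto(StringBuffer sb) {
        sb.append("<").append(getTagName());
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            sb.append(" ").append(attribute.getKey())
              .append("=\"").append(attribute.getValue()).append("\"");
        }
        sb.append(">");
    }

    // Assumed body: append each child's HTML in order.
    public void putChildrenTagsInto(StringBuffer sb) {
        for (Enumeration<Node> e = children(); e.hasMoreElements();) {
            sb.append(e.nextElement().toHTML());
        }
    }

    // Assumed body: derive the end tag from the tag name.
    public void putEndTagInto(StringBuffer sb) {
        sb.append("</").append(getTagName()).append(">");
    }
}

class LinkTag extends CompositeTag {
    LinkTag() { super("A"); }
}

class FormTag extends CompositeTag {
    FormTag() { super("FORM"); }
}

class ToHtmlDemo {
    public static void main(String[] args) {
        LinkTag link = new LinkTag();
        link.setAttribute("HREF", "http://industriallogic.com");
        link.addChild(new StringNode("I'm a string node"));
        System.out.println(link.toHTML());
    }
}
```

Deriving the end tag from getTagName() is what lets a single toHTML() serve every child container, instead of each one hard-coding its own closing tag as LinkTag did with "</A>".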

To complete this part of the refactoring, I'll continue to move child-related methods to CompositeTag, though I'll spare you the details.

5. The final step involves checking clients of child containers to see if they can now communicate with the child containers using the CompositeTag interface. Here, the parser itself has no such clients, so I'm finished with the refactoring.
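For a code base that does have such clients, step 5 might look like the following sketch. NodeCounter and countNodes() are hypothetical, invented for this illustration, as are the stripped-down node classes; nothing here is taken from the parser.

```java
import java.util.Enumeration;
import java.util.Vector;

// Stripped-down stand-ins for the parser's types.
interface Node { }

class StringNode implements Node { }

abstract class Tag implements Node { }

abstract class CompositeTag extends Tag {
    protected Vector<Node> children = new Vector<>();
    public void addChild(Node child) { children.add(child); }
    public Enumeration<Node> children() { return children.elements(); }
}

class LinkTag extends CompositeTag { }

class FormTag extends CompositeTag { }

// Hypothetical client. Before the refactoring it would have needed one
// overload per child container (one using linkData(), another using
// getAllNodesVector()); afterward a single method accepts any CompositeTag.
class NodeCounter {
    static int countNodes(CompositeTag tag) {
        int count = 1; // the tag itself
        for (Enumeration<Node> e = tag.children(); e.hasMoreElements();) {
            Node child = e.nextElement();
            if (child instanceof CompositeTag) {
                count += countNodes((CompositeTag) child);
            } else {
                count += 1;
            }
        }
        return count;
    }
}

class ClientDemo {
    public static void main(String[] args) {
        LinkTag link = new LinkTag();
        link.addChild(new StringNode());
        FormTag form = new FormTag();
        form.addChild(new StringNode());
        form.addChild(link);
        System.out.println(NodeCounter.countNodes(form));
    }
}
```

The client depends only on CompositeTag, so a new child container added later works with it unchanged.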

