8.8. Additional Examples

8.8.1. Adding Width and Height Attributes to Image Tags

This section presents a somewhat advanced example of in-place search and replace that updates HTML to ensure that all image tags have both WIDTH and HEIGHT attributes. (The HTML must be in a StringBuilder, StringBuffer, or other writable CharSequence.) Having even one image on a web page without both size attributes can make the page appear to load slowly, since the browser must actually fetch such images before it can position items on the page. Having the size within the HTML itself means that the text and other content can be properly positioned immediately, which makes the page-loading experience seem faster to the user.
When an image tag is found, the program looks within the tag for SRC, WIDTH, and HEIGHT attributes, extracting their values when present. If either the WIDTH or the HEIGHT is missing, the image is fetched to determine its size, which is then used to construct the missing attribute(s). If neither the WIDTH nor the HEIGHT is present in the original tag, the image's true size is used in creating both attributes. However, if one of the size attributes is already present in the tag, only the other is inserted, with a value that maintains the image's proper aspect ratio. (For example, if a WIDTH that's half the true size of the image is present in the HTML, the added HEIGHT attribute will be half the true height; this solution mimics how modern browsers deal with this situation.)

This example manually maintains a match pointer, as we did in the section starting on page 383. It makes use of regions (see page 384) and method chaining (see page 389) as well. Here's the code:

    // Matcher for isolating <img> tags
    Matcher mImg = Pattern.compile("(?id)<IMG\\s+(.*?)/?>").matcher(html);

    // Matchers that isolate the SRC, WIDTH, and HEIGHT attributes within a tag
    // (with very naive regexes; under (?x), the padding spaces are ignored)
    Matcher mSrc    = Pattern.compile("(?ix)\\bSRC   =(\\S+)").matcher(html);
    Matcher mWidth  = Pattern.compile("(?ix)\\bWIDTH =(\\S+)").matcher(html);
    Matcher mHeight = Pattern.compile("(?ix)\\bHEIGHT=(\\S+)").matcher(html);

    int imgMatchPointer = 0; // The first search begins at the start of the string
    while (mImg.find(imgMatchPointer))
    {
        imgMatchPointer = mImg.end(); // Next image search starts from where this one ended

        // Look for our attributes within the body of the just-found image tag
        boolean hasSrc    = mSrc.region(mImg.start(1), mImg.end(1)).find();
        boolean hasHeight = mHeight.region(mImg.start(1), mImg.end(1)).find();
        boolean hasWidth  = mWidth.region(mImg.start(1), mImg.end(1)).find();

        // If we have a SRC attribute, but are missing WIDTH and/or HEIGHT ...
        if (hasSrc && (!hasWidth || !hasHeight))
        {
            java.awt.image.BufferedImage i = // this fetches the image
                javax.imageio.ImageIO.read(new java.net.URL(mSrc.group(1)));

            String size; // Will hold the missing WIDTH and/or HEIGHT attributes
            if (hasWidth)
                // We're told the width, so compute the height that maintains the proper aspect ratio
                size = "height='" + (int)(Integer.parseInt(mWidth.group(1)) * i.getHeight() / i.getWidth()) + "' ";
            else if (hasHeight)
                // We're told the height, so compute the width that maintains the proper aspect ratio
                size = "width='" + (int)(Integer.parseInt(mHeight.group(1)) * i.getWidth() / i.getHeight()) + "' ";
            else
                // We're told neither, so just insert the actual size
                size = "width='" + i.getWidth() + "' " + "height='" + i.getHeight() + "' ";

            html.insert(mImg.start(1), size); // Update the HTML in place
            imgMatchPointer += size.length(); // Account for the new text in mImg's eyes
        }
    }

Although it's an instructive example, a few disclaimers are in order. Because the focus of the example is on in-place search and replace, I've kept some unrelated aspects of it simple by allowing it to make fairly naive assumptions about the HTML it will be passed. For example, the regular expressions don't allow whitespace around the attribute's equal sign, nor quotes around the attribute's value. (See the Perl regex on page 202 for a real-world, Java-applicable approach to matching a tag attribute.) The program doesn't handle relative URLs, nor any ill-formatted URLs for that matter, as it doesn't handle any of the exceptions that the image-fetching code might throw.

Still, it's an interesting example illustrating a number of important concepts.
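The disclaimers above note that the attribute regexes reject whitespace around the equal sign and quotes around the value. As a rough illustration only (not the approach from page 202), the sketch below shows one way the same region-based, in-place technique might look with a more tolerant attribute pattern. The class name ImgSizeSketch, the helpers attr and value, and the sizeOf lookup function are all hypothetical; sizeOf stands in for the image fetch so the sketch runs without network access. It still assumes numeric WIDTH and HEIGHT values and does nothing about relative URLs.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImgSizeSketch {
    // Isolates the body of each <img> tag in group 1
    private static final Pattern IMG = Pattern.compile("(?i)<img\\s+(.*?)/?>");

    // Hypothetical tolerant attribute pattern: optional whitespace around '=',
    // value in double quotes (group 1), single quotes (group 2), or bare (group 3)
    private static Pattern attr(String name) {
        return Pattern.compile(
            "(?i)\\b" + name + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))");
    }

    private static final Pattern SRC = attr("src");
    private static final Pattern WIDTH = attr("width");
    private static final Pattern HEIGHT = attr("height");

    // Returns the attribute's value found within [start, end), or null if absent
    private static String value(Pattern p, CharSequence html, int start, int end) {
        Matcher m = p.matcher(html).region(start, end);
        if (!m.find()) return null;
        for (int g = 1; g <= 3; g++) {
            if (m.group(g) != null) return m.group(g);
        }
        return null;
    }

    // sizeOf is an assumed stand-in for fetching the image: it maps a SRC value
    // to {trueWidth, trueHeight}
    static void addSizes(StringBuilder html, Function<String, int[]> sizeOf) {
        Matcher mImg = IMG.matcher(html);
        int pointer = 0; // manually maintained match pointer
        while (mImg.find(pointer)) {
            pointer = mImg.end();
            int start = mImg.start(1), end = mImg.end(1);
            String src = value(SRC, html, start, end);
            String width = value(WIDTH, html, start, end);
            String height = value(HEIGHT, html, start, end);
            if (src == null || (width != null && height != null)) continue;

            int[] dim = sizeOf.apply(src);
            String size;
            if (width != null)       // keep the aspect ratio for the given width
                size = "height='" + Integer.parseInt(width) * dim[1] / dim[0] + "' ";
            else if (height != null) // keep the aspect ratio for the given height
                size = "width='" + Integer.parseInt(height) * dim[0] / dim[1] + "' ";
            else                     // neither given: use the true size
                size = "width='" + dim[0] + "' height='" + dim[1] + "' ";

            html.insert(start, size); // update the HTML in place
            pointer += size.length(); // account for the inserted text
        }
    }
}
```

A call such as ImgSizeSketch.addSizes(html, src -> new int[] {100, 40}) treats every image as 100 by 40 pixels; a real caller would supply a lookup that fetches the image and handles the exceptions that can arise.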