
The `JEditorPane` Control

The basic design of JEditorPane makes it a very simple component to use for the common task of loading and displaying a page of HTML or a plain text document. In fact, as we'll demonstrate, it's as simple as creating a JEditorPane and passing it a pointer to the document that you want to load, in the form of a URL.

A Simple HTML Viewer

Figure 4-1 shows a simple example of a JEditorPane displaying a Web page downloaded directly from the Internet.

Figure 4-1. Using `JEditorPane` to display a Web page.

You can try this example out by typing the command

 java AdvancedSwing.Chapter4.EditorPaneExample1

and then typing the URL of the Web page you'd like to view in the text field at the bottom of the frame. In this case, the Prentice Hall home page has been loaded from the URL http://www.prenhall.com. If you load this page and compare it with what you see in a real Web browser, you'll probably find it hard to see the difference. In fact, very little code was needed to create this mini HTML viewer, as you can see from Listing 4-1.

Most of the code in Listing 4-1 is concerned with creating the frame and adding the label and the JTextField that allows you to type a Uniform Resource Locator (URL); only a few lines are directly concerned with JEditorPane itself. First, the JEditorPane is created using its default constructor, which loads an empty document, as you can see when the program starts. The next line ensures the pane's editing capabilities are turned off, the reason for which you'll see later in this chapter when we look in detail at the HTML support built into JEditorPane. Nothing else of any interest happens until you press the return key in the JTextField. At this point, the URL of the Web page you want to see is read from the input field and passed to the JEditorPane setPage method. That's all you need to do to make it load an HTML page.

Listing 4-1 A Simple HTML Viewer Implemented Using `JEditorPane`

 package AdvancedSwing.Chapter4;

 import java.awt.*;
 import java.awt.event.*;
 import java.io.*;
 import javax.swing.*;
 import javax.swing.text.*;

 public class EditorPaneExample1 extends JFrame {
    public EditorPaneExample1() {
       super("JEditorPane Example 1");

       pane = new JEditorPane();
       pane.setEditable(false); // Read-only
       getContentPane().add(new JScrollPane(pane), "Center");

       // Label and text field used to enter the URL
       JPanel panel = new JPanel();
       panel.setLayout(new BorderLayout(4, 4));
       JLabel urlLabel = new JLabel("URL: ", JLabel.RIGHT);
       panel.add(urlLabel, "West");
       textField = new JTextField(32);
       panel.add(textField, "Center");

       getContentPane().add(panel, "South");

       // Change page based on text field
       textField.addActionListener(new ActionListener() {
          public void actionPerformed(ActionEvent evt) {
             String url = textField.getText();

             try {
                // Try to display the page
                pane.setPage(url);
             } catch (IOException e) {
                JOptionPane.showMessageDialog(pane,
                   new String[] {
                      "Unable to open file",
                      url
                   }, "File Open Error",
                   JOptionPane.ERROR_MESSAGE);
             }
          }
       });
    }

    public static void main(String[] args) {
       JFrame f = new EditorPaneExample1();

       // Exit the application when the frame is closed
       f.addWindowListener(new WindowAdapter() {
          public void windowClosing(WindowEvent evt) {
             System.exit(0);
          }
       });
       f.setSize(500, 400);
       f.setVisible(true);
    }

    private JEditorPane pane;
    private JTextField textField;
 }

Initializing a `JEditorPane`

There are several ways to load a JEditorPane with some content and arrange for it to be displayed. The most obvious way is to use one of the following three constructors:

 public JEditorPane(String url) throws IOException
 public JEditorPane(URL url) throws IOException
 public JEditorPane(String type, String text)

The first two constructors both load a document given its URL, either in the form of a String or as a URL object. If you pass the URL in the form of a String, JEditorPane attempts to create the corresponding URL object for you and throws an IOException if the String is not a valid representation of a URL. Given a valid URL object, the document is loaded from its source, which may be a local file or a Web server somewhere; if an error occurs while the data is being read, an IOException is again the result. The fact that these constructors require a URL does not imply that only HTML files can be loaded; you need to use a URL no matter what type of data is in the file being read, be it HTML, RTF, plain text, or a private format of your own.

The third constructor allows you to load text from a String variable. This might be useful if you wanted to display some HTML that you create dynamically or load independently from some external source. In this case, the type argument tells JEditorPane how to interpret the text; if it's encoded in HTML, the type would be passed as "text/html". You'll see more shortly about how the document type is established for the other constructors and how it is used.
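As a minimal sketch of the third constructor, the following code builds a pane directly from an HTML String. The class name StringContentExample and the markup are our own illustrations and are not part of the book's example code.

```java
import javax.swing.JEditorPane;

class StringContentExample {
    // Hypothetical markup; any HTML held in a String would do.
    static final String HTML =
        "<html><body><h2>Status</h2><p>All systems running.</p></body></html>";

    // Creates a read-only pane whose content comes from a String, not a URL.
    static JEditorPane createPane() {
        JEditorPane pane = new JEditorPane("text/html", HTML);
        pane.setEditable(false);
        return pane;
    }

    public static void main(String[] args) {
        JEditorPane pane = createPane();
        // Report what the constructor installed for the given type.
        System.out.println("Content type: " + pane.getContentType());
        System.out.println("Document:     " + pane.getDocument().getClass().getName());
    }
}
```

Because no URL is involved, this constructor does not declare IOException, which makes it convenient for content that is generated within the application.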

If, as in Listing 4-1, you don't know the URL of the document you want to load when the JEditorPane is created, you can use the default constructor instead. In this case, the component is initialized with an empty Document that will allow it to display plain text. There are several ways to load content into an existing JEditorPane. If you want to load some fixed text, you can simply use the setText method:

 public void setText (String t);

If you haven't already loaded a Document into the JEditorPane, the contents of the String argument will be treated as plain text. If, however, the component is already loaded with some content or you have explicitly installed an EditorKit and a Document for some other type, the text will be considered to be of the same type as that for which the component is currently initialized. This means, for example, that if you load an HTML page into a JEditorPane and then replace its content using setText, the String will be expected to be formatted in HTML. This is not, however, a common way to load a JEditorPane and, if you use it, you will probably want to explicitly set the content type so that the String is interpreted properly. You'll see how to do that shortly.

Core Note

The setText method completely replaces the content of the JEditorPane by installing a new Document. Because of this, it makes no sense to create a JEditorPane, manually install a DefaultStyledDocument to which you apply a logical style, and then use setText to add the text. You might try this approach to create a component in which all the text is displayed in a 15-point, italic font, for example, by creating an empty document and setting a logical style containing this font, expecting it to apply to the text subsequently installed using setText. This would fail, however, because the setText method would remove the original Document and so lose the attributes in the logical style. If you want to set the model content without losing the logical style, you can use the Document insertString method to install the text.
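The insertString approach might look like the following sketch. It assumes a StyledEditorKit is installed so that the pane holds a DefaultStyledDocument; the class name LogicalStyleExample and the style name "italic15" are arbitrary choices of ours.

```java
import javax.swing.JEditorPane;
import javax.swing.text.BadLocationException;
import javax.swing.text.Style;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;
import javax.swing.text.StyledEditorKit;

class LogicalStyleExample {
    // Installs a styled document, applies a logical style, then adds text
    // through the Document so that the style is kept.
    static JEditorPane createPane(String text) {
        JEditorPane pane = new JEditorPane();
        pane.setEditorKit(new StyledEditorKit()); // brings a DefaultStyledDocument with it
        StyledDocument doc = (StyledDocument) pane.getDocument();

        // Logical style for the first (and so far only) paragraph.
        Style style = doc.addStyle("italic15", null); // hypothetical style name
        StyleConstants.setItalic(style, true);
        StyleConstants.setFontSize(style, 15);
        doc.setLogicalStyle(0, style);

        try {
            // Insert into the existing Document instead of calling setText.
            doc.insertString(0, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return pane;
    }

    public static void main(String[] args) {
        JEditorPane pane = createPane("Fifteen-point italic text");
        StyledDocument doc = (StyledDocument) pane.getDocument();
        System.out.println("Italic: " + StyleConstants.isItalic(doc.getLogicalStyle(0)));
    }
}
```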

The most common way to open a document in JEditorPane, and the one shown in Listing 4-1, is to use one of the setPage methods:

 public void setPage(String url) throws IOException
 public void setPage(URL url) throws IOException

These methods look the same as two of the constructors shown previously and, in fact, they are directly called from the constructors that they mirror. As before, if there is a problem with the URL or an error occurs while reading the data, an IOException is thrown.
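To illustrate that setPage is not restricted to HTML or to Web servers, here is a small sketch that writes a plain text file to a temporary location and loads it through a file: URL. The class name SetPageExample, the temporary file prefix, and the helper method are our own assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.swing.JEditorPane;

class SetPageExample {
    // Writes the content to a temporary .txt file and loads it with setPage.
    static JEditorPane loadTemporaryFile(String content) {
        try {
            Path file = Files.createTempFile("setpage-example", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, content.getBytes(StandardCharsets.US_ASCII));

            URL url = file.toUri().toURL(); // a file: URL for the local file
            JEditorPane pane = new JEditorPane();
            pane.setEditable(false);
            pane.setPage(url); // throws IOException if the file cannot be read
            return pane;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        JEditorPane pane = loadTemporaryFile("Loaded through a file: URL");
        System.out.println(pane.getContentType() + ": " + pane.getText());
    }
}
```

Plain text is read before setPage returns. HTML documents may be loaded in the background, so code that needs the finished document should not assume it is complete as soon as setPage returns.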

Content Type and Editor Kit Selection

The most powerful feature of JEditorPane is that it can adapt itself to be able to display and edit many kinds of data by plugging in the appropriate EditorKit and the corresponding Document model. You've already seen this in action in the simple Web browser example in the last section, where an EditorKit and a model suitable for handling HTML were installed automatically. How does JEditorPane know the type of data that it is working with and how does that determine the helper classes that it installs? There are actually several ways for JEditorPane to find out which document type it is handling:

By explicit installation of an editor kit.
From the setContentType method.
Automatically from its input source.

These three mechanisms are actually three different levels of application programming interface (API), starting with the most primitive level and working up to the most abstract. Let's look at how each of them works.

Installation of an Editor Kit

To do its job properly, JEditorPane needs to be configured with the right EditorKit and the correct Document implementation for the type of data that it is handling. There is a tight binding between the EditorKit and the Document classes that it can handle. This is reflected in the fact that EditorKit has a method called createDefaultDocument that can be used to obtain an instance of the appropriate Document type for a given EditorKit. Thus, to obtain a consistent set of classes for JEditorPane, it is sufficient to create the right EditorKit for the type of data that you want it to display and use the JEditorPane setEditorKit method to install it. When you do this, a new Document of the correct type will automatically be used. Table 4-1 shows the correlation between the type of content to be displayed, the EditorKit and the Document class required.

Table 4-1. Mapping from Content Type to Editor Kit and Document Class
*Content Type*	*Editor Kit*	*Document Class*
Plain text	`PlainEditorKit`	`PlainDocument`
RTF	`RTFEditorKit`	`DefaultStyledDocument`
HTML	`HTMLEditorKit`	`HTMLDocument`
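The pairings in Table 4-1 can be checked with a few lines of code. This sketch installs each kit with setEditorKit and reports the Document class that the pane ends up with; the class and method names are ours.

```java
import javax.swing.JEditorPane;
import javax.swing.text.EditorKit;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

class KitDocumentPairs {
    // Installs the kit in a fresh pane and returns the simple name of the
    // Document class that setEditorKit put in place.
    static String documentClassFor(EditorKit kit) {
        JEditorPane pane = new JEditorPane();
        pane.setEditorKit(kit);
        return pane.getDocument().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // The default constructor already installs the plain text pairing.
        JEditorPane plain = new JEditorPane();
        System.out.println(plain.getEditorKit().getClass().getName()
            + " -> " + plain.getDocument().getClass().getSimpleName());

        System.out.println("RTFEditorKit -> " + documentClassFor(new RTFEditorKit()));
        System.out.println("HTMLEditorKit -> " + documentClassFor(new HTMLEditorKit()));
    }
}
```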

Core Note

As described earlier, when the setEditorKit method is invoked, a new empty Document of the type shown in the table will be installed. If you need to install a custom Document type, you can either follow the setEditorKit invocation with an explicit call to setDocument, or you can subclass the editor kit and override its createDefaultDocument method to return an instance of your Document class instead of the default one. To make this work, you'll need to change the mapping from content type to editor kit so that your editor kit is used instead of the usual one. You'll see how this mapping is done in "The setContentType Method".
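The second option in the note, overriding createDefaultDocument, could be sketched as follows. NotesDocument and NotesEditorKit are hypothetical classes invented for this illustration. The kit is installed directly with setEditorKit here, so no change to the content type mapping is involved.

```java
import javax.swing.JEditorPane;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.StyledEditorKit;

class CustomDocumentKitExample {
    // Hypothetical Document subclass; a real one would add its own behavior.
    static class NotesDocument extends DefaultStyledDocument {
    }

    // Editor kit that hands out NotesDocument instead of the default model.
    static class NotesEditorKit extends StyledEditorKit {
        @Override
        public Document createDefaultDocument() {
            return new NotesDocument();
        }
    }

    static JEditorPane createPane() {
        JEditorPane pane = new JEditorPane();
        // setEditorKit calls createDefaultDocument, so the custom model
        // is installed without a separate setDocument call.
        pane.setEditorKit(new NotesEditorKit());
        return pane;
    }

    public static void main(String[] args) {
        System.out.println(createPane().getDocument().getClass().getName());
    }
}
```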

Let's look at a simple example of how this works. Earlier, we said that you could use the setText method to install some text into a JEditorPane. If you simply create a JEditorPane using its default constructor and then call setText, the text that you pass will be interpreted as plain text and displayed with no special formatting. Suppose instead that you wanted to pass a String containing HTML and have it displayed as if it had been loaded into a Web browser. One way to achieve this is to create a JEditorPane using the default constructor, invoke setEditorKit to install an HTML editor kit, and then use setText to supply it with some HTML to format. You can see this process in action by typing the following command:

 java AdvancedSwing.Chapter4.EditorPaneExample2

This example creates a frame with a JEditorPane and a JTextArea into which you can type some text to be installed into the JEditorPane, as shown in Figure 4-2.

Figure 4-2. Using `JEditorPane` to display HTML and plain text.

As well as the text area, the lower half of the frame displays the type of Document and the editor kit that are currently installed in the JEditorPane. Because the JEditorPane was created using its default constructor, it starts with a PlainDocument and a PlainEditorKit installed. If you now type some text into the text area and press the option button (also known as a radio button) labeled Install text, the text that you typed will appear in the JEditorPane, exactly as you typed it, because both text components are working with plain text. If you now select the option button marked html, two things will happen. First, the document type and editor kit at the bottom of the frame will change to HTMLDocument and HTMLEditorKit respectively and, second, the content of the JEditorPane will disappear.

What happened here? Here's the code that is connected to the option buttons in this example:

 ActionListener radioButtonListener = new ActionListener() {
    public void actionPerformed(ActionEvent evt) {
       JRadioButton b = (JRadioButton)evt.getSource();
       String type = (b == plainButton) ? "text/plain" : "text/html";
       final EditorKit kit = pane.getEditorKitForContentType(type);

       SwingUtilities.invokeLater(new Runnable() {
          public void run() {
             pane.setEditorKit(kit);
          }
       });
    }
 };

When either of the buttons is pressed, the actionPerformed method is executed in the Abstract Window Toolkit (AWT) event thread. This method looks at the source of the event to decide whether to switch to HTML or plain text. Here, the variable plainButton contains a reference to the option button used to select plain text, so it can be used to determine which of the two buttons was pressed. Based on this, the JEditorPane getEditorKitForContentType method is called with argument text/plain or text/html. You'll find out more about this method later in this section, but it should be obvious that this call is getting a reference to an editor kit suitable for handling plain text or HTML, respectively. Having obtained the editor kit, it is installed by calling the setEditorKit method.

Core Note

Because a large part of the code for this example is concerned only with setting up the user interface, to save space the complete listing is not shown here; only the code relevant to this discussion is included. If you want to look at the complete example, you can find it on the CD-ROM that accompanies this book.

When a new editor kit is installed, the Document associated with the previous one will no longer be appropriate, because each editor kit requires its own specific Document subclass. In this case, when the HTMLEditorKit is installed, it creates an empty HTMLDocument, which replaces the PlainDocument used by the PlainEditorKit, causing the loss of all its content. This is why the JEditorPane becomes empty after the change from plain text to HTML.

Now, if you press the install text button one more time, the text from the JTextArea will be installed into the JEditorPane again. Here is the code that is executed when the button is pressed:

 installButton.addActionListener(new ActionListener() {
    public void actionPerformed(ActionEvent evt) {
       // Get the text and install it in the JEditorPane
       SwingUtilities.invokeLater(new Runnable() {
          public void run() {
             pane.setText(textArea.getText());
          }
       });
    }
 });

This code simply uses the setText method of JEditorPane to install the new text. This time, because an HTMLDocument and HTMLEditorKit are installed, the text is interpreted as HTML rather than as plain text. In Figure 4-2, the text typed into the text area contained HTML markup and, as you can see, the header text, which was surrounded by H2 tags, has been rendered differently from the rest of the text and appears in a paragraph of its own, set off from the body of the text. If you were now to press the Plain Text option button, the PlainEditorKit and PlainDocument would be installed and the HTML would disappear from the upper display. Pressing Install text again would show the HTML tags and the text as it appears in the JTextArea.
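The sequence just described can be reproduced without any user interface. The following sketch (the class name KitSwitchExample and the sample markup are ours) installs the same String before and after switching to the HTML editor kit and captures the Document text at each stage.

```java
import javax.swing.JEditorPane;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.EditorKit;

class KitSwitchExample {
    static final String MARKUP = "<h2>Heading</h2>Body text"; // hypothetical sample

    // Returns the Document text after each of the three steps.
    static String[] run() {
        JEditorPane pane = new JEditorPane();

        pane.setText(MARKUP);                 // plain kit: tags are ordinary characters
        String asPlain = documentText(pane);

        EditorKit htmlKit = pane.getEditorKitForContentType("text/html");
        pane.setEditorKit(htmlKit);           // installs a new, empty HTMLDocument
        String afterSwitch = documentText(pane);

        pane.setText(MARKUP);                 // HTML kit: the String is parsed as HTML
        String asHtml = documentText(pane);

        return new String[] { asPlain, afterSwitch, asHtml };
    }

    static String documentText(JEditorPane pane) {
        Document doc = pane.getDocument();
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] stages = run();
        System.out.println("Plain:        [" + stages[0] + "]");
        System.out.println("After switch: [" + stages[1] + "]");
        System.out.println("HTML:         [" + stages[2].trim() + "]");
    }
}
```

The methods are called directly in this sketch because no component is displayed. The listeners in EditorPaneExample2 defer the same calls with invokeLater.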

Core Note

In the earlier code extract, the setEditorKit method is not invoked directly. Instead, the SwingUtilities invokeLater method is called to arrange to have it executed after the actionPerformed method returns. As you know, you can only safely update Swing components from within the AWT event thread. However, this code was already in the event thread, so there was no reason to postpone the call to setEditorKit to ensure thread safety. The reason for deferring it is to make the user interface (UI) appear slightly more responsive. When the option button is pressed, its appearance changes to reflect the fact that it has been activated; in fact, its background changes to gray, but the black dot that indicates that it has been selected doesn't appear until the action associated with it has been carried out. Changing the editor kit can take a relatively long time. If this operation were carried out in the actionPerformed method, the user would see the option button in its transitional state for a discernible amount of time. By deferring this operation, the option button is allowed to redraw itself in its final state before the editor kit is switched, which gives the user the impression of a more responsive application.

The only aspect of this example that you haven't yet seen is how the labels showing the editor kit and document type get updated. You might expect the code that changes the editor kit in response to a change in the state of the option buttons to extract the new values and update the labels but, as you can see from the code presented earlier, that isn't how it happens. In fact, both the editor kit and the Document are bound properties of the JEditorPane, so changes can be detected by registering for PropertyChangeEvents, which is how this example works. Here is the code that responds to these events:

 // Listen to the properties of the editor pane
 pane.addPropertyChangeListener(new PropertyChangeListener() {
    public void propertyChange(PropertyChangeEvent evt) {
       String prop = evt.getPropertyName();
       if (prop.equals("document")) {
          docLabel.setText(evt.getNewValue().getClass().getName());
       } else if (prop.equals("editorKit")) {
          kitLabel.setText(evt.getNewValue().getClass().getName());
       }
    }
 });

You may have noticed that when it is displaying plain text, JEditorPane uses an editor kit called JEditorPane.PlainEditorKit (which, because of the way in which inner class names are constructed, will actually appear as JEditorPane$PlainEditorKit). This might surprise you, because in Chapter 1 we said that the text components use DefaultEditorKit when handling plain text. In fact, this is the case for JTextField, JPasswordField, and JTextArea, but not for JEditorPane. The reason for the difference is the ViewFactory. As you saw in Chapter 3, every text component needs a ViewFactory, which is supplied either by its editor kit or its UI class. For the simple text components, the ViewFactory is part of the UI class, so DefaultEditorKit does not provide one. Because it can, theoretically, handle any type of document, it is not possible for JEditorPane's UI class to supply a fixed ViewFactory, because the Views required will depend entirely on the type of document being rendered. JEditorPane expects its editor kit to supply a ViewFactory that can manufacture views appropriate to its associated document type. Because DefaultEditorKit does not supply a ViewFactory, JEditorPane cannot use it directly. PlainEditorKit is, in fact, a subclass of DefaultEditorKit that overrides only the getViewFactory method to return a ViewFactory that creates a WrappedPlainView for every Element in the document.
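The difference between the two kits is easy to observe. This short sketch (class and method names are ours) asks a DefaultEditorKit and the kit installed by JEditorPane's default constructor for their ViewFactory.

```java
import javax.swing.JEditorPane;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.EditorKit;

class ViewFactoryExample {
    // True if a bare DefaultEditorKit supplies a ViewFactory.
    static boolean defaultKitHasFactory() {
        return new DefaultEditorKit().getViewFactory() != null;
    }

    // The kit that JEditorPane's default constructor installs.
    static EditorKit paneKit() {
        return new JEditorPane().getEditorKit();
    }

    public static void main(String[] args) {
        EditorKit kit = paneKit();
        System.out.println("DefaultEditorKit has a ViewFactory: " + defaultKitHasFactory());
        System.out.println("Kit installed in JEditorPane:       " + kit.getClass().getName());
        System.out.println("That kit has a ViewFactory:         " + (kit.getViewFactory() != null));
    }
}
```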

The `setContentType` Method

The example you have just seen demonstrates that you can arrange for the display capabilities of JEditorPane to be changed according to the type of document that you want it to handle by using the setEditorKit method. Setting the editor kit, however, requires you to know which editor kit to use for your document type. If you are using one of the types for which JEditorPane has built-in support, you can avoid having to hard-code this information in your application by using the higher-level setContentType method:

 public final void setContentType (String type);

This method maps the type argument to the correct editor kit and installs it by invoking setEditorKit, as was shown in the earlier example. In fact, it uses the same technique as you saw in that example to perform the mapping from type to editor kit. So what is the type argument and how does the getEditorKitForContentType method use it to obtain an editor kit? The type argument has the following format:

 MIME-type ; parameters

The optional parameter section is a space- or comma-separated list of values of the form key = value, most of which, if they are present at all, are not directly used by JEditorPane. You'll see later in this chapter how you can use the parameter list to specify the character encoding of the document's data. The interesting part of the content type is the first part, which specifies the document's data type in the form of a MIME (Multipurpose Internet Mail Extensions) type description. MIME document types are specified in two parts: a type and a subtype. The complete set of valid MIME types is forever growing; you can find out all about MIME by reading the Internet RFCs (Request For Comments) numbered 2045 through 2049, which you can find on the Internet at the following URL: http://ds.internic.net/rfc/rfc2045.txt and so on.
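As a brief sketch of this format, the code below passes a content type with a charset parameter to setContentType and then reads back what the pane reports. The class name ContentTypeParameters and the particular charset are only examples.

```java
import javax.swing.JEditorPane;

class ContentTypeParameters {
    // Installs the given content type in a fresh pane and returns the
    // MIME type that the pane reports afterwards.
    static String installedType(String contentType) {
        JEditorPane pane = new JEditorPane();
        pane.setContentType(contentType);
        return pane.getContentType();
    }

    public static void main(String[] args) {
        // MIME type followed by an optional parameter list.
        String full = "text/html; charset=ISO-8859-1";
        System.out.println(full + " -> " + installedType(full));
    }
}
```

Only the MIME type part selects the editor kit, so the reported content type does not include the parameters.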

JEditorPane comes with built-in mappings from four MIME types to the editor kits needed to support documents of that type, as shown in Table 4-2. This table is, of course, very similar to Table 4-1, which showed the mapping from content type name to the editor kit and the corresponding Document class.

Table 4-2. Mapping from MIME Type to Editor Kit
*MIME Type*	*Editor Kit*
`text/plain`	`PlainEditorKit`
`text/rtf`	`RTFEditorKit`
`application/rtf`	`RTFEditorKit`
`text/html`	`HTMLEditorKit`

As you can see, there are actually two MIME types that correspond to an RTF document. These default mappings are installed automatically when a JEditorPane is created. There are several methods that can be used to create new mappings or retrieve existing ones:

 public EditorKit getEditorKitForContentType(String type)
 public void setEditorKitForContentType(String type, EditorKit k)
 public static EditorKit createEditorKitForContentType(String type)
 public static void registerEditorKitForContentType(
                 String type, String classname)
 public static void registerEditorKitForContentType(
                 String type, String classname,
                 ClassLoader loader)

At first glance, it looks like there is duplicate functionality here. For example, getEditorKitForContentType and createEditorKitForContentType both return an EditorKit given a content type in the form of a String. In fact, there is a three-level scheme for holding the mapping between type and editor kit, as shown in Figure 4-3.

Figure 4-3. The `EditorKit` registry.

At the bottom of the diagram is the editor kit type registry, which stores in a Hashtable the mapping from the content type to the fully qualified class name of the class that implements the editor kit for that type of content. For example, it might have an entry mapping the MIME type text/html to the class name javax.swing.text.html.HTMLEditorKit. The registry is initialized with a hard-coded default set of mappings equivalent to those shown in Table 4-2, installed using the static method registerEditorKitForContentType. This method has two variants, one of which includes a ClassLoader argument. In fact, the editor kit type registry consists of two Hashtables, one mapping content type to class name and one mapping content type to the ClassLoader required to load the editor kit class. When no ClassLoader is given, no entry is made in the second Hashtable. Both of these Hashtables are held as static data, so there is only one copy shared by all JEditorPanes.
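The content of the type registry can be inspected with the static method getEditorKitClassNameForContentType. This method is not among those listed above; to our knowledge it was added in a JDK release later than the one this discussion is based on, so treat the sketch as applying to more recent JDKs. The class name TypeRegistryLookup and the unregistered MIME type are ours.

```java
import javax.swing.JEditorPane;

class TypeRegistryLookup {
    // Returns the registered editor kit class name, or null if the
    // MIME type has no entry in the shared type registry.
    static String kitClassFor(String mimeType) {
        return JEditorPane.getEditorKitClassNameForContentType(mimeType);
    }

    public static void main(String[] args) {
        String[] types = {
            "text/plain", "text/rtf", "application/rtf", "text/html",
            "application/x-unregistered-example" // hypothetical, not registered
        };
        for (String type : types) {
            System.out.println(type + " -> " + kitClassFor(type));
        }
    }
}
```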

When JEditorPane (or other code) needs an EditorKit, it calls the getEditorKitForContentType method, which looks first in the TypeHandler Hashtable, shown at the top of Figure 4-3. This table maps from content type to an instance of an EditorKit. Unlike the editor kit type table, there is one copy of the TypeHandler table for each JEditorPane instance. This table is initially empty, so getEditorKitForContentType will not find the required EditorKit there. As a result, it invokes the static method createEditorKitForContentType, which looks in yet another Hashtable: the EditorKit registry shown in the middle of Figure 4-3. Like the EditorKit type registry, this is a static member of JEditorPane, so it is a shared resource. Again, this table is initially empty, so the required EditorKit won't be found here either. Finally, createEditorKitForContentType uses the MIME type to access the EditorKit type registry (at the bottom of Figure 4-3) to find the required EditorKit's class name and the corresponding ClassLoader if applicable. If the MIME type is known, the required class is loaded (using the ClassLoader if one is configured) and an instance of the class is loaded into the static EditorKit registry, with the MIME type as the key.

Now that the EditorKit registry has an instance of the EditorKit, createEditorKitForContentType creates a copy of it (using its clone method) and returns it to getEditorKitForContentType, which stores it in the TypeHandler table. Subsequent calls to getEditorKitForContentType on the same JEditorPane object with the same content type will return the same EditorKit from the TypeHandler table. If, however, this method is invoked on a different JEditorPane, it won't find the entry in its TypeHandler table because this table is not shared. Instead, it will call createEditorKitForContentType, as before. This time, however, an instance of the correct EditorKit will be found in the static EditorKit registry and a new cloned copy will be returned and stored in the second JEditorPane's TypeHandler table.

The result of this is that, in the case of HTML, for example, a single HTMLEditorKit (let's call it instance A) will be loaded into the static EditorKit registry the first time any JEditorPane needs to display an HTML document. A cloned copy of this EditorKit (instance B) will then be loaded into that JEditorPane's TypeHandler table and will also be installed in the JEditorPane. If the document type of the same JEditorPane is changed to some other type and then back to HTML, the same EditorKit (instance B) will be used again. If another JEditorPane needs to display an HTML document, it won't find an HTMLEditorKit in its own TypeHandler table, but it will find instance A in the shared EditorKit registry and will clone it, creating instance C, which it will place in its TypeHandler table and use to load the HTML document. Subsequent calls to retrieve an HTMLEditorKit in this second JEditorPane will also retrieve instance C.
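The visible consequence of this scheme is that repeated lookups on one JEditorPane yield the same kit object, whereas two different panes each get their own clone. A minimal sketch, with class and method names of our own choosing:

```java
import javax.swing.JEditorPane;
import javax.swing.text.EditorKit;

class KitInstanceSharing {
    // Two lookups on the same pane: the second comes from its own table.
    static boolean sameKitWithinOnePane() {
        JEditorPane pane = new JEditorPane();
        EditorKit first = pane.getEditorKitForContentType("text/html");
        EditorKit second = pane.getEditorKitForContentType("text/html");
        return first == second;
    }

    // Lookups on two panes: each pane receives its own cloned kit.
    static boolean sameKitAcrossPanes() {
        JEditorPane one = new JEditorPane();
        JEditorPane two = new JEditorPane();
        return one.getEditorKitForContentType("text/html")
            == two.getEditorKitForContentType("text/html");
    }

    public static void main(String[] args) {
        System.out.println("Same instance within one pane: " + sameKitWithinOnePane());
        System.out.println("Same instance across panes:    " + sameKitAcrossPanes());
    }
}
```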

If you want to add your own custom EditorKit for use with a private document type, you can do so by installing a mapping from the content type of your document to the name of your custom EditorKit's class in the EditorKit type registry. This would make the new document type available to all JEditorPanes in the application. For example, if you implement an EditorKit in the class com.mycom.text.MyTypeEditorKit for a new type of document to which you assign the MIME type application/x-my-type, you can make it globally available like this:

 JEditorPane.registerEditorKitForContentType(
       "application/x-my-type",
       "com.mycom.text.MyTypeEditorKit");

Core Note

Actually, if you are using Java 2, you will need to use the form of registerEditorKitForContentType that specifies the ClassLoader to use instead of the simpler form shown here, which is sufficient only in Java Development Kit (JDK) 1.1. The more complex form works both for JDK 1.1 and Java 2. You'll see later why this is necessary.

You can arrange for this EditorKit to be loaded using setContentType:

 JEditorPane pane = new JEditorPane();
 pane.setContentType("application/x-my-type");

This doesn't, of course, install the appropriate Document class, which you will need to implement and install an instance of using setDocument. On the other hand, if you want to make this EditorKit available in one specific JEditorPane, you can use the setEditorKitForContentType method instead. To do this, you will need to have created the JEditorPane and an instance of the EditorKit:

 JEditorPane pane = new JEditorPane();
 EditorKit myKit = new MyTypeEditorKit();
 pane.setEditorKitForContentType("application/x-my-type", myKit);
 pane.setContentType("application/x-my-type");
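Both registration routes are combined in the following self-contained sketch. The nested MyTypeEditorKit is only a stand-in of ours: it extends StyledEditorKit so that a working Document and ViewFactory are available, which a real custom kit would normally replace. The global route uses the three-argument form of registerEditorKitForContentType with the kit's own ClassLoader.

```java
import javax.swing.JEditorPane;
import javax.swing.text.EditorKit;
import javax.swing.text.StyledEditorKit;

class CustomKitRegistration {
    static final String TYPE = "application/x-my-type";

    // Stand-in custom kit. It must be public with a public no-argument
    // constructor so that JEditorPane can instantiate it by class name.
    public static class MyTypeEditorKit extends StyledEditorKit {
        public MyTypeEditorKit() {
        }

        @Override
        public String getContentType() {
            return TYPE;
        }
    }

    // Makes the kit available to one specific pane only.
    static JEditorPane installForOnePane() {
        JEditorPane pane = new JEditorPane();
        EditorKit myKit = new MyTypeEditorKit();
        pane.setEditorKitForContentType(TYPE, myKit);
        pane.setContentType(TYPE);
        return pane;
    }

    // Registers the kit class for every pane, naming its ClassLoader.
    static JEditorPane installGlobally() {
        JEditorPane.registerEditorKitForContentType(
            TYPE,
            MyTypeEditorKit.class.getName(),
            MyTypeEditorKit.class.getClassLoader());
        JEditorPane pane = new JEditorPane();
        pane.setContentType(TYPE);
        return pane;
    }

    public static void main(String[] args) {
        System.out.println(installForOnePane().getEditorKit().getClass().getName());
        System.out.println(installGlobally().getEditorKit().getClass().getName());
    }
}
```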

Configuration from the Input Source

You've now seen two different ways to determine the EditorKit and Document installed into a JEditorPane: you can either call getEditorKitForContentType and install the returned EditorKit, or you can use the more convenient setContentType method, which performs those two steps for you. Either way, you need to have the content type of the document that you need to display in the form of a MIME description. If you look back to Listing 4-1, however, you won't see any evidence of a MIME type being used. The only line of code in that example that loads content into the JEditorPane is this one:

 pane.setPage(url);

The setPage method is given only a URL. Using this, it reads the file to be loaded using the URLConnection class in the java.net package. URLConnection has a method called getContentType that returns the MIME type of the file that it is reading. The means by which it determines the MIME type depends on exactly how the file is being read. If the file is being retrieved from a Web server using the HTTP protocol, the content type will be determined by the Web server and will be returned as part of the HTTP header information. Web servers are typically configured to recognize content types based on the suffix of the name of the actual file being read so that, for example, files whose names end with .htm or .html are expected to hold HTML. For such files, the Web server would return the content type text/html. In other cases, such as when the file is on the local file system and is not retrieved via a Web server, the process of determining the file type might also involve looking at the filename suffix for certain well-known cases. If this doesn't work, another common technique is to look for easily recognized patterns stored at the start of the file's data. Graphic Interchange Format (GIF) files, for example, start with the characters G, I, and F in that order.
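Both local techniques, the filename suffix and the leading bytes of the data, are exposed by static helper methods of URLConnection. The sketch below calls them directly; it does not show what setPage itself does internally, and the class name, file names, and sample bytes are ours.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class ContentTypeGuessing {
    // Guess based on the filename suffix alone.
    static String fromName(String fileName) {
        return URLConnection.guessContentTypeFromName(fileName);
    }

    // Guess based on recognizable patterns at the start of the data.
    static String fromLeadingBytes(byte[] data) {
        // ByteArrayInputStream supports mark/reset, which the guess relies on.
        try (InputStream in = new ByteArrayInputStream(data)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("index.html -> " + fromName("index.html"));
        System.out.println("notes.txt  -> " + fromName("notes.txt"));

        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println("GIF header -> " + fromLeadingBytes(gifHeader));
    }
}
```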

Core Note

The workings of the URLConnection class and Java networking in general are beyond the scope of this book. If you want to find out more about these topics, I recommend that you read Core Java 2, Volume 2: Advanced Features, by Cay Horstmann and Gary Cornell, which discusses networking in detail.

However the content type is determined, the setPage method retrieves it from the URLConnection and passes it to setContentType to initialize the JEditorPane with the correct EditorKit.

Loading Document Content

You've seen how JEditorPane selects and installs the appropriate EditorKit and you've also seen a couple of ways to load the document data itself. In fact, as with the EditorKit, there are several layers of API involved in document loading and you need to be able to choose the one most appropriate to your circumstances. In this section, we'll show you all the different possibilities and how they relate to each other.

The JEditorPane examples that you've seen so far both have editing disabled, so that the document content cannot be changed. However, as its name suggests, you can use JEditorPane as an editor for whichever type of document it contains. If you allow the user to change the document, you'll also need to be able to write the amended content out in a form appropriate to the original document encoding, which could mean writing out HTML tags. In this section, you'll also see how to arrange for JEditorPane to write data to an output stream.

Document Loading

So far in this chapter, you have seen three different ways to load a Document into a JEditorPane:

At construction time, passing a URL object or a URL specification.
Using the setPage method, passing a URL or a URL specification.
Using the setText method, providing the content directly in the form of a string.

These three ways of initializing a JEditorPane are not all independent of each other. There is a hierarchy of methods that can be used to load some content; which one you choose will depend in part on the form in which you have the data and also on the current state of the JEditorPane. Using setText, for example, requires that you first obtain the data in the form of a String and that you already have the appropriate EditorKit selected. Figure 4-4 shows the relationship between the various methods involved in setting up a JEditorPane.

Figure 4-4. `JEditorPane` document loading methods.

Figure 4-4 shows the flow of control from the point at which you invoke a method that changes the content of the component up to the point at which the new Document object is installed. To keep the diagram simple, not all the methods that get called are shown; only the more important ones have been included. All the methods shown in the diagram are part of the public API of JEditorPane or EditorKit; the ones that you might reasonably want to use directly are highlighted in bold for ease of reference.

Much of what happens in the upper part of the diagram should already be familiar to you, because we have already seen the use of both setPage and setText to install text. Loading a page using the constructor is the same as creating a JEditorPane with the default constructor and then explicitly invoking setPage. At the next level down, however, there are a couple of things that haven't been covered yet in this chapter. Ultimately, all text loading passes through a method called read, either in JEditorPane or its superclass JTextComponent. There are three slightly different versions of the read method:

void read(InputStream in, Document doc) throws IOException
public void read(InputStream in, Object obj) throws IOException
public void read(Reader reader, Object obj) throws IOException

The way in which the first of these works is fairly obvious from its parameter list: the content is read from the given InputStream and used to populate the Document supplied as the second argument, which is then installed into the JEditorPane. Notice that you can't directly access this method, because it has package scope, which is why it is not highlighted in Figure 4-4 as being a useful override point or a useful method to invoke. The other two variants, however, are slightly different. The second form still requires a document source in the form of an InputStream, but expects an Object instead of a Document as its second argument. The third form is identical, except that it takes a Reader as its input source. What is the actual type of the Object argument supplied to both these methods and how is it used? It turns out that the second method is simply a wrapper around the third one that first takes special action if the Object argument is an HTMLDocument and the JEditorPane has an HTMLEditorKit installed. If this is the case, it directly loads the HTML from the InputStream into the HTMLDocument, just as if the first variant had been invoked. Otherwise, it creates a Reader from the InputStream (by wrapping it with an InputStreamReader, which maps an 8-bit incoming byte stream into a 16-bit Unicode character stream) and invokes the third read method, which is actually implemented by JTextComponent.

The JTextComponent read method creates a new Document of the type expected by the installed EditorKit, and then uses the Reader it is passed to get the data to install into it. The Object argument is simply stored as a property of the Document called Document.StreamDescriptionProperty. How this is used (if at all) depends on the actual Document implementation itself. As you'll see later, HTMLDocument interprets this property as the base URL from which to resolve relative URL references within the HTML page that it is loading.
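To see the Object argument being stored, you can call the Reader-based read method directly and then inspect the new Document. The following is a small sketch under stated assumptions: the class StreamDescriptionDemo and its load method are invented for illustration, and the description string is an arbitrary value, not something Swing requires.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.JEditorPane;
import javax.swing.text.Document;

// Illustrative class; only read and getProperty are Swing API.
class StreamDescriptionDemo {
    /** Loads text through JTextComponent.read and returns the new Document. */
    static Document load(String text, Object description) {
        // The default constructor installs the plain text editor kit
        JEditorPane pane = new JEditorPane();
        try {
            // Creates a fresh Document, fills it from the Reader and
            // installs it in the pane
            pane.read(new StringReader(text), description);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pane.getDocument();
    }

    public static void main(String[] args) {
        Document doc = load("Hello, world", "notes.txt");
        System.out.println("Description: "
              + doc.getProperty(Document.StreamDescriptionProperty));
        System.out.println("Length: " + doc.getLength());
    }
}
```

For a plain text document the stored value is not used any further; it simply travels with the Document so that code such as the HTML package can retrieve it later.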

Core Note

All the read methods take their input from either an 8-bit InputStream or a Unicode-based character source accessed via a Reader. If the constructor or the setPage method is used to load a new document, the URLConnection getInputStream method is used to get the InputStream that the read method needs. What about the setText method, which is given the document content in the form of a String? Fortunately, the java.io package includes a class called StringReader that allows a String to be viewed as a Unicode character stream. The setText method creates a StringReader with the supplied document text as its source and calls the third variant of read shown earlier. You can find out more about the java.io package in Core Java 2, Volume 1: Fundamentals, by Cay Horstmann and Gary Cornell.

All three of the read methods assume that you have already initialized the JEditorPane with an EditorKit suitable for the type of document being read; in addition, the first method requires you to have a Document of the correct type, whereas the other two will create a new Document of the appropriate type for you. As you can see if you refer to Figure 4-4, these methods all end up invoking an EditorKit method called read, which is defined as follows:

 public void read(Reader reader, Document doc, int offset)
               throws IOException, BadLocationException

This method is where the work of loading the document and building the Element structure with the appropriate attributes actually gets done. What happens in this method depends on the actual EditorKit being used. Here's what the three editor kits supplied with Swing actually do:

`DefaultEditorKit` and `JEditorPane.PlainEditorKit`	These `EditorKits` expect to read plain, unformatted text straight into a `PlainDocument`. Therefore, there is very little to do other than read the data stream and insert it directly into the model using the `insertString` method. However, conventions for marking the end of a line of text vary from platform to platform. UNIX platforms delimit lines with a single newline character (`\n`), whereas DOS and Microsoft Windows use the two-character carriage return/newline sequence (`\r\n`). The text components avoid the complication of having to cope with both conventions by replacing a `\r\n` pair with a single `\n` before inserting the text into the model. To mark the content as having been processed in this way, the document property `EndOfLineStringProperty` is set to the `String` value `\r\n`. If this mapping was not performed, this property will have the value `null`.
`HTMLEditorKit`	This editor kit expects to read HTML from its input source. As it reads the HTML, it parses it into tags and builds an `HTMLDocument`. This process is covered in detail in "The Swing HTML Package".
`RTFEditorKit`	Reads the input source and interprets it as Rich Text Format. You can use this editor kit to load documents saved from Microsoft Word or WordPad, as long as they were specifically saved in RTF format.
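The line-ending translation performed by the plain text editor kit is easy to observe by calling the EditorKit read method yourself. This is a minimal sketch, assuming a document that starts out empty; the class LineEndingDemo and its helper methods are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.Document;

// Illustrative class; the helper names are not part of Swing.
class LineEndingDemo {
    /** Reads text into a new, empty PlainDocument using DefaultEditorKit. */
    static Document readPlain(String text) {
        DefaultEditorKit kit = new DefaultEditorKit();
        Document doc = kit.createDefaultDocument();
        try {
            kit.read(new StringReader(text), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return doc;
    }

    /** Returns the complete text held in the model. */
    static String textOf(Document doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = readPlain("one\r\ntwo\r\n");

        // The model holds only \n line terminators...
        System.out.println("Contains \\r: "
              + (textOf(doc).indexOf('\r') >= 0));

        // ...and the original convention is recorded as a property
        Object eol = doc.getProperty(
              DefaultEditorKit.EndOfLineStringProperty);
        System.out.println("Recorded \\r\\n: " + "\r\n".equals(eol));
    }
}
```

The recorded property is what allows the original line terminators to be restored when the content is written back out.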

The offset argument theoretically causes the text to be loaded into the Document at the specified location, moving any existing content at that position to make room for it. When this method is called indirectly from the higher-level methods shown in Figure 4-4, the offset is always given as zero, implying that the content will be loaded at the start of the model (and in most cases this is academic because the model is initially empty). If you directly invoke the EditorKit read method, however, you can specify a non-zero offset provided that it does not lie beyond the current end of the model (an offset equal to the current size of the model is, of course, valid).
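As a sketch of what a non-zero offset does, the following code puts some text into a PlainDocument and then uses the DefaultEditorKit read method to load more text into the middle of it. The class OffsetReadDemo and the method insertAt are invented for illustration; only plain text is used here, so the result says nothing about how the other editor kits treat the offset.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.Document;

// Illustrative class; the helper name is not part of Swing.
class OffsetReadDemo {
    /**
     * Creates a document holding "existing", reads "extra" into it
     * at the given offset and returns the resulting text.
     */
    static String insertAt(String existing, String extra, int offset) {
        DefaultEditorKit kit = new DefaultEditorKit();
        Document doc = kit.createDefaultDocument();
        try {
            doc.insertString(0, existing, null);
            // Existing content at the offset moves up to make room
            kit.read(new StringReader(extra), doc, offset);
            return doc.getText(0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Load into the middle of the model
        System.out.println(insertAt("abcdef", "XYZ", 3));

        // An offset equal to the current size appends
        System.out.println(insertAt("abc", "XYZ", 3));
    }
}
```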

Core Alert

At the time of writing, RTFEditorKit ignores the offset argument and always loads new content at the start of the document.

At this point, you've seen almost all there is to know about the mechanics of loading text into a JEditorPane. While this might have been academically interesting, you could be forgiven for wondering whether there was any real point in looking at the details of this process when the higher-level API seems to work well enough in most cases. The reason for taking such a close look at the mechanics of the Document loading process is that it makes it easier to understand the next three topics in this chapter: loading Document content asynchronously, handling input that is not encoded in your system's default character set, and handling RTF documents. Even though the basic API that you saw earlier in this chapter will allow you to make use of the first two of these features without having to worry about what goes on "under the hood," having a proper understanding of the Document loading process will help when it comes to going beyond what is possible using the supplied API alone.

Asynchronous Page Loading

Figure 4-4 implies that the setPage method loads a new document into the JEditorPane synchronously, so that the document is completely loaded when setPage returns control to its caller. This is not always true, however, as you can see by typing the command:

 java AdvancedSwing.Chapter4.EditorPaneExample3

When this example program starts, you'll see a blank JEditorPane and a text field into which you can type the URL of a file to load. There is an example HTML file to load in the same directory as the example programs for this chapter. If you installed the examples in the directory c:\AdvancedSwing\Examples, then you can load this file using the URL:

 file:///c:\AdvancedSwing\Examples\AdvancedSwing\Chapter4\EditorPaneExample2.html

When you press return, you'll see that the field labeled "State" changes to "Loading" and shortly afterward to "Loaded." At the same time, the "Type" field shows that the type of the loaded file is "text/html." However, at this point the JEditorPane will still be blank; it will take a few seconds for the HTML page to completely load, at which point your screen will look something like Figure 4-5.

Figure 4-5. `JEditorPane` asynchronous page loading.

You can see the code that handles the event generated when you press the return key in Listing 4-2. Ignoring the details for now, you can see that fundamentally all that happens here is that the label of the "State" field is set to "Loading," the URL for the file to be loaded is obtained from the text field and passed to the setPage method and, finally, when the setPage method returns, the "State" field's label is changed to "Loaded" and the "Type" field is set to the loaded document's content type.

The fact that the document is not completely loaded when setPage returns demonstrates that at least some of the processing initiated by setPage is carried out asynchronously. Here's what actually happens.

Listing 4-2 Demonstrating Asynchronous Page Loading with `setPage`

 // Change page based on text field
 textField.addActionListener(new ActionListener() {
    public void actionPerformed(ActionEvent evt) {
       String url = textField.getText();

       try {
          // Try to display the page
          loadingState.setText("Loading...");
          loadingState.paintImmediately(0, 0,
                loadingState.getSize().width,
                loadingState.getSize().height);
          loadedType.setText("");
          loadedType.paintImmediately(0, 0,
                loadedType.getSize().width,
                loadedType.getSize().height);
          pane.setPage(url);

          // setPage has returned, but an HTML page may still be loading
          loadingState.setText("Loaded");
          loadedType.setText(pane.getContentType());
       } catch (IOException e) {
          JOptionPane.showMessageDialog(pane,
             new String[] {
                "Unable to open file",
                url
             }, "File Open Error",
             JOptionPane.ERROR_MESSAGE);
          loadingState.setText("Failed");
       }
    }
 });

1. When setPage is called, it gets a URLConnection to the file to be loaded and uses its getContentType method to obtain the content type of the document to be read. In this case, the content type will be text/html.
2. The JEditorPane is switched to the right mode for loading the document by calling setContentType. As you know, this selects the correct EditorKit and installs the appropriate Document type, which, in this case, will be an HTMLDocument.
3. If the Document is a subclass of AbstractDocument, its getAsynchronousLoadPriority method is called. This method returns an integer that determines whether the document will be loaded synchronously or asynchronously. If the value returned from this method is negative, the document will be loaded synchronously and will complete before setPage returns. Otherwise, loading will be asynchronous.
4. If asynchronous loading is required, the new document is installed into the JEditorPane (by calling setDocument); at this point, the Document will be empty so the JEditorPane will remain blank. Then, a new thread is created. This thread uses the EditorKit read(InputStream in, Document doc) method to load the document from the InputStream corresponding to the URLConnection opened in step 1 of this process. The dispatching priority of this thread is set to the value returned by the getAsynchronousLoadPriority method.
5. Having created the background loading thread and started it, setPage returns to its caller.

Whether the document will have completely loaded when setPage returns depends entirely on the value returned by the getAsynchronousLoadPriority method. The implementation of this method in AbstractDocument looks for a Document property called AsyncLoadPriority whose value is expected to be of type java.lang.Integer. By default, this property is not set, in which case getAsynchronousLoadPriority will return -1 and synchronous loading will be used. There are two ways to arrange for asynchronous loading to take place:

Implement and use a Document subclass that overrides the getAsynchronousLoadPriority method to return a non-negative value.
Set the AsyncLoadPriority property of the Document to be loaded to a non-negative value using the AbstractDocument setAsynchronousLoadPriority method.

The second of these two methods is more convenient than the first, because you don't need to subclass the Document implementation for the type of content being loaded. The HTMLEditorKit in the Swing HTML package is the only one of the standard EditorKits that supports asynchronous loading, which it does by calling setAsynchronousLoadPriority in its createDefaultDocument method. As a result, all HTML pages are loaded asynchronously (with thread dispatching priority 4 in the current implementation), while all other types load synchronously.
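You can check the load priority of any AbstractDocument directly, which is a quick way to find out whether setPage would load it in the background. The sketch below compares a PlainDocument with the document created by HTMLEditorKit and then opts a plain document in to asynchronous loading; the class LoadPriorityDemo and its helper methods are invented for illustration, and the priority value 3 is an arbitrary choice.

```java
import javax.swing.text.AbstractDocument;
import javax.swing.text.PlainDocument;
import javax.swing.text.html.HTMLEditorKit;

// Illustrative class; the helper names are not part of Swing.
class LoadPriorityDemo {
    /** Returns the document that HTMLEditorKit would install. */
    static AbstractDocument newHtmlDocument() {
        return (AbstractDocument) new HTMLEditorKit()
              .createDefaultDocument();
    }

    /** A negative priority means that loading is synchronous. */
    static boolean loadsAsynchronously(AbstractDocument doc) {
        return doc.getAsynchronousLoadPriority() >= 0;
    }

    public static void main(String[] args) {
        AbstractDocument plain = new PlainDocument();
        AbstractDocument html = newHtmlDocument();

        System.out.println("Plain priority: "
              + plain.getAsynchronousLoadPriority());
        System.out.println("HTML priority: "
              + html.getAsynchronousLoadPriority());

        // Opt in to background loading without subclassing
        plain.setAsynchronousLoadPriority(3);
        System.out.println("Plain now asynchronous: "
              + loadsAsynchronously(plain));
    }
}
```

Passing a negative value to setAsynchronousLoadPriority removes the property again, which returns the document to synchronous loading.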

Although asynchronous loading is usually a useful feature because it allows the user interface to continue to be responsive while a page loaded from a slow network is still loading, it can have its drawbacks. Suppose you want to know when the new page has finished loading, perhaps because you want to count the number of words in the document or do something more complex such as display a list of the links that it contains in a separate pane. If the document were loaded synchronously, you could scan the document counting the words or looking for links immediately after the setPage method returns, but this won't work for HTML documents. In an example shown earlier, you saw how to extract the content type of a document being loaded by registering a PropertyChangeListener for a bound property of the JEditorPane called document and calling the JEditorPane getContentType method in the listener's propertyChange method. Unfortunately, you can't use that technique in this case because the HTMLDocument is installed before page loading begins, so your listener will be called long before the page has loaded. Fortunately, there is another JEditorPane bound property, called page, that holds the URL of the currently loaded page. This property changes when page loading is complete, so you can use the PropertyChangeEvent that this property generates to receive notification that the HTML document has been read and the Document structure and attributes have been completely built.

Core Note

This property is set for all types of documents, not just for HTML documents, so this technique can be used whenever the setPage method is called to load a document. It also works whether the document was loaded synchronously or asynchronously.

To demonstrate the use of this property, suppose you want to display a busy cursor while a page is being loaded. If you display this cursor just before the setPage method is called and revert to the previous cursor when it returns, you won't get the right effect if asynchronous loading is used because, as you saw above, setPage returns long before the page is properly displayed. The correct approach is to register a PropertyChangeListener for the page property and switch the cursor back when this property changes, as shown in Listing 4-3.

Listing 4-3 Using the `Page` Property to Detect the End of Document Loading

 package AdvancedSwing.Chapter4;

 import java.awt.*;
 import java.awt.event.*;
 import java.beans.*;
 import java.io.*;
 import java.net.*;
 import javax.swing.*;
 import javax.swing.text.*;

 public class EditorPaneExample4 extends JFrame {
    public EditorPaneExample4() {
       super("JEditorPane Example 4");

       pane = new JEditorPane();
       pane.setEditable(false); // Read-only
       getContentPane().add(new JScrollPane(pane), "Center");

       // Build the panel of controls
       JPanel panel = new JPanel();

       panel.setLayout(new GridBagLayout());
       GridBagConstraints c = new GridBagConstraints();
       c.gridwidth = 1;
       c.gridheight = 1;
       c.anchor = GridBagConstraints.EAST;
       c.fill = GridBagConstraints.NONE;
       c.weightx = 0.0;
       c.weighty = 0.0;

       JLabel urlLabel = new JLabel("URL: ", JLabel.RIGHT);
       panel.add(urlLabel, c);
       JLabel loadingLabel = new JLabel("State: ", JLabel.RIGHT);
       c.gridy = 1;
       panel.add(loadingLabel, c);
       JLabel typeLabel = new JLabel("Type: ", JLabel.RIGHT);
       c.gridy = 2;
       panel.add(typeLabel, c);

       c.gridx = 1;
       c.gridy = 0;
       c.gridwidth = 1;
       c.weightx = 1.0;
       c.fill = GridBagConstraints.HORIZONTAL;

       textField = new JTextField(32);
       panel.add(textField, c);
       loadingState = new JLabel(spaces, JLabel.LEFT);
       loadingState.setForeground(Color.black);
       c.gridy = 1;
       panel.add(loadingState, c);
       loadedType = new JLabel(spaces, JLabel.LEFT);
       loadedType.setForeground(Color.black);
       c.gridy = 2;
       panel.add(loadedType, c);

       getContentPane().add(panel, "South");

       // Change page based on text field
       textField.addActionListener(new ActionListener() {
          public void actionPerformed(ActionEvent evt) {
             String url = textField.getText();

             try {
                // Check if the new page and the old
                // page are the same.
                URL newURL = new URL(url);
                URL loadedURL = pane.getPage();
                if (loadedURL != null &&
                         loadedURL.sameFile(newURL)) {
                   return;
                }

                // Try to display the page
                textField.setEnabled(false);   // Disable input
                textField.paintImmediately(0, 0,
                         textField.getSize().width,
                         textField.getSize().height);
                setCursor(Cursor.getPredefinedCursor(
                         Cursor.WAIT_CURSOR));  // Busy cursor
                loadingState.setText("Loading...");
                loadingState.paintImmediately(0, 0,
                         loadingState.getSize().width,
                         loadingState.getSize().height);
                loadedType.setText("");
                loadedType.paintImmediately(0, 0,
                         loadedType.getSize().width,
                         loadedType.getSize().height);
                pane.setPage(url);

                loadedType.setText(pane.getContentType());
             } catch (Exception e) {
                System.out.println(e);
                JOptionPane.showMessageDialog(pane,
                   new String[] {
                      "Unable to open file",
                      url
                   }, "File Open Error",
                   JOptionPane.ERROR_MESSAGE);
                loadingState.setText("Failed");
                textField.setEnabled(true);
                setCursor(Cursor.getDefaultCursor());
             }
          }
       });

       // Listen for page load to complete
       pane.addPropertyChangeListener(
                               new PropertyChangeListener() {
          public void propertyChange(PropertyChangeEvent evt) {
             if (evt.getPropertyName().equals("page")) {
                loadingState.setText("Page loaded.");
                textField.setEnabled(true); // Allow entry of new URL
                setCursor(Cursor.getDefaultCursor());
             }
          }
       });
    }

    public static void main(String[] args) {
       JFrame f = new EditorPaneExample4();

       f.addWindowListener(new WindowAdapter() {
          public void windowClosing(WindowEvent evt) {
             System.exit(0);
          }
       });
       f.setSize(500, 400);
       f.setVisible(true);
    }

    private static final String spaces = " ";

    private JEditorPane pane;
    private JTextField textField;
    private JLabel loadingState;
    private JLabel loadedType;
 }

The two areas of interest in this listing are the actionPerformed method of the ActionListener attached to the JTextField, and the PropertyChangeListener. The actionPerformed method is similar to the one shown in Listing 4-2, but there are a couple of important differences:

A URL object is created from the filename typed into the input field. This URL is compared with the URL of the object currently loaded in the JEditorPane and, if they match, the load is not performed.
Before setPage is called, the JTextField is disabled to prevent further user input and the cursor is switched to the platform-specific WAIT_CURSOR.

The second of these two differences is the motivation behind this example because it provides the user with feedback regarding the state of the application. The other change looks like a simple optimization but, in fact, there is slightly more to it than that. To see why this change is necessary, look at the propertyChange method, which is called when a bound property of the JEditorPane changes. In this example, this method checks whether the page property has been changed and, if it has, it reverts the cursor to the default and re-enables the JTextField, reversing the steps taken by the actionPerformed method. A PropertyChangeEvent for the page property is generated when the setPage method completes the loading of a page, either synchronously or asynchronously.

However, the event will be generated only if the page being loaded is not the same as the current one. In fact, the setPage method won't even start loading a page that is already installed. Because of this, if we hadn't checked in advance that the page was about to change, the state changes made in the actionPerformed method would never be reversed because the propertyChange method would not be called.

If you try this example using the command

 java AdvancedSwing.Chapter4.EditorPaneExample4

and then type the same URL into the text field as you used with the last example, you should see that the input field is disabled and the cursor changes as soon as you press the RETURN key. These changes are reversed only when the page has completely loaded and the status changes to reflect that.

Character Set Handling

As you've seen, all operations that cause content to be loaded into a JEditorPane eventually result in an invocation of the underlying EditorKit's read method. This method has two variants, one of which uses an InputStream as the input source, the other using a Reader. At a slightly higher level, JEditorPane also has a pair of read methods that take input from either an InputStream or a Reader. As mentioned previously, if you supply an InputStream as the source to the JEditorPane read method, it will be converted to a Reader by wrapping it with an InputStreamReader object. The conversion from an 8-bit InputStream to a Unicode character Reader is necessary because the Swing text package works internally only with 16-bit Unicode character encodings.

The file systems of most computer systems in the world do not store text files in Unicode format; instead, they use a more compact 8-bit encoding such as ASCII, or the ISO Latin-1 superset of ASCII that also includes many special characters not available with ASCII itself. The actual encoding of characters used on a particular system depends on the language requirements of the users of that system. Because 256 different values is nowhere near enough to simultaneously handle all the character sets in use today, there are many standard 8-bit encodings that encompass various character sets in use around the world. For example, in Western Europe, the usual encoding used on the Windows platform goes by the name of "Cp1252," or the slightly more descriptive "Windows (Western Europe) Latin-1." If you are in Russia, however, you might find that your files are encoded using "Cp1251," otherwise known as "Windows Cyrillic." Both of these encodings have the capacity to represent 256 different characters, but a given 8-bit value in Cp1251 does not necessarily stand for the same character as it does in Cp1252. The same situation exists for other encodings. In other words, there is no global one-to-one mapping between the 8-bit value stored in a file on a computer system and the actual real-world character that it represents; what you understand by a byte with value 0xC0 depends on where you are in the world.

Unicode, however, uses a 16-bit encoding and so can handle far more characters: 65536, to be precise. Because of its larger capacity, Unicode assigns a unique value to each character in all the languages that it supports, so, for example, there is a reserved Unicode value for each and every Cyrillic character, which differs from the values used by any ASCII character or by any character in the Windows Latin-1 character set. Whereas the interpretation of the 8-bit encoding 0xC0 is locale-dependent, each 16-bit Unicode character has only one possible interpretation.
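The ambiguity of a single byte is easy to demonstrate. The following sketch decodes the byte 0xC0 using two different encodings and prints the Unicode value that results in each case. The class ByteMeaningDemo and its decode method are invented for illustration, and the example uses the charset names windows-1252 and windows-1251, which the Java platform accepts for the encodings called Cp1252 and Cp1251 in the text.

```java
import java.nio.charset.Charset;

// Illustrative class; the helper name is not part of the JDK.
class ByteMeaningDemo {
    /** Decodes a single byte using the named 8-bit encoding. */
    static String decode(int byteValue, String charsetName) {
        byte[] data = { (byte) byteValue };
        return new String(data, Charset.forName(charsetName));
    }

    /** Formats the first character of a String as U+XXXX. */
    static String unicodeValue(String s) {
        return String.format("U+%04X", (int) s.charAt(0));
    }

    public static void main(String[] args) {
        // The same byte, interpreted in Western Europe and in Russia
        System.out.println("windows-1252: "
              + unicodeValue(decode(0xC0, "windows-1252")));
        System.out.println("windows-1251: "
              + unicodeValue(decode(0xC0, "windows-1251")));
    }
}
```

In the Western European encoding the byte stands for a capital A with a grave accent (U+00C0), whereas in the Cyrillic encoding it stands for the Cyrillic capital letter A (U+0410).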

Now let's look at this from the perspective of the problem faced by JEditorPane or its EditorKits. When you use the setPage method to load a Web page or an ordinary file into a JEditorPane, it opens a URLConnection to the Web server that owns the page or to the local file system and reads the data using the stream returned by the URLConnection getInputStream method. As its name suggests, this method actually returns an InputStream, so the data delivered from the remote system or the local file system will be represented in some 8-bit encoding. But which 8-bit encoding? To convert the incoming InputStream to the corresponding Reader that can deliver the corresponding 16-bit Unicode characters, JEditorPane uses an InputStreamReader. This class has two constructors:

 public InputStreamReader(InputStream in);
 public InputStreamReader(InputStream in, String encoding);

An InputStreamReader created using the first constructor assumes that its input is provided in the default encoding of the platform that it is running on. On a Windows system in the United Kingdom, an object constructed in this way would expect to receive a byte stream encoded according to the Cp1252 encoding scheme, while an object instantiated from the same code in Russia would assume it was receiving a Cp1251-encoded input stream. If the input stream encoding matches the expectation of the InputStreamReader, the correct Unicode characters will be read into the JEditorPane. If it does not, however, the results will be wrong.

The second constructor is more general. Instead of using the local encoding, it allows you to specify the particular encoding that the input stream uses, using one of the well-known encoding names recognized by the java.io package, a full list of which can be found in the documentation that accompanies the JDK. Not surprisingly, Cp1251 and Cp1252 are legal values for this parameter.
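To illustrate the second constructor, the sketch below encodes a Russian word as Cp1251 bytes and then reads it back twice, once naming the correct encoding and once naming Latin-1. The class ExplicitEncodingDemo and its readAll method are invented for illustration; the word is written using Unicode escapes so that the source file itself contains only ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Illustrative class; the helper name is not part of the JDK.
class ExplicitEncodingDemo {
    /** Reads every character from the bytes using the named encoding. */
    static String readAll(byte[] data, String encoding) {
        try (Reader reader = new InputStreamReader(
                 new ByteArrayInputStream(data), encoding)) {
            StringBuilder sb = new StringBuilder();
            int ch;
            while ((ch = reader.read()) != -1) {
                sb.append((char) ch);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A six-letter Russian word, as Unicode escapes
        String word = "\u041f\u0440\u0438\u0432\u0435\u0442";
        byte[] data = word.getBytes(Charset.forName("Cp1251"));

        // The right encoding recovers the original characters...
        System.out.println("Cp1251 matches: "
              + word.equals(readAll(data, "Cp1251")));

        // ...the wrong one produces six quite different characters
        System.out.println("ISO-8859-1 matches: "
              + word.equals(readAll(data, "ISO-8859-1")));
    }
}
```

Notice that reading with the wrong encoding does not fail; it silently produces the wrong characters, which is why the problem can go unnoticed.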

The problem is, given an InputStream, how can you tell which encoding it is using? Unfortunately, without further information, you can't. If you pass it an InputStream, JEditorPane just assumes that it will be fed characters in the Latin-1 (8859_1) encoding, which is the appropriate default for the United States and will work perfectly well in Western Europe too, as long as some of the special characters used in some European languages are not encountered. As long as you're only using the JEditorPane to load documents held on your own system or on other systems that use the same encoding, you'll probably never notice any problem. But this really is quite restrictive if you're going to be loading pages over the Internet, because those pages could have come from anywhere. If you load a page from a Web server in Russia, you'd better be prepared to handle Cyrillic characters encoded according to Cp1251. But how can you do this if JEditorPane can't tell which character set it is receiving? There is a way to arrange for JEditorPane to use an InputStreamReader configured for the correct encoding, but it requires cooperation on the part of the owner of the Web site from which you are reading an HTML page.

The only way to change the encoding that the JEditorPane uses is to supply it as a part of the content type that gets passed to the setContentType method. As noted earlier, the content type has the format:

 MIME type ; parameters

The only parameter that setContentType recognizes (at the time of writing) is charset, which takes a character encoding as its value. Using this method, you might arrange for Cp1251 encoding to be used with the following call:

 pane.setContentType("text/html; charset=Cp1251");
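You can watch the charset parameter take effect by setting the content type yourself and then feeding the JEditorPane a byte stream. The sketch below is written under a couple of assumptions: it uses text/plain rather than text/html so that the resulting model text is easy to inspect, and the class CharsetParameterDemo and its load method are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import javax.swing.JEditorPane;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;

// Illustrative class; the helper name is not part of Swing.
class CharsetParameterDemo {
    /** Loads bytes into a JEditorPane and returns the model text. */
    static String load(byte[] data, String contentType) {
        JEditorPane pane = new JEditorPane();

        // The charset parameter is remembered for later reads
        pane.setContentType(contentType);
        try {
            // No description object is needed, so pass null
            pane.read(new ByteArrayInputStream(data), null);
            Document doc = pane.getDocument();
            return doc.getText(0, doc.getLength());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A six-letter Russian word, as Unicode escapes
        String word = "\u041f\u0440\u0438\u0432\u0435\u0442";
        byte[] data = word.getBytes(Charset.forName("Cp1251"));

        String loaded = load(data, "text/plain; charset=Cp1251");
        System.out.println("Decoded correctly: " + word.equals(loaded));
    }
}
```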

The call to setContentType takes place in the JEditorPane setPage method and uses the content type returned from the URLConnection getContentType method. If you're loading a local file (using the file: protocol prefix), the content type will never contain a charset parameter, because the Sun implementation of the code that reads local files just guesses the content type from the filename and returns a bare MIME type. If you are reading an HTML page (or some other content) from a Web server, the charset parameter will not be included unless you can arrange for the Web server to return it. However, there is a way to do that.

HTML provides a tag called META that contains information about the document itself rather than actually contributing to its content. This tag has an attribute called http-equiv, which can be used to name an HTTP header field that will be set by the Web server to the value of the accompanying content attribute. You can use this to have JEditorPane change its encoding to Cp1251 (the Cyrillic character encoding) by placing the following tags at the top of the Web page:

 <HTML>
 <HEAD>
 <META HTTP-EQUIV="Content-Type" content="text/html; charset=cp1251">
 <TITLE>Document Title</TITLE>
 </HEAD>
 <BODY>
 <!-- Content goes here -->
 </BODY>
 </HTML>

The HTTP-EQUIV attribute causes the Web server to replace its usual Content-Type header with the following:

 Content-Type: text/html; charset=cp1251

Now, the URLConnection getContentType method will return the value

 text/html; charset=cp1251

which will be passed to setContentType and cause the Cp1251 encoding to be selected. Obviously, you can't require that all Web page authors include a META tag with the proper HTTP-EQUIV attribute. However, Web browsers are in the same position as JEditorPane: they also need to know how the Web page is encoded. Because of this, you'll find that most Web pages that need to specify a particular character encoding will do so and, in fact, most commercial HTML editors will supply the correct META tag automatically, without the owner of the Web site needing to be aware of it.

The META tag that you have just seen works because the Web server replaces its usual Content-Type header with the one supplied by the HTTP-EQUIV attribute. What about when you load an HTML file locally using a URL that starts with file:? When you do this, there is no Web server involved; the file is read straight from the local disk. As we said earlier, its content type will be text/html and the character set will be assumed to be 8859-1. Nevertheless, this arrangement still works for local files, even though there is no Web server to set up the correct HTTP headers and nothing reading the local file that expects to see them. This clever trick is actually performed by the HTML parsing code in the Swing HTML package that we'll be looking very closely at later in this chapter. Here's how it works.

When a file: URL is used, the HTML page is read from the local disk and the encoding is, indeed, initially set to 8859-1. As the file content is read, it is passed to a parser in the HTML package, which scans the HTML tags and builds the corresponding Document model with the appropriate text package attributes. As it does this, it will encounter the META tag and its HTTP-EQUIV attribute and extract the content type and the new character set. If a charset parameter is found in the new content type, the parser throws a ChangedCharSetException, which is caught by the JEditorPane read method that originally initiated the process of reading the file. When it receives this exception, it extracts the name of the new encoding and uses it to create a new InputStreamReader with the proper encoding. The part of the Document model that has been built so far is then discarded and the HTML page is read again from the beginning, this time with the proper translation into Unicode being performed. A property called IgnoreCharsetDirective with value true is set on the Document to prevent this process from repeating itself forever as a result of the META tag being read again on the second pass through the Web page.
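The two-pass process just described can be sketched outside JEditorPane by driving an HTMLEditorKit directly. This is an illustration under stated assumptions rather than the actual JEditorPane code: the class and method names (CharsetRereadSketch, loadText, charsetFrom) are invented, the page is held in a byte array so that it can simply be read twice, and a fresh Document is created for the second pass instead of emptying the first one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.swing.text.BadLocationException;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

public class CharsetRereadSketch {

    /** Parses an HTML page, starting again if a META tag names another charset. */
    public static String loadText(byte[] html) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        // First pass: assume 8859-1, as for a bare text/html content type.
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(html), StandardCharsets.ISO_8859_1)) {
            kit.read(in, doc, 0);
        } catch (ChangedCharSetException e) {
            // Discard what was built so far and read again with the new encoding.
            Charset charset = charsetFrom(e);
            doc = kit.createDefaultDocument();
            // Without this property the META tag would raise the exception again.
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
            try (Reader in = new InputStreamReader(new ByteArrayInputStream(html), charset)) {
                kit.read(in, doc, 0);
            }
        }
        return doc.getText(0, doc.getLength());
    }

    /** Extracts the charset from the exception; unknown names make forName throw. */
    static Charset charsetFrom(ChangedCharSetException e) {
        String name = e.getCharSetSpec();
        if (!e.keyEqualsCharSet()) {
            // The spec is a full content type such as "text/html; charset=cp1251".
            int at = name.toLowerCase(Locale.ROOT).indexOf("charset=");
            if (at < 0) {
                return StandardCharsets.ISO_8859_1;
            }
            name = name.substring(at + "charset=".length());
            int end = name.indexOf(';');
            if (end >= 0) {
                name = name.substring(0, end);
            }
        }
        return Charset.forName(name.trim());
    }
}
```

Passing the bytes of a Cp1251-encoded page that contains the META tag shown earlier should return its text with the Cyrillic characters correctly translated into Unicode.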

The need to set the proper encoding arises only if you use the setPage method or one of the lower-level JEditorPane read methods that take an InputStream as the data source. If you are working at such a low level and you already know the correct encoding, you can construct your own Reader with the correct encoding from the InputStream (using the second InputStreamReader constructor shown earlier) and call the read method with your Reader. The setText method, of course, does not need to concern itself with character encoding issues, because its input source is, by definition, a String consisting of Unicode characters, so no translation is necessary.
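As a rough sketch of the low-level approach, the following helper wraps an InputStream in a Reader that uses a known encoding and hands it to the read method that takes a Reader. The class and method names (KnownEncodingSketch, readerFor, load) are invented for illustration, and the document is created by whatever editor kit the pane currently has installed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import javax.swing.JEditorPane;

public class KnownEncodingSketch {

    /** Wraps a byte stream in a Reader that decodes it with the named encoding. */
    public static Reader readerFor(InputStream in, String encoding)
            throws UnsupportedEncodingException {
        return new InputStreamReader(in, encoding);
    }

    /** Loads the stream into the pane, bypassing any guessing of the encoding. */
    public static void load(JEditorPane pane, InputStream in, String encoding)
            throws IOException {
        try (Reader reader = readerFor(in, encoding)) {
            // The second argument describes the stream; it is not needed here.
            pane.read(reader, null);
        }
    }
}
```

A call such as load(pane, stream, "Cp1251") would then read Cyrillic text from the stream, where pane and stream stand for objects that your own code supplies.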

Loading RTF Documents

The examples that you've seen so far in this chapter and the discussion surrounding them have centered exclusively on the use of plain text or HTML documents but, as you know, JEditorPane can handle any document type for which it has an EditorKit and a Document class available. The third type of document for which the Swing text package provides support is RTF, which is understood by, among other things, Microsoft Word and the simpler WordPad editor found on the Windows platform. Loading an RTF file into a JEditorPane does not require any special code or different action on the part of the user. Among the examples that accompany this book is an RTF file that you can load using the same command as you used to view an HTML file:

 java AdvancedSwing.Chapter4.EditorPaneExample4

and then typing a URL that corresponds to the file into the input field. If you installed the example code in the directory c:\AdvancedSwing\Examples, the appropriate URL would be:

 file:///c:\AdvancedSwing\Examples\AdvancedSwing\Chapter4\LM.rtf

When you press RETURN, the RTF document will be loaded and should look something like Figure 4-6. At the time of writing, the RTF support in the Swing text package is not complete. One problem with it is visible here: the document that you have loaded actually contains an image, but the RTF package doesn't display it. Hopefully, this situation will improve in later Swing releases.

Figure 4-6. Viewing an RTF document with `JEditorPane`.
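If you want to see what the RTF support does with a document without going through JEditorPane, you can drive the RTF editor kit directly. The following is a minimal sketch, not the code behind the example program: the class name RtfTextSketch and the tiny in-memory RTF sample are invented for illustration, and only the plain text of the resulting Document is extracted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

public class RtfTextSketch {

    /** Reads RTF from a byte stream and returns the text of the resulting Document. */
    public static String plainText(InputStream rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        // RTF is read as bytes, so an InputStream is used rather than a Reader.
        kit.read(rtf, doc, 0);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        // A very small hypothetical RTF document with one font and one bold word.
        String sample = "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}"
                + "\\f0\\pard Hello, {\\b RTF}\\par}";
        byte[] bytes = sample.getBytes(StandardCharsets.US_ASCII);
        System.out.println(plainText(new ByteArrayInputStream(bytes)));
    }
}
```

The same plainText method could be given a FileInputStream opened on LM.rtf, assuming that you adjust the path to wherever you installed the example code.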

Core Note

There was a bug in Swing 1.1 and early versions of Java 2 that prevented JEditorPane from loading RTF files. If, when you try to load LM.rtf, you get an exception with the message "RTF is an 8-bit format," you should install a later version of Swing or a more up-to-date Java 2 release, such as version 1.2.2 or later.

