Without doubt, the most commonly used markup language today is hypertext markup language (HTML), used for creating web pages. HTML consists of text with tags that define characteristics about the text. HTML is not hard to write, and you could use Emacs or any other editor to write the tags and the text. An HTML tag generally looks like this:

    <tagname>text being tagged</tagname>

For your convenience, several modes are available for writing HTML in Emacs, including HTML mode, HTML helper mode, html menus, and a variety of SGML[1] tools including sgml mode and psgml mode. Of these tools, we've chosen to describe HTML mode, a variant of sgml mode, which is included in GNU Emacs, and HTML helper mode, which is a popular add-on. If you are writing XHTML, a stricter version of HTML that can be validated, you should consider XHTML mode, described briefly in this section, or psgml mode, covered later in the XML section of this chapter.
Serious web developers may want to investigate some of the cutting-edge development going on to make Emacs even more powerful. Check out HTMLModeDeluxe (http://www.emacswiki.org/cgi-bin/wiki/HtmlModeDeluxe) and the Emacs WebDev Environment by Darren Brierton (http://www.dzr-web.com/people/darren/projects/emacs-webdev). Both of these tools support mmm mode (where mmm stands for "multiple major modes"). With this feature, the major mode changes depending on the section of the page in which the cursor is located. When you edit a script, the mode changes automatically to support that type of authoring. Both are excellent tools for building complex web pages.

In the following sections, we are not going to teach you to write HTML. (For more information on writing HTML, see HTML and XHTML: The Definitive Guide by Chuck Musciano and Bill Kennedy, O'Reilly.) Rather, we're going to teach you the rudiments of using HTML mode and HTML helper mode to help you create HTML documents.

8.3.1 Using HTML Mode

To start HTML mode, type M-x html-mode (or simply open an HTML file). Most authors use a standard template when they write HTML. You may already have one. If you don't, HTML mode is happy to supply one for you. Simply start by typing C-c C-t (for sgml-tag) or by selecting Insert Tag from the SGML menu. If you enter the <html> tag that signifies the start of an HTML document, Emacs inserts a basic template in your buffer.
Note that Emacs automatically creates a first-level header that is equal to the title you entered. It also inserts a hyperlink so that readers can email you. Depending on your spam tolerance, you may want to delete that line. Also, Emacs is just guessing at your name and email address. You can set these explicitly by adding two lines to your .emacs file. Change Mr. Dickens' information to settings appropriate for you.

    (setq user-mail-address "cdickens@great-beyond.com")
    (setq user-full-name "Charles Dickens")

You could approach HTML mode in a couple of ways. You could learn the key bindings for various tags, or you could simply use the sgml-tag command for everything. It depends on how many bindings you want to learn. A mixed approach may be best, where you learn keystrokes for the most common tags and use sgml-tag for less common tags.

Key bindings are intuitive in HTML mode. As in most specialized editing modes, many functions are bound to C-c C-something. We've seen C-c C-t to insert a tag. You won't be too surprised to find that to move forward to the next tag you type C-c C-f and to move back to the previous tag you type C-c C-b. To insert a hyperlink (an <a> tag with an href attribute), type C-c C-h. You see what we mean.

HTML mode is designed for writing HTML, not XHTML. XHTML is stricter, requiring all tags to have a closing tag. The common <p> tag is a salient example. HTML authors commonly omit the closing tag </p> that XHTML requires. HTML mode inserts a lone <p> tag even when given a command, such as sgml-tag, that normally inserts a tag pair. If you want to write XHTML, use XHTML mode instead. Emacs starts this mode itself if your file contains a reference to an XHTML document type definition. Other than completion of tags, XHTML mode is very similar to the HTML mode described here.[2]
Being able to hide the tags is a helpful feature. To hide HTML tags, type C-c Tab; use the same command to display the tags again. Let's say that we've inserted some of our dickens file into the dickens.html file we were just working with.
You can keep typing text, concentrating on what you're writing rather than being distracted by the markup. Emacs protects you from deleting tags when you're writing by making hidden text read-only. If you move the cursor onto a hidden tag, Emacs displays it in the minibuffer.

Of course, the whole purpose of writing HTML is to display it in a web browser. Typing C-c C-v (for browse-url-of-buffer) opens the default web browser to view the web page you're writing. If you'd like to look at the file in a web browser each time you save, you can turn on a function called html-autoview-mode, invoked by pressing C-c C-s. When you save the file, Emacs automatically opens it in the default browser.

8.3.1.1 Character encoding in HTML mode

What if you want to include special characters or characters from other character sets in your web page? The short answer is that you can enter a character's encoding explicitly. For example, to enter a capital U with an umlaut, you can type &#220;. Many characters can also be represented as named entities, which are certainly easier to remember than numbers. For example, the named entity for a capital U with an umlaut is &Uuml;.

But HTML mode does provide more support than this. We'll take the simplest case first. Let's say you can create a character with your keyboard; for a common case, take the ampersand, a character that must be encoded since it has a special meaning in HTML. Type C-c C-n & Enter. Emacs inserts the entity for an ampersand, &amp;. You can insert entities for a wide variety of keyboard characters this way.

But let's say that you are inserting characters that are not on your keyboard. For example, perhaps you are in the U.S. writing up a list of contributors from Europe and many of their names have accent marks. The ISO Latin-1 character set will handle this. If you have a keyboard that already emits Latin-1 characters and Latin-1 is your default coding system for keyboard input, inserting such characters is relatively straightforward.
Simply press C-c 8 to turn on a minor mode called SGML name entity mode. Emacs says sgml name entity mode is now on.[3] C-c 8 toggles this state. Type Latin-1 characters as you normally would and Emacs inserts the named entities associated with those characters.
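To make the relationship between a character, its numeric reference, and its named entity concrete, here is a minimal Emacs Lisp sketch. The names my-numeric-entity, my-named-entities, and my-entity-for are hypothetical helpers invented for illustration; they are not part of HTML mode, and the table holds only a handful of sample entries.

```elisp
;; Hypothetical helpers for illustration only; not part of HTML mode.

(defun my-numeric-entity (char)
  "Return the numeric character reference for CHAR, such as \"&#220;\"."
  (format "&#%d;" char))

(defvar my-named-entities
  '((?& . "amp") (?< . "lt") (?> . "gt") (?Ü . "Uuml") (?á . "aacute"))
  "A small sample of character-to-entity-name pairs.
A real table would be much larger.")

(defun my-entity-for (char)
  "Return the named entity for CHAR if known, else its numeric reference."
  (let ((name (cdr (assq char my-named-entities))))
    (if name
        (format "&%s;" name)
      (my-numeric-entity char))))

(my-numeric-entity ?Ü)   ; => "&#220;"
(my-entity-for ?Ü)       ; => "&Uuml;"
(my-entity-for ?&)       ; => "&amp;"
(my-entity-for ?¥)       ; => "&#165;" (not in the sample table)
```

You can evaluate these forms in the *scratch* buffer with C-x C-e to see each result. The fallback to a numeric reference mirrors the point made earlier: any character can be written by number, while only some have names.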
For those of us with other keyboard encodings, however, there's a bit more to do. We discuss two options for getting bindings that insert entities into your HTML file. The first is ISO accents mode. This mode provides support, as the name implies, for accented text. Whether you're typing umlauts, cedillas, circumflexes, acute accents, or grave accents, ISO accents mode is up to the task. The other option is to use the C-x 8 prefix to insert a wide range of entities, including currency signs, mathematical symbols, and copyright signs (as well as all the accented characters ISO accents mode supports).

8.3.1.1.1 Using ISO accents mode

To use ISO accents mode to insert entities in your file, type C-c 8 to turn on SGML name entity mode, then M-x iso-accents-mode Enter to turn on that mode. In ISO accents mode, certain characters (including /, ~, ', ", `, and ^) are interpreted as prefixes to create accented characters. SGML name entity mode captures these keystrokes and automatically inserts the appropriate HTML entity. For example, typing 'a produces the HTML entity for á, &aacute;. For specific key bindings, see Table 8-2.

8.3.1.1.2 Using the C-x 8 prefix

You can also insert a wide range of entities using C-x 8 after you do some setup.[4] First enter SGML name entity mode by typing C-c 8. Next specify Latin-1 as your character set by typing C-x Enter k latin-1 Enter. You can then enter a large number of entities by typing commands prefixed with C-x 8. For example, to insert the entity for a yen symbol, type C-x 8 Y. Watch the minibuffer. The literal character will appear in the minibuffer as the entity is inserted.

With both ISO accents mode and the C-x 8 prefix, a single undo command (C-_) translates the entity back into the literal character.
Table 8-2 provides a list of accented characters and the bindings that help insert them. Table 8-3 lists other named entities including punctuation marks and symbols.
Table 8-4 lists HTML mode commands.
8.3.2 Using HTML Helper Mode

HTML helper mode, written by Nelson Minar and now maintained by Gian Uberto Lauri, offers great flexibility in writing HTML. You can enable various hand-holding features depending on your level of expertise and preferences.

Why would you choose HTML helper mode over Emacs's own HTML mode? Although HTML mode makes it easy to write basic HTML, it provides little support for programmatic, interactive web pages. HTML helper mode supports ASP, JSP (and JDE, the Java Development Environment, discussed in Chapter 9), and PHP, to name a few of its more advanced features. If you're writing HTML in Emacs, you're likely to be a developer of such pages rather than a more text-oriented author. For this reason, HTML helper mode continues to be popular among Emacs users.

HTML helper mode is not part of Emacs by default. You can download it from its homepage at http://www.nongnu.org/baol-hth. Download the file into a directory such as ~/elisp, move to that directory, and then type:

    % tar xvzf html-helper-mode.tar.gz

The system unpacks the tar file for you. (Of course, if you are installing on Windows, you can simply use WinZip to decompress and unpack the file.) The tar file contains several components, including:
8.3.2.1 Starting HTML helper mode

Before you can start HTML helper mode, you have to load it into Emacs. (For a complete discussion of this topic, see "Building Your Own Lisp Library" in Chapter 11; we describe it briefly here.) Begin by typing M-x load-file Enter. Emacs asks which file to load; enter ~/elisp/html-helper-mode.el and press Enter, adjusting the path to reflect the location where you installed html-helper-mode.el. You enter the mode by typing M-x html-helper-mode Enter. HTML helper appears on the mode line.

Making HTML helper mode part of your startup is easier. Put the following lines in your .emacs file:

    (setq load-path (cons "~/elisp" load-path))
    (autoload 'html-helper-mode "html-helper-mode" "Yay HTML" t)

In the first line, insert, in quotation marks, the complete path for the directory in which html-helper-mode.el is located, replacing ~/elisp with the correct value for your system. The second line tells Emacs to load html-helper-mode.el automatically the first time the html-helper-mode command is invoked, rather than at startup.

If you want to use HTML helper mode for editing HTML files by default, add this line to .emacs as well:

    (setq auto-mode-alist (cons '("\\.html?$" . html-helper-mode) auto-mode-alist))

If you edit other types of files with HTML helper mode, you may want to add lines to include all the types of files you edit. Adding more lines is the easiest way. For example, to make HTML helper mode the default for PHP files, add this line to .emacs:

    (setq auto-mode-alist (cons '("\\.php$" . html-helper-mode) auto-mode-alist))

8.3.2.2 A brief tour of HTML helper mode

The main reason people like HTML helper mode is that it provides easy menu access to a wide variety of options. Realizing that having a crowded menu with many submenus could overwhelm new users, the authors created an option called Turn on Novice Menu. Selecting this option from the HTML menu provides a barebones menu, as shown in Figure 8-1.
Novice HTML writers can use these options to create a basic HTML document without worrying about what forms, JSPs, PHP, and the like mean.

Figure 8-1. HTML helper mode's Novice menu (Mac OS X)

Selecting Turn on Expert Menu from the HTML menu returns the larger menu with its numerous submenus, as shown in Figure 8-2.

Figure 8-2. HTML helper mode's Expert menu (Mac OS X)

8.3.2.3 Inserting an HTML template

HTML helper mode inserts a template for you every time you create a new HTML file.
The template contains all the basic HTML elements. The entire document is surrounded by <html></html> tags. Then the head and the body are separated. Following an <hr> tag that tells the browser to insert a horizontal line, called a horizontal rule, the <address> tag leaves a place for the author to put in his or her email address. In these days of spam, it's unlikely you'll want to do that. (You can leave the <address> tag blank or delete it.) If you do want to include an email address, enter a line like this in your .emacs file (substituting your own email address, of course):

    (setq html-helper-address-string "<a href=\"mailto:cdickens@great-beyond.com\">Charles Dickens</a>")
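Pulling the .emacs settings from this section together, a consolidated fragment might look like the following. This is only a sketch: it assumes html-helper-mode.el was unpacked into ~/elisp, it uses add-to-list instead of the setq and cons idiom shown earlier (which avoids duplicate entries if the file is reloaded), and it anchors the filename patterns with \\' (end of string) rather than $. The docstring and the address are placeholders to replace with your own.

```elisp
;; Assumption: html-helper-mode.el lives in ~/elisp; adjust the path as needed.
(add-to-list 'load-path (expand-file-name "~/elisp"))

;; Load html-helper-mode.el the first time the command is invoked.
(autoload 'html-helper-mode "html-helper-mode"
  "Major mode for editing HTML (placeholder docstring)." t)

;; Use HTML helper mode for .htm/.html files and, optionally, for PHP files.
(add-to-list 'auto-mode-alist '("\\.html?\\'" . html-helper-mode))
(add-to-list 'auto-mode-alist '("\\.php\\'" . html-helper-mode))

;; Text placed inside the template's <address> tag; substitute your own.
(setq html-helper-address-string
      "<a href=\"mailto:cdickens@great-beyond.com\">Charles Dickens</a>")
```

Leave out the html-helper-address-string line if you would rather not publish an email address.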
Normally you begin filling out the template by entering a title and a level-one header (these are often the same). You can then begin writing paragraphs of text. Before you start typing, press M-Enter. Emacs inserts <p></p> and positions the cursor between them. You can see from the ending paragraph tag that HTML helper mode is working toward XHTML compliance.
8.3.2.4 Putting tags around a region

When editing HTML files, you often spend a lot of time marking up existing text. If you preface any of the tag commands with C-u, Emacs inserts the tags around a region rather than putting them at the cursor position.[7] To demonstrate, we'll start a new HTML file and insert text from our dickens file.
If you were really doing this properly, you'd type something like "A Tale of Two Cities, Chapter 1" as the title and the first-level header. But for now, you just want to see how to mark up a region of existing text. Begin by marking the Dickens paragraph as a region and type C-u M-Enter.
8.3.2.5 Using completion

HTML helper mode supports completion. You type the beginning of a tag and press M-Tab (for tempo-complete-tag).[8] If there's more than one possibility, a window of possible completions appears. Let's say you are working on a bulleted list.
Note, however, that completion is sometimes case-sensitive. For example, typing <s M-Tab shows the following completions:

    <select
    <span class=
    <span style =
    <strike>
    <strong>
    <samp>

Notice that the <script> tag is missing. But if you try typing <S M-Tab, the script tag and its attributes are inserted, as in:

    <SCRIPT TYPE="text/javascript">
    </SCRIPT>

The distinction between upper- and lowercase shows that HTML helper mode is moving toward XHTML compliance, but hasn't quite arrived. XHTML requires that all tags be lowercase. On the positive side, note that the attribute is in quotation marks, another XHTML requirement.

8.3.2.6 Turning on prompting

Some HTML tags require you to input certain attributes. For example, when you enter a hyperlink, you have to specify the URL of the link and the text that the user will select. If you type C-c C-a l (the lowercase letter "L") to enter a link, HTML helper mode inserts:

    <a href=""></a>

with the cursor on the second quotation mark so you can type in the URL.

HTML helper mode offers additional help if you turn on prompting. Add this line to your .emacs file:

    (setq tempo-interactive t)

Note that HTML helper mode prompts only for required attributes; if you want to input optional attributes, you have to add them by hand.

Whether you consider prompting useful or intrusive is a matter of personal taste. If you are a beginning HTML author, prompting may help you remember to enter all the necessary information for each tag. If you find you don't like it, simply delete the line you added to the .emacs file.

8.3.2.7 Character encoding in HTML helper mode

HTML helper mode supports entry of only the most common character entities. However, it does make it easy to insert these entities. Simply type C-c before the character in question. For example, type C-c < to enter the escape code for a less-than sign (&lt;). Character entities are also available by selecting Insert Character Entities from the HTML menu.
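The prompting behavior described under "Turning on prompting" comes from the tempo template library, whose variable tempo-interactive you set there. As a rough sketch of how a prompting template works, the following defines a stand-alone template with tempo-define-template. The template name my-link and its prompts are hypothetical and chosen for illustration; this is not HTML helper mode's own definition of its link command.

```elisp
(require 'tempo)

;; With this set, (p "PROMPT") elements ask for their text in the minibuffer
;; instead of merely marking a position to fill in later.
(setq tempo-interactive t)

;; Hypothetical template: defines the command `tempo-template-my-link'.
(tempo-define-template
 "my-link"
 '("<a href=\"" (p "URL: ") "\">" (p "Link text: ") "</a>")
 nil
 "Insert a hyperlink, prompting for the URL and the link text.")
```

After evaluating these forms, M-x tempo-template-my-link Enter asks for the URL and the link text in turn and inserts the finished tag. If tempo-interactive is nil, the same command inserts the tag skeleton without prompting.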
Table 8-5 lists bindings for inserting character entities in HTML helper mode.
Table 8-6 lists the key bindings for HTML helper mode. There are key bindings for advanced HTML features such as forms as well as for some of the HTML 3.0 features. Some tags would normally appear on different lines (for example, in the case of a list); in this table, they are shown on one line.