The C++ Program Transformation Process

C++ programs are translated into executable modules via a nine-phase process. Each phase is briefly discussed below and illustrated in Figure 4-6.

Phase 1

In phase 1, physical source file characters are mapped to the basic source character set. The basic source character set includes the following 91 graphical characters:

a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

In addition to these there are five non-graphical characters: space, horizontal tab, vertical tab, form feed, and new line.

Next, all trigraph sequences are replaced by corresponding single-character internal representations. Table 4-1 gives the trigraphs and their single replacement characters.

Table 4-1: Trigraph Replacement

Trigraph    Replacement
??=         #
??/         \
??'         ^
??(         [
??)         ]
??!         |
??<         {
??>         }
??-         ~

Trigraphs let programmers write C++ on terminals and keyboards that lack some of the special characters required by the basic source character set.

Phase 2

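In phase 2, each backslash that is immediately followed by a new-line character is deleted together with that new-line character. This splices physical source lines into longer logical source lines.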

Phase 3

In phase 3, the source file is decomposed into preprocessing tokens and sequences of white-space characters. Each comment is replaced by a single space character.

Phase 4

In phase 4, the preprocessing directives are executed and macros are expanded.

Phase 5

In phase 5, each source character set member, escape sequence, or universal-character-name in a character or string literal is converted to the corresponding member of the execution character set. The basic execution character set consists of the basic source character set plus the control characters for alert, backspace, and carriage return, plus a null character.

You were introduced to the basic source character set in phase 1. An escape sequence can be either simple, octal, or hexadecimal. Table 4-2 gives the valid escape sequences.

Table 4-2: Escape Sequences

Character            Abbreviation   Escape Sequence
newline              NL (LF)        \n
horizontal tab       HT             \t
vertical tab         VT             \v
backspace            BS             \b
carriage return      CR             \r
form feed            FF             \f
alert                BEL            \a
backslash            \              \\
question mark        ?              \?
single quote         '              \'
double quote         "              \"
octal number         ooo            \ooo
hexadecimal number   hhh            \xhhh

A universal character name provides a way to name other characters. A universal character name is formed by a backslash followed either by a lowercase u and four hexadecimal digits or by an uppercase U and eight hexadecimal digits. The following universal character names both name the character U+03C0 (Greek small letter pi):

\u03C0
\U000003C0

Phase 6

In phase 6, adjacent string literals are concatenated. Character literals such as 'a' are not affected.

Phase 7

In phase 7, preprocessing tokens are converted to tokens. The tokens are then syntactically and semantically analyzed and translated.

Phase 8

In phase 8, translated translation units and instantiation units are combined.

A translation unit consists of a source file together with its headers and any other source files included via the #include preprocessor directive, minus any source lines skipped by conditional compilation.

If a translation unit uses function templates or class templates, the required template definitions are located and the instantiations are performed.

Phase 9

In phase 9, external object and function references are resolved, and library components are linked. All output from the translation is combined into a program image. Once the C++ source code has been translated into an executable module targeted for a specific processor, it can be executed. The entry point for all C++ programs is the main() function.

Figure 4-6: C++ Translation Phases

C++ for Artists: The Art, Philosophy, and Science of Object-Oriented Programming
ISBN: 1932504028
Year: 2003
Pages: 340
Authors: Rick Miller
