This section describes the coding approach, conventions, and style I chose to use in this book. If you're reading this book as a technical end user, you may want to skip ahead to the next section.
General Coding Approach and Conventions
The Java and C++ code presented in this book uses object-oriented techniques. The DOM and all its various parts are object-oriented, as are the Java file operations and the C++ file operations. (I have chosen to use the C++ classes instead of the old-style C libraries.) That said, most of what we programmers really care about is procedural in nature, that is, the code that lives and works inside methods. We care most about how to call methods that manipulate DOM objects. We don't care as much about everything else.
I have not gone out of my way to make this code object-oriented. In the beginning it is fairly simple and not heavily object-oriented. However, as the design matures through the book, I use more object-oriented techniques when they promote reuse and extension. I'm enough of an old-school programmer to be a bit concerned about the performance implications of allocating and freeing a lot of objects dynamically, and I tend to avoid doing so when I can. However, we're valuing reusability and extensibility over performance in this design, so there are cases when we do create a lot of objects dynamically at runtime rather than declaring them statically at compile time. If the code lives on and turns out to be a dog at runtime, we can investigate more efficient designs. A modular, object-oriented approach also helps us in that regard.
I have not constructed elaborate object models for the non-XML entities manipulated in these programs. Nor will you find many class diagrams or other Unified Modeling Language (UML) artifacts. We're going to keep it simple and focus on the essentials.
That said, here are a few other notes and general rules on coding approach and style.
Additional C++ Considerations
A few words are in order regarding my approach to using C++. In particular, I want to point out handling of strings, exceptions, and constants.
The final ISO and ANSI C++ standard has extremely useful string classes (at last!). These are much easier to use and less prone to runtime errors than old C-style char arrays, char pointers, and the C string library. However, these string classes are still not supported "out of the box" by many major C++ compilers. They weren't supported natively by Visual C++ 6.0, which I'm using for this book. Even though some add-on open source and proprietary class libraries do support these classes, many developers use only what is standard from the compiler vendor. In addition, a lot of the legacy C++ code currently in production doesn't use these string classes. Finally, neither MSXML nor its best alternative, the Apache Xerces C++ API, uses the ISO/ANSI string classes. MSXML uses COM strings (as we'll discuss in Chapter 2 and Appendix C), and the Xerces C++ API uses its own XML character type (XMLCh). Both support conversions to and from char arrays, not ISO/ANSI strings. So, for all these reasons I'm sticking with char arrays, char pointers, and the old C string library.
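To make the char-array style concrete, here is a minimal sketch of the kind of defensive copy this approach calls for. The helper name CopyString and its return convention are assumptions made for illustration only; they do not come from the book's code, MSXML, or Xerces.

```cpp
#include <string.h>

// Hypothetical helper (not part of any library): copy src into dest,
// which has room for destSize bytes, and always leave dest
// null-terminated. Returns 0 if the whole string fit, -1 if the
// arguments were invalid or the copy had to be truncated.
int CopyString(char* dest, size_t destSize, const char* src)
{
    if (dest == 0 || src == 0 || destSize == 0)
        return -1;

    // strncpy does not add a terminator when src is too long,
    // so copy at most destSize - 1 bytes and terminate by hand.
    strncpy(dest, src, destSize - 1);
    dest[destSize - 1] = '\0';

    return (strlen(src) < destSize) ? 0 : -1;
}
```

A caller would declare a fixed-size buffer such as char name[32] and pass sizeof(name) as the capacity. With the ISO/ANSI string class the same copy is a plain assignment, which is the convenience being traded away here.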
In the early days of C++ compilers, throwing exceptions for common runtime errors was discouraged because exception handling wasn't very efficient. The general view is that compilers have gotten a lot more efficient in this regard. However, a lot of the standard C++ library functions still use status codes or return values rather than throw exceptions for every little thing the way that Java does. For this reason I generally use the approach of returning status values from functions rather than throwing exceptions.
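The status-return style might look something like the following sketch. The function CountLines and the STATUS_ constants are hypothetical names chosen for this example, not code from the book. The result comes back through an output parameter so that the return value is free to carry the status.

```cpp
#include <fstream>

// Hypothetical status codes for this example only.
#define STATUS_OK            0
#define STATUS_BAD_ARGUMENT  1
#define STATUS_OPEN_FAILED   2

// Count the lines in a text file. The count is written to *lineCount;
// the return value reports success or the reason for failure.
// Nothing is thrown for ordinary runtime problems such as a missing file.
int CountLines(const char* fileName, long* lineCount)
{
    if (fileName == 0 || lineCount == 0)
        return STATUS_BAD_ARGUMENT;

    std::ifstream input(fileName);
    if (!input)
        return STATUS_OPEN_FAILED;

    long count = 0;
    bool inLine = false;   // true while we are inside an unterminated line
    char c;
    while (input.get(c))
    {
        if (c == '\n')
        {
            ++count;
            inLine = false;
        }
        else
        {
            inLine = true;
        }
    }
    if (inLine)
        ++count;           // last line had no trailing newline

    *lineCount = count;
    return STATUS_OK;
}
```

The caller tests the return value before using the count, much as it would with the C library. The cost of this style is that nothing forces the caller to check; an exception, by contrast, cannot be silently ignored.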
Regarding constants, some programmers believe that using #define rather than const to define constants is "evil" [Cline 2002]. I certainly prefer the Java style of defining constant class members. However, I found it extremely awkward to define constants that could be shared across classes using this approach with Visual C++ 6.0. There may be better ways to do it with compilers that provide better support for the final ISO/ANSI C++ standard. I'm sure there are people who are more clever about C++ programming than I am. At any rate, I've taken the easy way out in this book and used old C-style #defines.
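For comparison, here is a small sketch that puts the three ways of defining a constant side by side. All of the names (MAX_FIELD_LENGTH, kMaxFieldLength, FieldLimits, FitsInField) are invented for this illustration. The in-class initializer in the third form is standard C++, but I am assuming that older compilers may reject it, which is the sort of awkwardness described above.

```cpp
#include <string.h>

// 1. Old C style: the preprocessor substitutes the text 256 wherever
//    the name appears. It has no type and no scope.
#define MAX_FIELD_LENGTH 256

// 2. Typed constant at namespace scope. A const object has internal
//    linkage by default in C++, so this line can sit in a shared header.
const int kMaxFieldLength = 256;

// 3. Java-like constant class member, initialized inside the class.
class FieldLimits
{
public:
    static const int MaxFieldLength = 256;
};

// Returns 1 if text plus its terminator fits in a buffer of
// MAX_FIELD_LENGTH bytes, 0 otherwise (including a null pointer).
int FitsInField(const char* text)
{
    if (text == 0)
        return 0;
    return (strlen(text) < MAX_FIELD_LENGTH) ? 1 : 0;
}
```

All three names can size an array or appear in a comparison. The practical differences are that the #define ignores scope and is invisible to a debugger, while the two const forms are type-checked by the compiler.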
The general picture you should draw from this discussion is that the C++ code presented here does not necessarily represent current best practices in C++ coding. It instead represents "good" practices (I hope) as of the mid-1990s. However, I'm not really too concerned about that. If your legacy C++ applications are very old, they probably don't reflect current best practices either. Your code may look very similar to mine. On the other hand, if you are using more up-to-date compilers and following current best practices (for example, using the string class instead of char arrays), by all means keep doing what you're doing and don't regress. You should, however, still be able to follow my code and update the techniques to your current best practices as appropriate.