6.6 FLOATING-POINT TYPES


C++ and Java have in common the floating-point types

      float      double 

C++ typically uses four bytes for a float and eight for a double. The C++ standard does not specify exactly how many bytes each type occupies; it requires only that a double provide at least as much precision as a float, and a long double at least as much as a double. Java, on the other hand, stipulates that a float occupies exactly 4 bytes and a double exactly 8. In addition, C++ supports long double for extended precision. Its size varies by platform: 16 bytes is common on 64-bit Linux, while Microsoft Visual C++ gives it the same 8 bytes as a double.
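To see what a particular compiler actually allocates, one can query sizeof directly. The following is a minimal sketch; the struct FloatSizes and the functions floatSizes and printFloatSizes are names made up for this illustration, and the printed values depend on the compiler and the target platform.

```cpp
#include <cstddef>
#include <iostream>

// Sizes, in bytes, of the three C++ floating-point types.
// (FloatSizes is a hypothetical helper type for this illustration.)
struct FloatSizes {
    std::size_t float_bytes;
    std::size_t double_bytes;
    std::size_t long_double_bytes;
};

// Collects the sizeof results for the current implementation.
FloatSizes floatSizes() {
    return { sizeof(float), sizeof(double), sizeof(long double) };
}

// Prints the sizes; the output is implementation-dependent.
void printFloatSizes() {
    FloatSizes s = floatSizes();
    std::cout << "float:       " << s.float_bytes << " bytes\n"
              << "double:      " << s.double_bytes << " bytes\n"
              << "long double: " << s.long_double_bytes << " bytes\n";
}
```

Calling printFloatSizes() from main is enough to compare two compilers or two platforms side by side.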

The exact meaning of float, double, and long double is implementation-defined in C++. That is, the number of bits reserved for the exponent and the fraction may vary from implementation to implementation. On the other hand, Java stipulates that the float and the double types conform to the IEEE 754 standard. Most modern implementations of C++ also conform to this standard.
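Whether a given C++ implementation claims IEEE 754 conformance for a type can be checked through std::numeric_limits, whose is_iec559 member refers to IEC 559, the same standard under another name. The sketch below wraps that query in a function template; the name followsIeee754 is an assumption made for this illustration.

```cpp
#include <limits>

// Returns true when the implementation reports that type T
// conforms to IEC 559 (IEEE 754). The function name is hypothetical.
template <typename T>
constexpr bool followsIeee754() {
    return std::numeric_limits<T>::is_iec559;
}

// These assertions hold on mainstream implementations, but they are
// not guaranteed by the C++ standard itself.
static_assert(followsIeee754<float>(),  "float is not IEEE 754 here");
static_assert(followsIeee754<double>(), "double is not IEEE 754 here");
```

On an implementation that does not conform, the static_assert lines would fail at compile time, which is arguably the earliest point at which one would want to find out.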

Under the IEEE 754 standard, a floating-point number consists of three parts: a sign, an exponent, and a fraction (also known as the mantissa). The number of bits reserved for the exponent determines how large and how small the overall magnitude of a number can be, while the number of bits reserved for the fraction determines its precision. For float, the exponent gets 8 bits and the fraction 23; for double, the exponent gets 11 bits and the fraction 52. As a result, the smallest positive normalized value of a float is 1.17 × 10^-38 and its largest value is 3.40 × 10^38, with a precision of 6 decimal digits. The smallest positive normalized value of a double is 2.22 × 10^-308 and its largest value is 1.79 × 10^308, with a precision of 15 decimal digits.
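The range and precision figures for an implementation can be read from std::numeric_limits rather than memorized. The following sketch assumes an IEEE 754 implementation; the names FloatRange and floatRange are hypothetical and introduced only for this example.

```cpp
#include <limits>

// Range and precision of a floating-point type T.
// (FloatRange is a hypothetical helper type for this illustration.)
template <typename T>
struct FloatRange {
    T   smallest_normal;  // smallest positive normalized value
    T   largest;          // largest finite value
    int decimal_digits;   // decimal digits that survive a round trip
};

// Reads the limits for T from the standard library.
template <typename T>
FloatRange<T> floatRange() {
    return { std::numeric_limits<T>::min(),
             std::numeric_limits<T>::max(),
             std::numeric_limits<T>::digits10 };
}
```

Note that for floating-point types numeric_limits<T>::min() is the smallest positive normalized value, not the most negative one.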

Additionally, under the IEEE 754 standard, a floating-point number can overflow to infinity, which can be represented by a symbolic constant such as inf, or underflow to zero when its magnitude becomes too small for a float or a double. Furthermore, the result of an indeterminate arithmetic operation, such as adding +inf to -inf, is represented by another symbolic constant, conveniently denoted NaN, for "Not a Number". Note that the symbolic constants inf and NaN apply only to floating-point numbers. It is also useful to remember that an arithmetic operation involving inf can yield a regular number. For example, if x is a positive finite number, then x divided by +inf yields +0.0.
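The special values described above can be produced and tested in C++ with std::numeric_limits and the classification functions in <cmath>. The sketch below assumes an IEEE 754 implementation with the default rounding mode; all four function names are made up for this illustration.

```cpp
#include <cmath>
#include <limits>

// Doubling the largest finite double overflows to +inf.
double overflowToInfinity() {
    double big = std::numeric_limits<double>::max();
    return big * 2.0;
}

// Dividing the smallest positive subnormal double by 4 underflows to zero.
double underflowToZero() {
    double tiny = std::numeric_limits<double>::denorm_min();
    return tiny / 4.0;
}

// Adding +inf to -inf has no meaningful result and yields NaN.
double infinityPlusNegativeInfinity() {
    double inf = std::numeric_limits<double>::infinity();
    return inf + (-inf);
}

// A positive finite x divided by +inf yields +0.0.
double finiteOverInfinity(double x) {
    return x / std::numeric_limits<double>::infinity();
}
```

Because NaN compares unequal to everything, including itself, the results should be examined with std::isnan and std::isinf rather than with ==.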

By default, a floating-point literal is of type double in both C++ and Java. However, when suffixed with either the letter F or the letter f, it will be stored as a float in both C++ and Java. If a floating-point literal needs to be stored as a long double in C++, it must be suffixed with the letter L or the letter l.
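The effect of the suffixes on the type of a literal can be confirmed at compile time in C++. This is a small sketch using decltype and std::is_same from <type_traits>; the literal value 1.5 is arbitrary.

```cpp
#include <type_traits>

// An unsuffixed floating-point literal has type double.
static_assert(std::is_same<decltype(1.5), double>::value,
              "unsuffixed literal should be double");

// The suffixes f and F give type float.
static_assert(std::is_same<decltype(1.5f), float>::value,
              "f suffix should give float");
static_assert(std::is_same<decltype(1.5F), float>::value,
              "F suffix should give float");

// The suffix L gives type long double.
static_assert(std::is_same<decltype(1.5L), long double>::value,
              "L suffix should give long double");
```

If any of these assumptions were wrong, the program would fail to compile, so no runtime check is needed.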

Before ending this section, we would like to mention that in C++, the boolean, the character, and the integer types are collectively called the integral types, and the integral and the floating-point types are collectively called the arithmetic types.

On the other hand, in Java, only the character and the integer types are collectively called the integral types, and the integral and the floating-point types are collectively called the numeric types.
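The C++ half of this classification is mirrored by the type traits in <type_traits>, so it can be checked by the compiler. The sketch below covers only C++; Java has no direct counterpart to these traits.

```cpp
#include <type_traits>

// In C++, bool, char, and int all count as integral types.
static_assert(std::is_integral<bool>::value, "bool is integral in C++");
static_assert(std::is_integral<char>::value, "char is integral in C++");
static_assert(std::is_integral<int>::value,  "int is integral in C++");

// Floating-point types are not integral, but they are arithmetic.
static_assert(!std::is_integral<double>::value,      "double is not integral");
static_assert(std::is_floating_point<double>::value, "double is floating-point");
static_assert(std::is_arithmetic<double>::value,     "double is arithmetic");

// The arithmetic types are the integral and floating-point types together.
static_assert(std::is_arithmetic<bool>::value, "bool is arithmetic");
static_assert(std::is_arithmetic<long double>::value,
              "long double is arithmetic");
```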

For solving a majority of problems in C++, you are likely to use bool for logical values, char for characters, int for integer values, and double for floating-point values. You'd do the same in Java except that you'd use boolean for logical values.




Programming with Objects: A Comparative Presentation of Object Oriented Programming with C++ and Java
ISBN: 0471268526
Year: 2005
Pages: 273
Authors: Avinash Kak