DB2 Universal Database's Data Types If you stop and think about it, most of the "data" you encounter on a day-to-day basis falls into distinct categories. The money you buy coffee with and the change you get back is numerical in nature; the email messages you read and the replies you send back are composed of character strings; and many of the things you do, such as attending meetings, eating dinner, and going to bed revolve around time. Most of the data that gets stored in a DB2 UDB database can be categorized in a similar manner. To ensure that all data is stored as efficiently as possible, DB2 UDB comes equipped with a rich assortment of built-in data types. (In fact, 19 different built-in data types are available.) DB2 UDB also provides facilities that can be used to create an infinite number of user -defined data types, which can in turn be used to store complex, nontraditional data that might be found in a complex computing environment. The built-in data types available with DB2 UDB are classified according to the type of data they have been designed to hold: -
Numeric data -
Character string data -
Date/Time data -
Large object data In addition to these "traditional" types of data, special data types used with DataLinks and DB2 UDB Extenders are also available. Numeric Data Types As the name suggests, numeric data types are used to store numeric values ” specifically , numeric values that have a sign and a precision. The sign is considered positive if the value is greater than or equal to zero and negative if the value is less than zero, while the precision is the actual number of digits used to present the value. Numeric data is stored using a fixed amount of storage space, and the amount of space required increases as the precision of the number goes up. Numeric data types include: Small integer. The small integer data type is used to store numeric values that have a precision of 5 digits or less. The range for small integer values is “32,768 to 32,767 and 2 bytes of storage space is required for every small integer value stored. (Positive numbers have one less value in their range because they start at the value 0 while negative numbers start at “1.) The term SMALLINT is used to denote the small integer data type. Integer. The integer data type is used to store numeric values that have a precision of 10 digits. The range for integer values is “2,147,483,648 to 2,147,483,647, and 4 bytes of storage space is required for every integer value stored. The terms INTEGER and INT are used to denote the integer data type. Big integer. The big integer data type is used to store numeric values that have a precision of 19 digits. The range for big integer values is “9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and 8 bytes of storage space is required for every big integer value stored. The term BIGINT is used to denote the big integer data type. (This data type is typically used on systems that provide native support for 64-bit integers; on such systems, processing large numbers that have been stored as big integers is much more efficient, and any calculations performed are more precise.) Decimal. The decimal data type is used to store numeric values that contain both whole and fractional parts , separated by a decimal point. The exact location of the decimal point is determined by the precision and the scale of the value (the scale is the number of digits used by the fractional part). The maximum precision allowed for decimal values is 31 digits, and the corresponding scale must be a positive number less than the precision of the number. The amount of storage space needed to store a decimal value can be determined by solving the following equation: Precision · 2 (truncated) + 1 = Bytes required. (For example, the value 67.12345 has precision of 7, 7 · 2 is 3, + 1 makes 4; therefore, 4 bytes are required to store the value 67.12345.) The terms DECIMAL, DEC, NUMERIC, and NUM are used to denote the decimal data type. Single-precision floating-point. The single-precision floating-point data type is used to store a 32-bit approximation of a real number. This number can be zero, or it can fall within the range “3.402E+38 to “1.175E “37 or 1.175E “37 to 3.402E+38. Each single-precision floating-point value can be up to 24 digits in length, and 4 bytes of storage space is required for every value stored. The terms REAL and FLOAT are used to denote the single-precision floating-point data type. Double-precision floating-point. The double-precision floating-point data type is used to store a 64-bit approximation of a real number. This number can be zero, or it can fall within the range “1.79769E+308 to “2.225E “307 or 2.225E “307 to 1.79769E+308. Each double-precision floating-point value can be up to 53 digits in length, and 8 bytes of storage space is required for every value stored. The terms DOUBLE, DOUBLE PRECISION, and FLOAT are used to denote the double-precision floating-point data type. Character String Data Types Character string data types are used to store values comprised of one or more alphanumeric characters. Together, these characters may form a word, a sentence , a paragraph, or a complete document. A variety of character string data types are available. Deciding on which one to use for a given situation primarily depends on the storage requirements of the data value to be stored. Character string data types include: Fixed-length character string. The fixed-length character string data type is used to store character string values that are between 1 and 254 characters in length. The amount of storage space needed to store a fixed-length character string value can be determined by solving the following equation: ( Number of characters x 1) = Bytes required. (A fixed amount of storage space is allocated, even if all of the space allocated is not needed ”short strings are padded with blanks.) The terms CHARACTER and CHAR are used to denote the fixed-length character string data type. Varying-length character string. The varying-length character string data type is used to store character string values that are up to 32,672 characters in length. However, the actual length allowed is governed by the tablespace page size used. For tables that reside in tablespaces that use 4K pages, varying-length character string values cannot be more than 4,092 characters in length; for tables that reside in a tablespaces that use 8K pages, varying-length character string values cannot be more than 8,188 characters in length, and so on. The amount of storage space needed to store a varying-length character string value can be determined by solving the following equation: ( Number of characters x 1) + 4 = Bytes required. (Only the amount of storage space actually needed, plus four bytes for an "end-of-string" marker, is allocated ”strings are not blank padded.) The terms CHARACTER VARYING, CHAR VARYING, and VARCHAR are used to denote the varying-length character string data type. Long varying-length character string. The long varying-length character string data type is used to store character string values that are up to 32,700 characters in length, regardless of the tablespace page size used. The amount of storage space needed to store a long varying-length character string value can be determined by solving the following equation: ( Number of characters x 1) + 24 = Bytes required. The term LONG VARCHAR is used to denote the long varying-length character string data type. Fixed-length double-byte character string. The fixed-length double-byte character string data type is used to store DBCS (double-byte character set) character string values that are up to 127 characters in length. (Most Asian character sets are double-byte character sets.) The amount of storage space needed to store a fixed-length double-byte character string value can be determined by solving the following equation: ( Number of characters x 2) = Bytes required. The term GRAPHIC is used to denote the fixed-length double-byte character string data type. Varying-length double-byte character string. The varying-length double-byte character string data type is used to store DBCS character string values that are up to 16,336 characters in length. Again, the actual length allowed is governed by the tablespace page size used. For tables that reside in tablespaces that use 4K pages, varying-length double-byte character string values cannot be more than 2,046 characters in length; for tables that reside in a tablespaces that use 8K pages, varying-length double-byte character string values cannot be more than 4,094 characters in length; and so on. The amount of storage space needed to store a varying-length double-byte character string value can be determined by solving the following equation: ( Number of characters x 2) + 4 = Bytes required. The term VARGRAPHIC is used to denote the varying-length double-byte character string data type. Long varying-length double-byte character string. The long varying-length double-byte character string data type is used to store DBCS character string values that are up to 16,350 characters in length, regardless of the tablespace page size used. The amount of storage space needed to store a long varying-length double-byte character string value can be determined by solving the following equation: ( Number of characters x 2) + 24 = Bytes required. The term LONG VARGRAPHIC is used to denote the long varying-length character string data type. Date/Time Data Types Data/Time data types are used to store values that represent dates and times. From a user perspective, these values appear to be character strings; however, they are physically stored as binary packed strings. Date/Time data types include: Date. The date data type is used to store three-part values (year, month, and day) that represent calendar dates. The range for the year portion is 0001 to 9999; the range for the month portion is 1 to 12; and the range for the day portion is 1 to 28, 29, 30 or 31, depending upon the month value specified and whether or not the year specified is a leap year. Externally, date values appear to be fixed-length character string values 10 characters in length. However, only 4 bytes of storage space is required for every date value stored. The term DATE is used to denote the date data type. Time. The time data type is used to store three-part values (hours, minutes, and seconds) that represent time, using a 24- hour clock. The range for the hours portion is 0 to 24; the range for the minutes portion is 0 to 59; and the range for the seconds portion is 0 to 59. Externally, time values appear to be fixed-length character string values 8 characters in length. However, only 3 bytes of storage space is required for every time value stored. The term TIME is used to denote the time data type. Timestamp. The timestamp data type is used to store seven-part values (year, month, day, hours, minutes, seconds, and microseconds) that represent a specific calendar date and time (using a 24-hour clock). The range for the year portion is 0001 to 9999; the range for the month portion is 1 to 12; the range for the day portion is 1 to 28, 29, 30 or 31, depending upon the month value specified and whether or not the year specified is a leap year; the range for the hours portion is 0 to 24; the range for the minutes portion is 0 to 59; the range for the seconds portion is 0 to 59; and the range for the microseconds portion is 0 to 999,999. Externally, timestamp values appear to be fixed-length character string values 26 characters in length (this string is displayed in the form YYYY-MM-DD-HH.MM.SS.NNNNNN, which translates to Year-Month-Day-Hour.Minute.Second.Microseconds). However, only 10 bytes of storage space is required for every timestamp value stored. The term TIMESTAMP is used to denote the timestamp data type. Because the representation of date and time values varies throughout the world, the actual string format used to present a date or a time value is dependant upon the territory code assigned to the database being used. Table 6-1 shows the date and time string formats that are available with DB2 UDB. Table 6-1. DB2 UDB Date and Time Formats Format Name | Abbreviation | Date String Format | Time String Format | International Standards Organization | ISO | YYYY-MM-DD | HH.MM.SS | IBM USA Standard | USA | MM/DD/YYYY | HH:MM AM or PM | IBM European Standard | EUR | DD.MM.YYYY | HH.MM.SS | Japanese Industrial Standard | JIS | YYYY-MM-DD | HH:MM:SS | Site Specific | LOC | Based on database territory and country code | Based on database territory and country code | For date formats, YYYY = Year, MM = Month, and DD = Day; for time formats, HH = Hour, MM = Minute, and SS = Seconds Adapted from Tables 5 and 6 on Pages 101 and 102 of the DB2 SQL Reference “ Volume 1 manual. | Large Object (LOB) Data Types Large object (LOB) data types are used to store large amounts of unstructured data. Large object data types include the following: Binary large object. The binary large object data type is used to store binary data values (such as documents, graphic images, pictures, audio, and video) that are up to 2 gigabytes in size. The terms BINARY LARGE OBJECT and BLOB are used to denote the binary large object data type. The amount of storage space set aside to store a binary large object value is determined by the length specification provided when a binary large object data type is defined. For example, 800 bytes of storage space would be set aside for a BLOB(800) definition. Character large object. The character large object data type is used to store SBCS (single-byte character set) or MBCS (multibyte character set) character string values that are between 32,700 and 2,147,483,647 characters in length. The terms CHARACTER LARGE OBJECT, CHAR LARGE OBJECT, and CLOB are used to denote the character large object data type. The amount of storage space set aside to store a character large object value is determined by the length specification provided when a character large object data type is defined. For example, 800 bytes of storage space would be set aside for a CLOB(800) definition. Double-byte character large object. The double-byte character large object data type is used to store DBCS (double-byte character set) character string values that are between 16,350 and 1,073,741,823 characters in length. The term DBCLOB is used to denote the double-byte character large object data type. The amount of storage space set aside to store a double-byte character large object value is determined by the length specification provided when a double-byte character large object data type is defined. For example, 800 bytes of storage space would be set aside for a DBCLOB(400) definition. DataLinks The DataLink data type is used to store encapsulated values that provide logical references to files that reside outside of the database and are controlled by a DB2 Data Links Server. (The DB2 Data Links Server is an add-on package that takes over control of the native file system on behalf of a DB2 UDB database.) DataLink values serve as "anchor values" that contain reference information allowing a database to establish and maintain a link to external data. The term DATALINK is used to denote the DataLink data type. In File System Migrator (FSM), NT File System (NTFS), Journaled File System (JFS) and UNIX File System (UFS) environments, DataLink values consist of the name of the Data Links Manager server that contains the file to be externally linked, along with the name of the file itself, combined to create a Uniform Resource Locator (URL) address. This address may then be combined with descriptive comment text that can be up to 200 characters in length. However, the final DataLink value produced cannot exceed 200 bytes (200 bytes of storage space is required for every DataLink value stored). Because DataLink values are encapsulated values that contain several pieces of information, they must be constructed , using the built-in function DLVALUE() . Once a DataLink encapsulated value has been created, specific information stored in that value can be updated or retrieved using one of the built-in DataLink functions provided ( DLCOMMENT() , DLLINKTYPE() , DLNEWCOPY() , DLPREVIOUSCOPY() , DLREPLACECONTENT() , DLURLCOMPLETE() , DLURLCOMPLETEONLY() , DLURLCOMPLETEWRITE() , DLURLPATH() , DLURLPATHONLY() , DLURLSCHEME() , and DLURLSERVER() ). Extenders The extender products that are available with DB2 UDB consist of unique sets of user-defined data types and user-defined functions that can be used to store and manipulate nontraditional data such as graphical images and audio/video clips. Since many of the data types provided by the DB2 Extenders are based on built-in data types, the size limits and storage requirements of an extender data type often match those of their built-in data type counterparts ”provided extender data is actually stored in the database. Some of the DB2 Extenders allow their data to be stored in external files that reside outside the database, while the location of the files themselves are stored in a database table. When this is the case, the storage requirements for an extender data type can be much lower than its built-in data type equivalent. User-Defined Data Types In Chapter 4, "Accessing DB2 UDB Data," we saw that user-defined data types (UDTs) are data types that are explicitly created by a database user. A user-defined data type can be a distinct data type that shares a common representation with one of the built-in data types provided with DB2 UDB, or it can be a structured type that consists of a sequence of named attributes, each of which have their own data type. Structured data types can also be created as subtypes of other structured types, thereby defining a type hierarchy. User-defined data types are subject to strong data typing, which means that even though they may share the same representation as other built-in or user-defined data types, the value of one user-defined data type is only compatible with values of that same type (or of other user-defined data types within the same data type hierarchy). As a result, user-defined data types cannot be used as arguments for most built-in functions. However, user-defined functions and operators that duplicate the functionality provided by the built-in functions can be created. |