Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms


Definitions

Binary integer values are typically stored in one of three sizes: one byte, two bytes, or four bytes. The ordering of the bytes within an integer depends on the platform (operating environment) on which the integer was produced.

The ordering of bytes differs between the 'big endian' and 'little endian' platforms. These colloquial terms are used to describe byte ordering for IBM mainframes (big endian) and for Intel-based platforms (little endian). In the SAS System, the following platforms are considered big endian: AIX, HP-UX, IBM mainframe, Macintosh, and Solaris. The following platforms are considered little endian: OpenVMS Alpha, Digital UNIX, Intel ABI, and Windows.

How Bytes are Ordered Differently

The byte values in this section are shown in hexadecimal notation. On big endian platforms, the most significant byte is stored first: the value 1 is stored in one byte as 01, in two bytes as 00 01, and in four bytes as 00 00 00 01. On little endian platforms, the least significant byte is stored first: the value 1 is stored in one byte as 01 (the same as big endian), in two bytes as 01 00, and in four bytes as 01 00 00 00.
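The layouts described above can be reproduced outside SAS. The following is a minimal sketch in Python, used here purely for illustration (it is not SAS code). The helper name hex_bytes is hypothetical. It relies on the standard int.to_bytes method to store the value 1 in one, two, and four bytes under each byte order.

```python
def hex_bytes(value, length, byteorder):
    """Return the stored bytes of value as space-separated hex pairs.

    byteorder is "big" or "little"; length is the number of bytes.
    """
    raw = value.to_bytes(length, byteorder=byteorder, signed=True)
    return " ".join(f"{b:02X}" for b in raw)


# Show the value 1 in one, two, and four bytes for both byte orders.
for length in (1, 2, 4):
    big = hex_bytes(1, length, "big")
    little = hex_bytes(1, length, "little")
    print(f"{length} byte(s): big endian {big} | little endian {little}")
```

The one-byte case is identical in both orders because a single byte has nothing to reorder.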

If an integer is negative, the two's complement representation is used, so the high-order bit of the most significant byte is set. For example, -2 is represented in one, two, and four bytes on big endian platforms as FE, FF FE, and FF FF FF FE, respectively. On little endian platforms, the representations are FE, FE FF, and FE FF FF FF. Each of these is the integer binary value -2 shown in hexadecimal notation.
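The two's complement layouts for -2 can be checked with a short Python sketch (again an illustration only, not SAS code). The helper name twos_complement_hex is hypothetical. It uses the standard struct module, where the codes b, h, and i denote signed 1-, 2-, and 4-byte integers, and the prefixes ">" and "<" select big endian and little endian order.

```python
import struct


def twos_complement_hex(value, code, order):
    """Pack value as a signed integer and return its bytes as hex pairs.

    code is a struct type code ("b", "h", or "i");
    order is ">" for big endian or "<" for little endian.
    """
    raw = struct.pack(order + code, value)
    return " ".join(f"{b:02X}" for b in raw)


for code in ("b", "h", "i"):
    print(code,
          "big:", twos_complement_hex(-2, code, ">"),
          "little:", twos_complement_hex(-2, code, "<"))

# The sign bit lives in the most significant byte, which is the first
# byte in big endian order and the last byte in little endian order.
big = struct.pack(">h", -2)
little = struct.pack("<h", -2)
print(bool(big[0] & 0x80), bool(little[-1] & 0x80))
```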

Writing Data Generated on Big Endian or Little Endian Platforms

SAS can read signed and unsigned integers regardless of whether they were generated on a big endian or a little endian system. Likewise, SAS can write signed and unsigned integers in both big endian and little endian format. The length of these integers can be up to eight bytes.
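To make the four combinations (signed or unsigned, big endian or little endian) concrete, here is a minimal Python sketch. It does not show how SAS implements its formats. The function names read_int and write_int are hypothetical, and the eight-byte limit simply mirrors the limit stated above.

```python
def read_int(raw, byteorder, signed):
    """Decode 1 to 8 bytes as an integer in the given byte order."""
    if not 1 <= len(raw) <= 8:
        raise ValueError("length must be between 1 and 8 bytes")
    return int.from_bytes(raw, byteorder=byteorder, signed=signed)


def write_int(value, length, byteorder, signed):
    """Encode an integer into exactly `length` bytes (1 to 8)."""
    if not 1 <= length <= 8:
        raise ValueError("length must be between 1 and 8 bytes")
    return value.to_bytes(length, byteorder=byteorder, signed=signed)


# The same two bytes decode differently depending on the sign setting.
data = bytes.fromhex("FFFE")
print(read_int(data, "big", signed=True))
print(read_int(data, "big", signed=False))
```

Reading the bytes with the wrong byte order or the wrong sign setting does not raise an error; it silently yields a different number, which is why the choice of format matters.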

The following table shows which format to use for various combinations of platforms. In the Sign? column, 'no' indicates that the number is unsigned and cannot be negative. 'Yes' indicates that the number can be either negative or positive.

Table 3.1: SAS Formats and Byte Ordering

Data created for   Data written by   Sign?   Format
----------------   ---------------   -----   -----------------------
big endian         big endian        yes     IB or S370FIB
big endian         big endian        no      PIB, S370FPIB, S370FIBU
big endian         little endian     yes     S370FIB
big endian         little endian     no      S370FPIB
little endian      big endian        yes     IBR
little endian      big endian        no      PIBR
little endian      little endian     yes     IB or IBR
little endian      little endian     no      PIB or PIBR
big endian         either            yes     S370FIB
big endian         either            no      S370FPIB
little endian      either            yes     IBR
little endian      either            no      PIBR
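The rows of Table 3.1 in which the data can be written by either kind of platform identify the formats that produce the same bytes wherever the program runs. As a rough illustration, the following Python sketch encodes only those four rows as a lookup. The names PORTABLE_FORMATS and portable_format are hypothetical, and the sketch only builds a format name such as IBR4.; it does not call SAS.

```python
# Rows of Table 3.1 where "Data written by" is "either".
# Key: (byte order the data is created for, whether the value is signed).
PORTABLE_FORMATS = {
    ("big", True): "S370FIB",
    ("big", False): "S370FPIB",
    ("little", True): "IBR",
    ("little", False): "PIBR",
}


def portable_format(target_order, signed, width):
    """Return a format name with width, for example 'IBR4.'.

    target_order is "big" or "little"; width is the length in bytes.
    """
    if not 1 <= width <= 8:
        raise ValueError("width must be between 1 and 8 bytes")
    return f"{PORTABLE_FORMATS[(target_order, signed)]}{width}."


print(portable_format("big", True, 2))
print(portable_format("little", False, 4))
```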

Integer Binary Notation and Different Programming Languages

The following table compares integer binary notation according to programming language.

Table 3.2: Integer Binary Notation and Programming Languages

Language        2 Bytes                                                          4 Bytes
-------------   --------------------------------------------------------------   --------------------------------------------------------------
SAS             IB2., IBR2., PIB2., PIBR2., S370FIB2., S370FIBU2., S370FPIB2.    IB4., IBR4., PIB4., PIBR4., S370FIB4., S370FIBU4., S370FPIB4.
PL/I            FIXED BIN(15)                                                    FIXED BIN(31)
FORTRAN         INTEGER*2                                                        INTEGER*4
COBOL           COMP PIC 9(4)                                                    COMP PIC 9(8)
IBM assembler   H                                                                F
C               short                                                            long
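The two widths in Table 3.2 determine the range of values that each notation can hold. The Python sketch below computes those ranges from the byte count. The function names signed_range and unsigned_range are hypothetical, and the signed case assumes the two's complement representation described earlier. As an informal cross-check, a signed two-byte integer has 15 magnitude bits plus a sign bit, and a signed four-byte integer has 31, which appears to match the PL/I precisions shown in the table.

```python
def signed_range(n_bytes):
    """Return (min, max) for a two's complement integer of n_bytes."""
    bits = 8 * n_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(n_bytes):
    """Return (min, max) for an unsigned integer of n_bytes."""
    return 0, 2 ** (8 * n_bytes) - 1


for n_bytes in (2, 4):
    print(n_bytes, "bytes: signed", signed_range(n_bytes),
          "unsigned", unsigned_range(n_bytes))
```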




SAS 9.1 Language Reference Dictionary, Volumes 1, 2 and 3
Year: 2004
Pages: 704
