Section 3.12. 32- to 64-Bit Migration | UNIX to Linux Porting: A Comprehensive Reference

3.12. 32- to 64-Bit Migration

64-bit Linux platforms are well on their way to replacing aging 32-bit servers. A 64-bit application environment can substantially improve memory addressing capacity and throughput for applications that manipulate very large data structures. Linux on IBM's PowerPC and AMD's 64-bit architecture provides a unique platform in that 32- and 64-bit applications can run side by side without any performance loss. This advantage can be exploited by compiling an application for 64-bit when a 32-bit environment limits the memory available to it.

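With gcc, use the -m64 flag to generate 64-bit object files, as in the following example (the -c option stops after compilation so that the output is an object file rather than a linked executable):

$ gcc -m64 -c sample.c -o sample.o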

Note that on some platforms, such as IBM PowerPC, gcc produces 32-bit objects by default even when the Linux platform is a 64-bit platform. On the AMD 64-bit architecture running 64-bit Linux, gcc produces 64-bit objects by default. Also, as on UNIX platforms, 64-bit object code works only with other 64-bit object code. 32-bit and 64-bit object code cannot be mixed in the same application space because of address conflicts.

The 32-bit and 64-bit compilation environments use different data type models. The C data type model for 32-bit applications is the ILP32 model. The letters stand for the int (I) and long (L) types, and pointers (P). The number 32 denotes that these data types are all 32 bits wide. The data type model for 64-bit applications is the LP64 model: int remains 32 bits, while the long (L) and pointer (P) types grow to 64 bits. The remaining C integer types and the floating-point types are the same size in both data type models.
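The difference between the two models can be checked directly with sizeof. The following is a minimal sketch, not part of the original text; the helper names data_model_name and print_data_model are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: names the data model implied by three type sizes. */
const char *data_model_name(size_t int_size, size_t long_size, size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return "ILP32";
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return "LP64";
    return "other";
}

/* Prints the sizes the current compilation mode actually uses. */
void print_data_model(void)
{
    printf("int=%zu long=%zu pointer=%zu -> %s\n",
           sizeof(int), sizeof(long), sizeof(void *),
           data_model_name(sizeof(int), sizeof(long), sizeof(void *)));
}
```

Calling print_data_model() from main in a program built with -m64 on an LP64 platform should report int=4 long=8 pointer=8 -> LP64. The same source built as a 32-bit application should report ILP32.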

3.12.1. Common Migration Mistakes

Data type mismatches are among the most common code incompatibilities, and they arise from both endianness and 32- to 64-bit issues. Many existing 32-bit applications assume that the int type, the long type, and pointers are all the same size. Because the sizes of long and pointer change in the LP64 data model, this assumption is the principal cause of ILP32-to-LP64 migration problems. Isolate the source of these incompatibilities as early as the analysis step of the porting process.

3.12.1.1. Assuming int and ptr Are the Same in LP64

In an LP64 environment, pointer (ptr) data types are 64 bits long. Failure to account for this difference from a 32-bit environment will result in at least a compiler warning (or worse, undefined application behavior).

Consider the following example:

char *p;
char *q;
p = (char *) malloc(sizeof(long)*4);
q = (char *) ((int)p & 0x4000);
..

$ gcc -m64 1.c -o foo
1.c: In function 'main':
1.c:9: warning: cast from pointer to integer of different size
1.c:9: warning: cast to pointer from integer of different size

Fix this by changing the int cast to long or, even better, to uintptr_t from stdint.h, an unsigned integer type intended to hold a pointer value.
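A minimal sketch of the uintptr_t approach follows. The helper name mask_pointer is hypothetical and is not part of the original example; it only illustrates that the mask is applied at pointer width in both ILP32 and LP64.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: applies a bit mask to a pointer value.
 * uintptr_t is wide enough for a pointer in both ILP32 and LP64,
 * so no address bits are lost in the round trip. */
char *mask_pointer(char *p, uintptr_t mask)
{
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer of the same width */
    return (char *)(bits & mask);    /* integer -> pointer, no size mismatch */
}
```

With this helper, the assignment from the earlier fragment becomes q = mask_pointer(p, 0x4000), which should compile without the size-mismatch warnings under -m64.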

3.12.1.2. Overlooking int and long Data Type Size Difference

Code written in an ILP32 environment often assumes that the long and int data types are the same size, and the errors this assumption causes are easy to overlook.

Consider Example 3-7.

Example 3-7. Listing of bad_1.c

#include <stdio.h>

int main(void)
{
    long trigger = 1<<31;   /* the shift is evaluated as int, then widened to long */
    printf("%lx\n", trigger);
    return 0;
}

When compiled and run as a 32-bit application (the default on platforms such as PowerPC; add -m32 where gcc defaults to 64-bit output):

$ gcc bad_1.c -o foo
$ ./foo
80000000

When compiled and run as a 64-bit application:

$ gcc -m64 bad_1.c -o foo
$ ./foo
ffffffff80000000

A related hazard involves sizeof. The sizeof operator yields a value of type size_t. Because size_t is a 64-bit unsigned type in an LP64 environment, be careful not to pass this value to a function expecting an int as a parameter. Otherwise, truncation may occur.
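One way to guard against this truncation is to check the range before narrowing. The following is a sketch under the assumption that the receiving interface really requires an int; the helper name size_to_int is hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: narrows a size_t to int only when the value fits.
 * Returns 1 and stores the result in *out on success.
 * Returns 0 and leaves *out untouched if the value would be truncated. */
int size_to_int(size_t n, int *out)
{
    if (n > (size_t)INT_MAX)
        return 0;
    *out = (int)n;
    return 1;
}
```

Where the interface can be changed, declaring the parameter as size_t avoids the conversion altogether.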

3.12.1.3. Ignoring Sign Extension

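Example 3-7 also demonstrates a sign extension problem that surfaces when converting to LP64. The ISO C integral promotion rules state that a character, a short integer, or an integer bit field, all either signed or unsigned, or an object of enumeration type, may be used in an expression wherever an integer may be used. If an int can represent all the values of the original type, the value is converted to int; otherwise, the value is converted to unsigned int. In Example 3-7, the constant 1 has type int, so with gcc the expression 1<<31 yields a 32-bit int with its sign bit set. When that value is assigned to a 64-bit long, the sign bit is extended through the upper 32 bits, which produces ffffffff80000000.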

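To fix this problem, change 1<<31 to 1L<<31 so that the shift is performed in the 64-bit long type. Writing 1UL<<31 and storing the result in an unsigned long goes a step further, because it also avoids shifting into the sign bit of a 32-bit long under ILP32.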
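The sketch below contrasts the two behaviors. It uses 1UL rather than 1L, which is a variation on the fix given above chosen so that the shift is also well defined when long is 32 bits. The function names are hypothetical.

```c
#include <limits.h>

/* Builds the bit-31 mask in unsigned long, so no sign bit is involved. */
unsigned long bit31_mask(void)
{
    return 1UL << 31;
}

/* Reproduces the sign extension in a well-defined way: an int with only
 * its sign bit set (INT_MIN when int is 32 bits) is widened to long.
 * Under LP64 the upper 32 bits of the result are all ones. */
long widened_int_min(void)
{
    int narrow = INT_MIN;
    return narrow;
}
```

Printing bit31_mask() with %lx should give 80000000 in both ILP32 and LP64 builds, whereas the widened value stays negative and, under LP64, prints with the upper bits set.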

3.12.1.4. Not Checking for String Conversions

Formatted I/O functions, such as printf, sprintf, scanf, and sscanf, use format strings that may need the long size modifier, l (as in %ld or %lx), for long arguments and the %p conversion for pointer arguments. Failure to use these specifications in an LP64 environment can result in unexpected formatting results.
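A short sketch of matching conversion specifications to argument types follows. The helper name format_values is hypothetical. The %zu conversion for size_t is a C99 addition that the text above does not mention; it is included here as an assumption that a C99 library is available.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a long, a size_t, and a pointer with
 * conversions that match each argument's type in ILP32 and LP64.
 * Returns the value returned by snprintf. */
int format_values(char *buf, size_t buflen, long value, size_t count, void *ptr)
{
    /* %ld for long, %zu for size_t (C99), %p for void pointers */
    return snprintf(buf, buflen, "value=%ld count=%zu ptr=%p", value, count, ptr);
}
```

Compiling with gcc -Wall enables format checking, which reports most mismatches between a conversion specification and its argument type.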

3.12.2. Best Practices

The recommended approach for porting a 32-bit application to a 64-bit application is to do the port in two separate phases. The first phase consists of the port from the native platform where the application currently runs (AIX, Solaris, or HP-UX) to Linux. The second phase consists of migrating the ported 32-bit application to a 64-bit application on the Linux platform.