Preparing the Code for Migration


Programmers often write code for a specific operating system and then port it to others by using a combination of conditional compilation (#ifdef preprocessor commands) and wrapper functions. The following sections describe the steps for porting platform-specific code:

  • Defining the Target Environment

  • Analyzing the Application to be Migrated

  • Making the Source POSIX-Compliant

  • Introducing and Rationalizing Abstraction Layers

  • Using Compile-Time Macros

Defining the Target Environment

A UNIX developer preparing to migrate an application to the Microsoft Windows operating system has the same choices as the developer of a Win32-based application. That is, the Win32 developer must first decide the style of the application. Will it be a text-based application or a Windows GUI-based application?

The UNIX developer follows a similar decision-making process. Will the user invoke the application from the command line and create output files? Will the application be a filter that uses stdin and stdout for input and output? Or will the application need graphical capabilities based on X-Windows or another GUI?

Analyzing the Application to be Migrated

Before migration, the developer analyzes the application at a high level to determine the most appropriate migration approach. (For more information about this analysis, see Chapter 4, Assessment and Analysis.) After beginning the migration, the developer can conduct a more detailed analysis to determine how to migrate the code and scripts.

While the code is still on the UNIX platform, the developer can perform a preliminary analysis to identify any areas of the code that should be treated with special care.

ANSI C/C++ and Fortran Applications

Migrated (ported) code needs to run on a different platform. The developer needs to look at the impediments to this, such as:

  • Differences in compliance with language standards between the UNIX implementations and Windows; examples include:

    • Lack of compliance with adopted scoping rules involving local variable declaration within for loops.

    • Standard Template Library (STL) implementation.

  • Differences between configuration and build tools; examples include:

    • Visual Studio interfaces run only on Windows (the lack of UNIX implementations prevents Visual Studio from being a cross-platform IDE/build tool).

    • The nmake command does not conform to either the Berkeley make or GNU gmake commands.

    • UNIX systems use the autoconf or automake scripts to automatically create makefiles; the Windows equivalent would be the generation of an nmake file from the Visual Studio IDE.

  • Lack of compiler support; for example, lack of Fortran 90/95 support.

The steps for creating the Windows development environment, as described in Chapter 7, Creating the Development Environment, should help with the differences between configuration and build tools. Verify the conversion from the UNIX application's makefile-based build to a Visual Studio-based project. In particular, be sure that:

  • The same flags are passed to the compiler (as a general rule, convert -<anything> to /<anything>).

  • The necessary #defines are the same.

  • The header file and library file paths needed are properly set for the Visual Studio environment.

Because the Visual Studio compiler options are very different from their UNIX/GNU equivalents, migration tools that convert between them across platforms are useful.

ANSI C/C++ with Cross-Platform Library Applications

In addition to the concerns discussed in the previous section, a few additional checks apply to ANSI C/C++ applications that use cross-platform libraries:

  • Analyze and verify that the cross-platform library has the same set of function definitions, return types, argument types, argument ordering, and so on, on the UNIX platforms as on the Windows implementation. They should be the same; otherwise, the library's utility as a cross-platform library is in jeopardy. For example, Rogue Wave Software's Source Pro C++ has a completely compatible set of function definitions, return types, argument types, argument ordering, and so on.

  • Make sure that anything found with -L<anything> on the UNIX application linker command line is listed in the Visual Studio Linker options on the Link tab under Additional library path.

Native UNIX Applications Ported to Interix

As with ANSI C/C++ and Fortran applications, UNIX applications targeted for an Interix port have certain impediments that the developer needs to plan for. For example:

  • Interix only provides GNU C/C++ and Fortran 77 compilers. If a Fortran 90/95 application is to be ported to Interix, third-party compilers need to be considered. For example, Compaq or Intel compilers support Fortran 90/95.

  • Certain make incompatibilities exist (this is an issue between Berkeley make and GNU make). Interix make is based on Berkeley make. Consider porting gmake to Interix (see Table 8.1).

Table 8.1 shows Interix 3.0 ports of compilers and tools that are out of date.

Table 8.1: Out-of-date Interix 3.0 Ports of Compilers and Tools

Tool/library         Interix version              UNIX version  Comment
-------------------  ---------------------------  ------------  -------
autoconf / automake  Not provided out-of-the-box                Can be ported by developer from GNU Web site.
gmake (GNU)          Not provided out-of-the-box  3.79.1        Can be ported by developer from GNU Web site. Recommended for an application that makes significant use of gmake (for example, Apache).
gcc                  cygnus-2.7.2                 2.96          Name spaces are mostly broken in this version of the C++ compiler.
tar                                               1.13.19
ar                   cygnus-2.8.1                 2.10.91
make (Berkeley)      R5                           R6            Usually not a problem.
X11
Motif                1.                           1.2

For example, Martin Walenta's Trading Toolkit requires:

  • gcc version 2.95.2

  • tar version 1.12

  • ar version 2.10.90

Analyze the application to determine how strongly it depends on a specific version of a tool. Then decide whether to modify the application's dependency or to port the required version of the tool before porting the application.

Native UNIX Applications Rewritten to Win32

If the UNIX application has a dependency on UNIX shell scripts (for example, ksh or csh), consider including Interix in the ported environment to support this functional requirement.

Making the Source POSIX-Compliant

Writing standard POSIX code is the best strategy for producing migratable code because all UNIX platforms are compliant with POSIX.1 and POSIX.2 standards.

The following sections look at some key areas where you should make your code POSIX-compliant to ease the migration of the code.

Strictly Conforming POSIX.1 Applications

A strictly conforming POSIX.1 application requires only the facilities described in the POSIX.1 standard and applicable language standards. A strictly conforming POSIX.1 application:

  • Does not rely on any behavior described in ISO/IEC 9945-1 as unspecified or implementation-defined.

  • Uses only those facilities described in the standard. However, because the behavior of some of those facilities varies across implementations, such an application might need modification to run on different platforms.

Applications at this level should be able to move across implementations with just a recompilation.

For more information about the POSIX.1 programming environment, see these resources:

  • Lewine, Donald. POSIX Programmer's Guide. O'Reilly & Associates, 1991.

  • Stevens, W. Richard. Advanced Programming in the UNIX Environment. Addison-Wesley, 1992.

  • Zlotnick, Fred. The POSIX.1 Standard: A Programmer's Guide. Benjamin/Cummings, 1991.

System Information

Different systems provide different ways of obtaining information about the system. On BSD systems, the sysctl interface provides access to system information. On System V systems, the sysinfo call provides system information. However, these interfaces are nonstandard, and their implementation varies between UNIX implementations.

For example, on Solaris sysinfo gets and sets system information strings, such as:

    #include <sys/systeminfo.h>
    long sysinfo(int command, char *buf, long count);

On Linux, sysinfo returns information on overall system statistics, such as:

    #include <sys/sysinfo.h>
    int sysinfo(struct sysinfo *info);

Developers can access system information by using POSIX routines. The POSIX system information routines return strings, paths, and numeric values, including two-valued Boolean conditions. The header file limits.h contains macros that define system limits at compile time; the routines shown in Table 8.2, declared in unistd.h, retrieve the corresponding values at run time.

Table 8.2: Macros that Define Systems Limits

Function             Description
-------------------  -----------
confstr              Retrieves string values. The only portable POSIX.2-compliant value for name is _CS_PATH, a value for PATH guaranteed to find the standard utilities. Prototype is:
                     size_t confstr(int name, char *buf, size_t len);
fpathconf, pathconf  Retrieves configurable path variables, such as the maximum size of a file name or the maximum link count. Used in System V programming. Prototypes are:
                     long pathconf(const char *path, int name);
                     long fpathconf(int fd, int name);
sysconf              Retrieves system information that can provide answers to questions such as: Is job control available? Are POSIX options supported? What are the limits for bc? What is the maximum number of bytes allowed for an argument to exec? Prototype is:
                     long sysconf(int name);

Note  

The Interix Software Development Kit (SDK) also provides both uname and gethostbyname.

The following code example uses the POSIX sysconf function:

    /* syssample.C: This program illustrates usage of the sysconf function */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* retrieve the system information */
        long sinfo;

        sinfo = sysconf(_SC_VERSION);
        printf("Version supported: %ld\n", sinfo);    /* sysconf returns a long */
        sinfo = sysconf(_SC_LINE_MAX);
        printf("Maximum line length: %ld\n", sinfo);
        return 0;
    }

The following output is obtained from the program on Interix:

    % ./syssample
    Version supported: 199009
    Maximum line length: 2048
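The other two routines from Table 8.2 can be exercised in the same way. The following is a minimal sketch, not a definitive implementation; the helper names standard_utility_path and max_file_name_length are hypothetical and chosen only for illustration.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns a malloc'ed copy of the _CS_PATH string,
   or NULL on failure. The caller frees the result. */
char *standard_utility_path(void)
{
    /* A first call with a zero-length buffer reports the size needed,
       including the terminating null byte. */
    size_t len = confstr(_CS_PATH, NULL, 0);
    if (len == 0)
        return NULL;

    char *buf = malloc(len);
    if (buf == NULL)
        return NULL;

    if (confstr(_CS_PATH, buf, len) == 0) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical helper: longest file name allowed in directory dir,
   or -1 if the limit is indeterminate or the call fails. */
long max_file_name_length(const char *dir)
{
    return pathconf(dir, _PC_NAME_MAX);
}
```

Asking confstr for the required length first avoids guessing a buffer size, which is the same concern that motivates the size argument of the other POSIX interfaces in this section.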

Advisory File Locking

Table 8.3 shows the traditional application programming interfaces (APIs) for file locking.

Table 8.3: APIs for File Locking

API    Standard
-----  --------
fcntl  POSIX. However, POSIX.1 specifies only the F_DUPFD, F_GETFD, F_SETFD, F_GETFL, F_SETFL, F_GETLK, F_SETLK and F_SETLKW operations.
lockf  System V. (Built on fcntl.)
flock  BSD. (Built on fcntl.)

Note  

Any lock created by Interix is advisory; that is, it is not enforced by the operating system. These locks have no effect in the Win32 environment. Advisory locks allow cooperating processes to perform consistent operations on files, but consistency is not guaranteed (that is, processes may still access files without using advisory locks, possibly resulting in inconsistencies).
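As an illustration of the POSIX interface, the following sketch places and releases an exclusive advisory lock on a whole file with fcntl and F_SETLK. The function names lock_whole_file and unlock_whole_file are hypothetical, and the sketch assumes the descriptor was opened for writing.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to take an exclusive advisory lock on the
   whole file. Returns 0 on success, -1 if another process holds a
   conflicting lock or on error (F_SETLK does not wait). */
int lock_whole_file(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;  /* l_start is relative to the file start */
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/* Hypothetical helper: release the lock taken above. */
int unlock_whole_file(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}
```

Because the lock is advisory, it only protects the file against other processes that also call fcntl before touching it.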

Current Working Directory

Legacy code might use getwd to determine the current working directory. It is better to use the POSIX getcwd interface, which includes a size argument to prevent buffer overflows on the returned directory path.
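A minimal sketch of the getcwd call follows; the wrapper name current_directory is hypothetical. When the buffer is too small, getcwd fails instead of writing past the end of it.

```c
#include <unistd.h>

/* Hypothetical wrapper: copies the current working directory into buf.
   Returns 0 on success, or -1 on failure (for example, when size is
   too small for the path, in which case errno is ERANGE). */
int current_directory(char *buf, size_t size)
{
    return getcwd(buf, size) != NULL ? 0 : -1;
}
```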

Error Messages

System error messages are available in POSIX through the strerror and perror calls. Traditional systems expose the underlying sys_errlist array and store the number of array elements in the variable sys_nerr; portable code should call strerror or perror instead of indexing the array directly.
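For example, code that used to index sys_errlist can format its message through strerror instead. This is only a sketch; the function name describe_error is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "what: message" into out, where message
   is the system text for error number err. Returns the snprintf result
   (the length the full message needs, excluding the null byte). */
int describe_error(char *out, size_t size, const char *what, int err)
{
    return snprintf(out, size, "%s: %s", what, strerror(err));
}
```

A caller would typically pass the value of errno saved immediately after the failing call, for example describe_error(msg, sizeof msg, "open", errno).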

The gets Function

The gets function is a known security risk. Because gets cannot limit how much it reads, an input line longer than the buffer overflows it, which results in undefined behavior and gives an attacker a way to deliver code to the application for execution. For this reason, use fgets instead of gets. The Interix implementation of gets prints the following warning whenever the gets function is called:

 warning: this program uses gets(), which is unsafe. 
Note  

You can disable the warning by setting the environment variable DISABLE_GETS_WARNING.
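A call to gets can usually be replaced with a small fgets wrapper such as the following sketch. The name read_line is hypothetical; unlike gets, fgets keeps the newline, so the wrapper strips it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for gets: reads at most size - 1 characters
   of one line from in, stores them in buf, and removes the trailing
   newline if one was read. Returns buf, or NULL at end of file or on
   error. Input beyond size - 1 characters stays in the stream. */
char *read_line(char *buf, int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```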

Terminal I/O

The Interix Software Development Kit (SDK) extends the POSIX.1 set of flags for c_iflag to include IMAXBEL and VBELTIME. For c_cc, VMIN and VTIME do not have the same values as VEOF and VEOL. When you create a portable application, however, take into account that VMIN and VTIME can be identical to VEOF and VEOL on a POSIX.1 system.

The new functions shown in Table 8.4 replace the terminal I/O ioctl calls, such as ioctl(fd, TIOCSETP, buf) and ioctl(fd, TIOCGETP, buf), and the stty and gtty calls. They were introduced because the data type of the final argument for terminal I/O ioctl calls depends on the requested action, which makes type checking impossible.

Table 8.4: New Functions that Replace Terminal I/O ioctl Calls

Function        Description
--------------  -----------
tcgetattr()     Fetches attributes (termios structure)
tcsetattr()     Sets attributes (termios structure)
cfgetispeed()   Gets input speed
cfgetospeed()   Gets output speed
cfsetispeed()   Sets input speed
cfsetospeed()   Sets output speed
tcdrain()       Waits for all output to be transmitted
tcflow()        Suspends transmit or receive
tcflush()       Flushes pending I/O
tcsendbreak()   Sends BREAK character
tcgetpgrp()     Gets foreground process group identifier (ID)
tcsetpgrp()     Sets foreground process group ID

If you need to get the window size, the TIOCGWINSZ command for ioctl and the winsize structure are both supported.

The TERMIO terminal hardware structure in System V and the STTY terminal hardware structure in BSD have been replaced in POSIX with the TERMIOS structure and a new set of access calls.

The TERMIOS model is very similar to the System V model. Two modes exist: canonical and noncanonical. Canonical input is line-based, like BSD cooked mode. Noncanonical mode is character-based, like BSD raw or cbreak mode. The Interix subsystem includes a true noncanonical mode, with support for c_cc[VMIN] and c_cc[VTIME].

The TERMIOS structure is defined in termios.h as follows:

    struct termios {
        tcflag_t c_iflag;   /* input mode         */
        tcflag_t c_oflag;   /* output mode        */
        tcflag_t c_cflag;   /* control mode       */
        tcflag_t c_lflag;   /* local mode         */
        speed_t  c_ispeed;  /* input speed        */
        speed_t  c_ospeed;  /* output speed       */
        cc_t     c_cc[NCCS]; /* control characters */
    };
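The following sketch shows how tcgetattr and tcsetattr might be used to switch a terminal to noncanonical mode. The function names make_noncanonical and enter_noncanonical are hypothetical, and disabling ECHO along with ICANON is an assumption about what a character-at-a-time program typically wants.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: edits settings in memory so that input is
   delivered character by character. min_chars and tenths become
   c_cc[VMIN] and c_cc[VTIME] (VTIME counts tenths of a second). */
void make_noncanonical(struct termios *t, cc_t min_chars, cc_t tenths)
{
    t->c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    t->c_cc[VMIN] = min_chars;
    t->c_cc[VTIME] = tenths;
}

/* Hypothetical helper: applies noncanonical mode to the terminal on fd
   and stores the original settings in saved so the caller can restore
   them later with tcsetattr. Returns 0 on success, or -1 on error
   (for example, when fd is not a terminal). */
int enter_noncanonical(int fd, struct termios *saved)
{
    struct termios t;

    if (tcgetattr(fd, saved) == -1)
        return -1;
    t = *saved;
    make_noncanonical(&t, 1, 0);   /* return after each single byte */
    return tcsetattr(fd, TCSANOW, &t);
}
```

Fetching the current settings and modifying a copy, rather than building a termios structure from scratch, leaves implementation-specific flags untouched.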

Signals

The POSIX.1 committee introduced new signal semantics because of problems with BSD and System V signal implementations.

When the System V3 signal function catches a signal, the action associated with the signal is reset to the default. In BSD 4.3, it is not reset. In the International Organization for Standardization/American National Standards Institute (ISO/ANSI) C standard, the signal function either resets the action to the default or does an implementation-defined blocking of the signal.

The POSIX sigaction call does not reset the default if the handler returns normally. The Interix SDK follows the POSIX signal semantics.

Because an Interix SDK process has a signal mask, it can block any (or all) of the signals from arriving, except for SIGKILL or SIGSTOP, which cannot be caught or ignored. A process starts with a signal mask inherited from its parent. If any signals are generated and then blocked by the signal mask, they go into the set of pending signals.

In code that uses the signal function, the signal is masked when its handler is invoked and remains masked until the mask is cleared.

Note  

This can be a significant problem if the code uses a longjmp call from the handler routine. Using the sigaction call directly, together with sigsetjmp and siglongjmp, corrects some of the unexpected behavior.

Time

For portability, use the POSIX.1 utime function instead of utimes, which is no longer supported. The arguments and semantics for these functions are slightly different.

The syntax for the utimes function is as follows:

    int utimes(const char *path, const struct timeval *times);

The second argument of the utime function, times, is a pointer to a struct utimbuf:

    int utime(const char *path, const struct utimbuf *times);

The utimes function succeeds when used with writable files, but utime used with a non-null argument does not succeed unless the user executing the process owns the file.

The utimbuf structure is defined in utime.h. It contains the following two members:

    time_t actime;   /* access time */
    time_t modtime;  /* modification time */
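The following sketch sets both time stamps of a file with utime and reads the modification time back with stat. The function names touch_with_time and modification_time are hypothetical, and creating the file first is only a convenience for the example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

/* Hypothetical helper: creates path if it does not exist, then sets
   its access and modification times to when (seconds since the Epoch).
   Returns 0 on success, -1 on error. */
int touch_with_time(const char *path, time_t when)
{
    struct utimbuf times;
    int fd = open(path, O_WRONLY | O_CREAT, 0600);

    if (fd == -1)
        return -1;
    close(fd);

    times.actime = when;    /* access time */
    times.modtime = when;   /* modification time */
    return utime(path, &times);
}

/* Hypothetical helper: returns the modification time of path,
   or (time_t)-1 if stat fails. */
time_t modification_time(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return (time_t)-1;
    return st.st_mtime;
}
```

Note that utimbuf holds whole seconds in time_t members, whereas the timeval array taken by utimes also carries microseconds, so converted code loses sub-second precision.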

Processes and Threads

The setpgrp interface is obsolete. In UNIX implementations, setpgrp functionality is duplicated by other functions. Both BSD and System V have a setpgrp call, but each has different semantics and behaviors.

The System V setpgrp call changes the caller's process group ID to its own process ID, and releases the controlling terminal of the calling process. The System V call takes no arguments.

The BSD setpgrp call detaches a process from its process group, but does not change the controlling terminal. The BSD call takes two arguments, a process ID and a process group ID.

The System V setpgrp call is properly replaced by the POSIX setsid call. When a process calls setsid successfully, it creates a new session, which in turn contains a new process group that contains one process: the calling process. The calling process is now the session and process group leader. The process also disposes of its controlling terminal. (In fact, the setsid call is used to dispose of a controlling terminal.)

The BSD setpgrp call is replaced by the POSIX setpgid call, which has identical semantics.
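The two replacements can be sketched as follows. The enum detach_mode and the function child_detaches are hypothetical names; the work is done in a forked child because setsid fails when the caller is already a process group leader.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum detach_mode { NEW_SESSION, NEW_PROCESS_GROUP };

/* Hypothetical helper: forks a child that either starts a new session
   (replacing System V setpgrp) or moves itself into its own process
   group (replacing BSD setpgrp). Returns 1 if the child ended up as
   the leader of its process group, 0 if it did not, -1 on error. */
int child_detaches(enum detach_mode mode)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        int ok;

        if (mode == NEW_SESSION)
            ok = (setsid() == getpid());      /* new session and group */
        else
            ok = (setpgid(0, 0) == 0);        /* new group, same session */
        ok = ok && (getpgrp() == getpid());
        _exit(ok ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```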

Introducing and Rationalizing Abstraction Layers

Writing a custom library of functions for each platform can help to abstract nonportable code. In this case, the application always calls the private version of the function, which is linked to a platform-specific library. Creating and linking the platform-specific libraries entails a great deal of work, but it could be the appropriate method for a large body of source code.

For example, different libraries (such as func_Ix.lib and func_Ux.lib) can implement the func1 custom library function (instead of a system call) for the application that is to be ported. Then, when compiling on a particular system, the appropriate library is linked.
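As a sketch of this arrangement, the block below shows what the UNIX-side library might contain and how the application calls it. The meaning given to func1 here (returning the memory page size) and the name pages_needed are assumptions made only for illustration.

```c
#include <unistd.h>

/* Shared declaration: the private interface the application codes
   against on every platform. What func1 does is hypothetical. */
long func1(void);

/* UNIX/Interix library (for example, func_Ux.lib or func_Ix.lib):
   implemented with the POSIX sysconf call. A Win32 library would
   export the same func1, implemented with Win32 calls instead. */
long func1(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Application code: calls only func1 and never the system interface
   directly, so it compiles unchanged on each platform. */
long pages_needed(long bytes)
{
    long page = func1();

    return (bytes + page - 1) / page;   /* round up to whole pages */
}
```

The application source never mentions sysconf, so the platform decision is made once, at link time, rather than at every call site.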

Using Compile-Time Macros

Many applications use #ifdef statements to isolate platform-specific sections of code. Some older code written using #ifdef statements is based on assumptions about the platform that are no longer valid. For example, code built around #ifdef BSD usually tries to include sgtty.h rather than termios.h. Labeling blocks of code with the platform name often amounts to aiming for a moving target. For example, BSD 4.4 has some different APIs than BSD 4.3, but they are both BSD.

POSIX specifies its own set of compile-time macros and manifest constants, which are not defined in either System V or BSD systems. These macros and constants are unique to POSIX and are defined in the specified include files.
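For example, code can test _POSIX_VERSION, which unistd.h defines on POSIX systems, instead of a platform name. The sketch below is illustrative only: the function names are hypothetical, and it assumes the build platform provides unistd.h at all.

```c
#include <unistd.h>   /* defines _POSIX_VERSION on POSIX systems */

/* Hypothetical helper: reports which terminal interface this build
   selected. The choice depends on a feature (POSIX support), not on
   a platform label such as BSD. */
const char *terminal_interface(void)
{
#if defined(_POSIX_VERSION)
    return "termios";
#else
    return "sgtty";
#endif
}

/* Hypothetical helper: the POSIX.1 revision the headers claim to
   support (for example, 199009L), or -1 if none is advertised. */
long posix_version(void)
{
#ifdef _POSIX_VERSION
    return _POSIX_VERSION;
#else
    return -1;
#endif
}
```

Testing for the feature keeps the condition valid when a vendor changes its APIs between releases, which is the moving-target problem that platform-name tests run into.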




UNIX Application Migration Guide
Unix Application Migration Guide (Patterns & Practices)
ISBN: 0735618380
Year: 2003
Pages: 134
