

5.11. Selected AIX/Linux System API Comparisons

This section presents a selected set of system APIs and compares their AIX and Linux implementations in a man-page format. The APIs were chosen based on how much they differ between AIX and Linux. In the previous section, the only differences were ERRNO returns; the APIs listed in this section differ in ways that go beyond ERRNOs. In some cases, the information returned by AIX and Linux differs significantly and must be addressed early in the porting cycle.

Like a man page, each API listed in this section lists the API name, prototype for AIX and Linux, parameter listing, and the return value of the routine. In addition, a "Detail Comparison" section is included in each API section explaining the differences in the API between AIX and Linux (with examples and code snippets). If the API is not compatible between AIX and Linux, it is clearly marked with a "**Not compatible**" identifier. In such cases, examples and analytical reasoning are given as to why it is not compatible. Some are identified as "**Compatible**" but with differences, which will also be explained.

In some cases, additional operating system-specific (that is, AIX or Linux) information pertinent to the porting process follows the "Detail Comparison" section.

5.11.1. getfsent(), getfsfile(), getfstype(), getfsspec()

These calls retrieve information about a filesystem. See "Additional Data: Linux-Specific" for Linux-specific information.

5.11.1.1. AIX Prototype

#include <fstab.h>

struct fstab *getfsent();
struct fstab *getfsspec(char *special);
struct fstab *getfsfile(char *file);
struct fstab *getfstype(char *type);
void setfsent();
void endfsent();


5.11.1.2. Linux Prototype

#include <fstab.h>

void endfsent(void);
struct fstab *getfsent(void);
struct fstab *getfsfile(const char *mount_point);
struct fstab *getfsspec(const char *special_file);
int setfsent(void);


5.11.1.3. Detail Comparison

**Not Compatible**

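AIX and Linux describe their filesystems differently, so the calls return different information. Both platforms return a pointer to a struct fstab, but the members of the structure differ. Note also that the Linux prototype list has no getfstype(), and that setfsent() returns void on AIX but int on Linux.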

AIX

struct fstab {
    char *fs_spec;      /* block special device name */
    char *fs_file;      /* file system path prefix */
    char *fs_type;      /* read/write, etc see above defines */
    int fs_check;       /* true=0, false=-1, else "check" val */
    int fs_freq;        /* not used */
    int fs_passno;      /* not used */
};


Linux

struct fstab {
    char *fs_spec;         /* block device name */
    char *fs_file;         /* mount point */
    char *fs_vfstype;      /* filesystem type */
    char *fs_mntops;       /* mount options */
    const char *fs_type;   /* rw/rq/ro/sw/xx option */
    int fs_freq;           /* dump frequency, in days */
    int fs_passno;         /* pass number on parallel dump */
};


5.11.1.4. Additional Data: Linux-Specific

Here the field fs_type contains (on a *BSD system) one of the five strings: rw, rq, ro, sw, xx (read-write, read-write with quotas, read-only, swap, ignore).

The function setfsent() opens the file when required and positions it at the first line.

The function getfsent() parses the next line from the file (after opening it when required).

The function endfsent() closes the file when required.

The function getfsspec() searches the file from the start and returns the first entry found for which the fs_spec field matches the special_file argument.

The function getfsfile() searches the file from the start and returns the first entry found for which the fs_file field matches the mount_point argument.

5.11.1.5. Return Value

Upon success, the functions getfsent(), getfsfile(), and getfsspec() return a pointer to a struct fstab, whereas setfsent() returns 1. Upon failure or end of file, these functions return NULL and 0, respectively.
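As a minimal sketch of the iteration described above, the following walks the filesystem table and prints only the three members that appear in both structures (fs_spec, fs_file, and fs_type). The function name list_fstab is hypothetical, and the _AIX preprocessor test is an assumption about how the AIX build is detected; it is there because setfsent() returns void in the AIX prototype and int in the Linux prototype.

```c
#include <fstab.h>
#include <stdio.h>

/* Prints one line per filesystem table entry using only the members
 * common to AIX and Linux. Returns the number of entries printed,
 * or -1 if the table could not be opened (detectable on Linux only). */
int list_fstab(FILE *out)
{
    struct fstab *fs;
    int count = 0;

#ifdef _AIX                      /* assumption: defined by AIX compilers */
    setfsent();                  /* AIX prototype: returns void */
#else
    if (setfsent() == 0)         /* Linux: 0 means the table could not be opened */
        return -1;
#endif
    while ((fs = getfsent()) != NULL) {
        fprintf(out, "%s on %s (%s)\n", fs->fs_spec, fs->fs_file, fs->fs_type);
        count++;
    }
    endfsent();
    return count;
}
```

Code that needs fs_vfstype or fs_mntops (Linux) or fs_check (AIX) would have to be placed under the same kind of conditional compilation.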

5.11.2. ioctl()

The ioctl function manipulates the underlying device parameters of special files. In particular, many operating characteristics of character special files (for example, terminals) may be controlled with ioctl requests. The argument d must be an open file descriptor.

An ioctl request has encoded in it whether the argument is an in parameter or out parameter and the size of the argument argp in bytes. Macros and defines used in specifying an ioctl request are located in the file sys/ioctl.h.

5.11.2.1. AIX Prototype

#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

int ioctl(int FileDescriptor, int Command, void *Argument);


5.11.2.2. Linux Prototype

#include <sys/ioctl.h>

int ioctl(int d, int request, ...);


5.11.2.3. Detail Comparison

**Not Compatible**

In Linux, command values are generally driver-specific. This function is not considered portable.

AIX

In AIX, the ioctl subroutine performs a variety of control operations on the object associated with the specified open file descriptor. This function is typically used with character or block special files, sockets, or generic device support such as the termio general terminal interface.

Parameters

  • FileDescriptor. Specifies the open file descriptor for which the control operation is to be performed.

  • Command. Specifies the control function to be performed. The value of this parameter depends on which object is specified by the FileDescriptor parameter.

  • Argument. Specifies additional information required by the function requested in the Command parameter. The parameter is declared as a void pointer, but its actual data type is object-specific: it typically points to a device-specific data structure, although some devices treat it as an integer value instead. This parameter is passed on to the object associated with the specified open file descriptor.

Linux

The ioctl function manipulates the underlying device parameters of special files. In particular, many operating characteristics of character special files (for example, terminals) may be controlled with ioctl requests.

  • Argument d. Must be an open file descriptor.

  • Request. The second argument is a device-dependent request code.

  • Third argument(s). The third argument is an untyped pointer to memory. It is traditionally char *argp (from the days before void * was valid C).

An ioctl request has encoded in it whether the argument is an in parameter or out parameter, and the size of the argument argp in bytes. Macros and defines used in specifying an ioctl request are located in the file sys/ioctl.h.

5.11.2.4. Return Value

Usually, on success, 0 is returned. A few ioctls use the return value as an output parameter and return a nonnegative value on success. On error, -1 is returned, and ERRNO is set appropriately.
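The calling convention and the -1 error return can be illustrated with a small sketch. It assumes the FIONREAD request (number of unread bytes) is available through sys/ioctl.h on the target platform; request codes are driver-specific, so this must be verified for each port. The function names bytes_pending and demo_pending are hypothetical.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns the number of unread bytes on fd, or -1 with errno set.
 * FIONREAD is assumed to be supported for this descriptor type. */
int bytes_pending(int fd)
{
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) == -1)   /* third argument: pointer to result */
        return -1;
    return n;
}

/* Writes five bytes into a fresh pipe and asks how many are waiting. */
int demo_pending(void)
{
    int p[2];
    int n;

    if (pipe(p) == -1)
        return -1;
    if (write(p[1], "hello", 5) != 5)
        n = -1;
    else
        n = bytes_pending(p[0]);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Because the third argument is untyped on both platforms, the compiler cannot check that it matches what the driver expects; that check has to be done by reading the driver documentation on each side.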

5.11.2.5. Errors
  • EBADF. d is not a valid descriptor.

  • EFAULT. argp references an inaccessible memory area.

  • ENOTTY. d is not associated with a character special device.

  • ENOTTY. The specified request does not apply to the kind of object that the descriptor d references.

  • EINVAL. Request or argp is not valid.

5.11.2.6. ERRNO(s) Not Supported in Linux
  • EINTR. A signal was caught, and the process had not enabled restartable subroutines for the signal.

  • ENODEV. The FileDescriptor parameter is associated with a valid character or block special file, but the supporting device driver does not support the ioctl function.

  • ENXIO. The FileDescriptor parameter is associated with a valid character or block special file, but the supporting device driver is not in the configured state.

5.11.3. read(), write()

To read from a file descriptor into a buffer or to write from a buffer to a file descriptor, use the read() and write() functions.

5.11.3.1. AIX Prototype

#include <unistd.h>

ssize_t read(int FileDescriptor, void *Buffer, size_t NBytes);
ssize_t write(int fd, const void *buf, size_t NBytes);


5.11.3.2. Linux Prototype

#include <unistd.h>

ssize_t read(int fd, void *buf, size_t NBytes);
ssize_t write(int fd, const void *buf, size_t NBytes);


5.11.3.3. Detail Comparison

**Compatible**

The functions are source-compatible, but the ERRNO values they return differ. Refer to the "Errors" section that follows. On most UNIX systems, O_NONBLOCK is commonly set on socket descriptors; a read() on such a descriptor returns -1 and sets ERRNO to EAGAIN when no data is currently available, which tells the caller to try again later. For regular files and disks, Linux considers the underlying storage fast enough that O_NONBLOCK is unnecessary. As a result, O_NONBLOCK may have no effect on such descriptors on some flavors of Linux.
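The EAGAIN behavior can be observed with a pipe, which honors O_NONBLOCK. The following is a sketch only; the function name empty_pipe_reports_eagain is hypothetical, and the check also accepts EWOULDBLOCK because some systems define it separately from EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if reading an empty nonblocking pipe fails with EAGAIN
 * (or EWOULDBLOCK), 0 if it behaves otherwise, -1 on setup failure. */
int empty_pipe_reports_eagain(void)
{
    int p[2];
    int flags;
    int result = -1;
    char c;
    ssize_t n;

    if (pipe(p) == -1)
        return -1;
    flags = fcntl(p[0], F_GETFL);
    if (flags != -1 && fcntl(p[0], F_SETFL, flags | O_NONBLOCK) != -1) {
        n = read(p[0], &c, 1);           /* nothing has been written yet */
        result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```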

5.11.3.4. Additional Info: AIX/Linux
  • FileDescriptor. A file descriptor identifying the object to be read and written to.

  • Buffer. Points to the buffer.

  • NBytes. Specifies the number of bytes to read from or write to the file descriptor. In the case of a write, this is the number of bytes in the buffer to be written.

If NBytes is 0, read() returns 0 and has no other results. If NBytes is greater than SSIZE_MAX, the result is unspecified.

5.11.3.5. Linux Restrictions

On NFS filesystems, reading small amounts of data updates the timestamp only the first time; subsequent calls may not do so. This is caused by client-side attribute caching, because most if not all NFS clients leave atime updates to the server, and client-side reads satisfied from the client's cache will not cause atime updates on the server because there are no server-side reads. UNIX semantics can be obtained by disabling client-side attribute caching, but in most situations this substantially increases server load and decreases performance.

5.11.3.6. Return Value

On success, the number of bytes read is returned (0 indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested. This may happen, for example, because fewer bytes are actually available right now (perhaps because the file position is close to end of file, or because the read is from a pipe or a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and ERRNO is set appropriately. In this case, it is left unspecified whether the file position (if any) changes.
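Because short reads and signal interruptions are legal on both platforms, portable callers usually wrap read() in a loop. The following sketch shows one common way to do so; read_full is a hypothetical helper name, not part of either system's API.

```c
#include <errno.h>
#include <unistd.h>

/* Reads until `want` bytes have arrived, end of file is reached, or an
 * error other than EINTR occurs. Returns the number of bytes read
 * (possibly fewer than `want` at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, (char *)buf + got, want - got);

        if (n == 0)                 /* end of file */
            break;
        if (n == -1) {
            if (errno == EINTR)     /* interrupted before any data: retry */
                continue;
            return -1;
        }
        got += (size_t)n;           /* short read: keep going */
    }
    return (ssize_t)got;
}
```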

5.11.3.7. Errors
  • EINTR. The call was interrupted by a signal before any data was read.

  • EAGAIN. Nonblocking I/O has been selected using O_NONBLOCK, and no data was immediately available for reading.

  • EIO. I/O error. This happens, for example, when the process is in a background process group, tries to read from its controlling tty, and either it is ignoring or blocking SIGTTIN or its process group is orphaned. It may also occur when there is a low-level I/O error while reading from a disk or tape.

  • EISDIR. fd refers to a directory.

  • EBADF. fd is not a valid file descriptor or is not open for reading.

  • EINVAL. fd is attached to an object that is unsuitable for reading.

  • EFAULT. buf is outside your accessible address space.

Other errors may occur, depending on the object connected to fd. POSIX allows a read that is interrupted after reading some data to return -1 (with ERRNO set to EINTR) or to return the number of bytes already read.

Conforming To

SVr4, SVID, AT&T, POSIX, X/OPEN, BSD 4.3

ERRNO(s) Not Supported in Linux

  • EBADMSG. The file is a STREAM file that is set to control-normal mode, and the message waiting to be read includes a control part.

  • EDEADLK. A deadlock would occur if the calling process were to sleep until the region to be read was unlocked.

  • EOVERFLOW. An attempt was made to read from a regular file where NBytes was greater than 0 and the starting offset was before the end of file and was greater than or equal to the offset maximum established in the open file description associated with file descriptor.

  • ENXIO. A request was made of a nonexistent device or the request was outside the capabilities of the device.

  • ESPIPE. fd is associated with a pipe or FIFO.

  • ETIMEDOUT. The connection timed out.

5.11.4. confstr()

The confstr() function provides a method for applications to get configuration-defined string values. Its use and purpose are similar to the sysconf() function; however, confstr() is used where string values (rather than numeric values) are returned.

5.11.4.1. AIX Prototype

#include <unistd.h>

size_t confstr(int name, char *buf, size_t len);


5.11.4.2. Linux Prototype

#define _POSIX_C_SOURCE 2
    or
#define _XOPEN_SOURCE
#include <unistd.h>

size_t confstr(int name, char *buf, size_t len);


5.11.4.3. Detail Comparison

**Compatible**

Functions are source-compatible, but _POSIX_C_SOURCE or _XOPEN_SOURCE must be turned on in Linux.
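A sketch of the usual two-call pattern follows: the first call, with a NULL buffer and a length of 0, asks for the required size, and the second call fills the buffer. The helper name confstr_dup is hypothetical. The feature-test macro is defined only if neither macro is already set, which is an assumption about how the surrounding build is configured.

```c
#if !defined(_POSIX_C_SOURCE) && !defined(_XOPEN_SOURCE)
#define _POSIX_C_SOURCE 2        /* required on Linux; harmless on AIX */
#endif
#include <stdlib.h>
#include <unistd.h>

/* Returns a malloc'ed copy of the configuration string `name`,
 * or NULL if the name has no value or memory is exhausted.
 * The caller frees the result. */
char *confstr_dup(int name)
{
    size_t len = confstr(name, NULL, 0);   /* size needed, including '\0' */
    char *buf;

    if (len == 0)
        return NULL;
    buf = malloc(len);
    if (buf == NULL)
        return NULL;
    if (confstr(name, buf, len) == 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

For example, confstr_dup(_CS_PATH) yields the default search path for the standard utilities.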

5.11.5. opendir()

The opendir() function opens a directory stream corresponding to the directory named by the name argument.

5.11.5.1. AIX Prototype

#include <dirent.h>

DIR *opendir(const char *name);
struct dirent *readdir(DIR *DirectoryPointer);


5.11.5.2. Linux Prototype

#include <dirent.h>

DIR *opendir(const char *name);
struct dirent *readdir(DIR *dir);


5.11.5.3. Detail Comparison

**Compatible**

Functions are source-compatible, but there are differences between the dirent structures in the two systems:

| Linux struct dirent | AIX struct dirent |
|---------------------|-------------------|
| d_ino               | d_ino             |
| d_off               | d_offset          |
| d_reclen            | d_reclen          |
| d_type              | d_namelen         |
| d_name[]            | d_name[]          |
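One way to stay clear of the structure differences is to touch only d_name and to derive the name length with strlen() instead of reading d_namelen. The following is a sketch under that assumption; count_entries is a hypothetical helper name.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Counts the entries in `path` other than "." and "..", and stores the
 * length of the longest entry name in *longest. Returns -1 if the
 * directory cannot be opened. Only d_name is used, so the code does
 * not depend on d_off/d_offset, d_type, or d_namelen. */
long count_entries(const char *path, size_t *longest)
{
    DIR *dir = opendir(path);
    struct dirent *e;
    long count = 0;

    if (dir == NULL)
        return -1;
    *longest = 0;
    while ((e = readdir(dir)) != NULL) {
        size_t len = strlen(e->d_name);   /* portable stand-in for d_namelen */

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        if (len > *longest)
            *longest = len;
        count++;
    }
    closedir(dir);
    return count;
}
```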


5.11.5.4. ERRNO

When called in AIX, this function can return these ERRNOs not documented in Linux:

  • EBADF. Indicates that the DirectoryPointer argument does not refer to an open directory stream (readdir).

  • ENAMETOOLONG. Indicates that the length of the name argument exceeds the PATH_MAX value, or a path name component is longer than the NAME_MAX value while the POSIX_NO_TRUNC value is in effect (opendir).

5.11.6. readdir()

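On Linux, the readdir() system call is deprecated and has been superseded by getdents(). The C library readdir() function shown in the previous section remains the portable interface for application code. Refer to the Linux documentation for more information on getdents().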

5.11.7. fcntl()

The fcntl subroutine performs controlling operations on the open file specified by the FileDescriptor (fd) parameter. If Network File System (NFS) is installed on your system, the open file can reside on another node. The fcntl subroutine is used to do the following:

  • Duplicate open file descriptors

  • Set and get the file-descriptor flags

  • Set and get the file-status flags

  • Manage record locks

  • Manage asynchronous I/O ownership

  • Close multiple files

The following commands are supported for all file types and are passed via the cmd parameter:

  • F_DUPFD

    Returns a new file descriptor that is the lowest numbered available (that is, not already open) file descriptor greater than or equal to the specified argument, which is of type int. The new file descriptor refers to the same open file description as the original file descriptor and shares any locks. The FD_CLOEXEC flag associated with the new file descriptor is cleared to keep the file open across calls to one of the exec() family of functions. The return value is the new file descriptor on success or -1 on error.

  • F_SETFD

    Sets the file descriptor flags for the specified file descriptor. The argument is the new set of flags, as a variable of type int. File descriptor flags are associated with a single file descriptor and do not affect other file descriptors that refer to the same file. The return value is 0 on success or -1 on error. Any additional bits set from the flags specified for F_GETFD are ignored. If any bits not defined here are specified, behavior is undefined. The following file descriptor flag may be set:

  • FD_CLOEXEC

    If set, the file descriptor is closed when one of the exec() family of functions is called. If not set, the file descriptor is inherited by the new process image.

  • F_GETFD

    Gets the file descriptor flags for the specified file descriptor. This command takes no argument. File descriptor flags are associated with a single file descriptor and do not affect other file descriptors that refer to the same file. The return value is the current file descriptor flags on success, or -1 on error. In addition to the flags specified for F_SETFD, the following flags may be returned:

    FD_MANDATORYLOCK

    Mandatory locking is enabled for the file referred to by the specified file descriptor.

    FD_ADVISORYLOCK

    Advisory locking is enabled for the file referred to by the specified file descriptor.

    FD_DIRECTORY

    The specified file descriptor refers to a directory.

  • F_SETFL

    Sets the file status flags for the specified file descriptor. The argument is the new set of flags, as a variable of type int. These flags are as specified for the oflag argument to open(), along with the additional values specified later. Bits corresponding to the file access mode and any other oflag bits not listed here are ignored. If any bits not defined here or in open() are set, behavior is undefined. The return value is 0 on success, or -1 on error. The following file status flags can be changed with F_SETFL:

    O_APPEND

    Valid only for file descriptors that refer to regular files. The file pointer is moved to the end of the file before each write.

    O_ASYNC

    Valid only for file descriptors that refer to sockets and communications ports. If enabled for a file descriptor, and an owning process/process group has been specified with the F_SETOWN command to fcntl(), a SIGIO signal is sent to the owning process/process group when input is available on the file descriptor.

    O_BINARY

    Sets the file descriptor to binary mode.

    O_LARGEFILE

    Sets the file descriptor to indicate a large-file-aware application.

    O_NDELAY

    Sets the file descriptor to no-delay mode.

    O_NONBLOCK

    Sets the file descriptor to nonblocking mode. The distinction between nonblocking mode and no-delay mode is relevant only for a few types of special files such as pipes and FIFOs. Refer to read() and write() for more information.

    O_SYNC

    Sets the file descriptor to synchronous-write mode. Writes do not return until file buffers have been flushed to disk.

    O_TEXT

    Sets the file descriptor to text mode.

    FAPPEND

    A synonym for O_APPEND.

    FASYNC

    A synonym for O_ASYNC.

    FNDELAY

    A synonym for O_NDELAY.

  • F_GETFL

    Gets the file status flags and file access modes for the specified file descriptor. These flags are as specified for the oflag argument to open(), along with the additional values described for F_SETFL. File status flags and file access modes are associated with the file description and do not affect other file descriptors that refer to the same file with different open file descriptions. The return value is the current file status flags and file access modes on success, or -1 on error. The following macros can be used to access fields of the return value:

    O_ACCMODE

    Extracts the access-mode field, which is one of O_RDONLY, O_RDWR, or O_WRONLY. Refer to the documentation for open() for more information.
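The flag commands above are normally used in a read-modify-write pattern so that bits already set are preserved. The sketch below shows this for FD_CLOEXEC and shows how O_ACCMODE is applied to the F_GETFL result; set_cloexec and access_mode are hypothetical helper names.

```c
#include <fcntl.h>

/* Sets FD_CLOEXEC on fd while preserving any other descriptor flags.
 * Returns 0 on success or -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/* Returns O_RDONLY, O_WRONLY, or O_RDWR for fd, or -1 on error. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;    /* mask off the status flags */
}
```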

The following commands are supported only for sockets and communication ports:

  • F_SETOWN

    Sets the owning process ID or process group ID for the specified file descriptor. The owning process or process group can receive SIGURG signals for out-of-band data on sockets and/or SIGIO signals for readable data on sockets or communications ports. The argument is the process ID or the negative of the process group ID for the owner, as a variable of type pid_t. The return value is 0 on success or -1 on error.

    To receive SIGURG signals, the process should establish a SIGURG handler prior to setting ownership of the file descriptor. A SIGURG signal is generated whenever out-of-band data is received.

    To receive SIGIO signals, the process should establish a SIGIO handler prior to setting ownership of the file descriptor, and then must enable O_ASYNC with the F_SETFL command to fcntl(). A SIGIO signal is generated whenever there is data to be read on the file descriptor.

  • F_GETOWN

    Gets the owning process ID or process group ID for the specified file descriptor. The return value is the owner ID on success, or -1 on error. Behavior is undefined if no owner has been established with F_SETOWN.

The following commands are used for file locking. Locks may be advisory or mandatory. These commands are supported only for regular files.

  • F_GETLK

    Gets the first lock that blocks a lock description for the file to which the specified file descriptor refers. The argument is a pointer to a variable of type struct flock, described later. The structure is overwritten with the returned lock information. If no lock is found that would prevent this lock from being created, the structure is unchanged except for the lock type, which is set to F_UNLCK. The return value is 0 on success or -1 on error.

  • F_GETLK64

    Equivalent to F_GETLK, but takes a struct flock64 argument rather than a struct flock argument.

  • F_SETLK

    Sets or clears a file segment lock for the file to which the specified file descriptor refers. The argument is a pointer to a variable of type struct flock, described later. F_SETLK is used to establish shared (or read) locks (F_RDLCK) or exclusive (or write) locks (F_WRLCK), as well as to remove either type of lock (F_UNLCK). The return value is 0 on success or -1 on error. If the lock cannot be immediately obtained, fcntl() returns -1 with ERRNO set to EACCES.

  • F_SETLK64

    Equivalent to F_SETLK, but takes a struct flock64 argument rather than a struct flock argument.

  • F_SETLKW

    This command is the same as F_SETLK except that if a shared or exclusive lock is blocked by other locks, the thread waits until the request can be satisfied. If a signal that is to be caught is received while fcntl() is waiting for a region, fcntl() is interrupted. Upon return from the signal handler, fcntl() returns -1 with ERRNO set to EINTR, and the lock operation is not done.

  • F_SETLKW64

    Equivalent to F_SETLKW, but takes a struct flock64 argument rather than a struct flock argument.

When a shared lock is set on a segment of a file, other processes can set shared locks on that segment or a portion of it. A shared lock prevents any other process from setting an exclusive lock on any portion of the protected area. A request for a shared lock fails if the file descriptor was not opened with read access.

An exclusive lock prevents any other process from setting a shared lock or an exclusive lock on any portion of the protected area. A request for an exclusive lock fails if the file descriptor is not opened with write access.

The flock and flock64 structures contain the following fields:

  • l_type

    Specifies the type of lock request. Valid settings are as follows:

    F_RDLCK

    Creates a shared lock.

    F_WRLCK

    Creates an exclusive lock.

    F_UNLCK

    Removes a lock.

  • l_whence

    Specifies the starting offset of the lock segment in the file. Valid settings are as follows:

    SEEK_SET

    l_start specifies a position relative to the start of the file.

    SEEK_CUR

    l_start specifies a position relative to the current file offset.

    SEEK_END

    l_start specifies a position relative to the end of the file.

    On a successful return from an F_GETLK or F_GETLK64 command for which a lock was found, the l_whence field is set to SEEK_SET.

  • l_start

    Specifies the relative offset of the start of the lock segment. This setting is used with l_whence to determine the actual start position.

  • l_len

    Specifies the number of consecutive bytes in the lock segment. This value may be negative.

  • l_pid

    On a successful return from an F_GETLK or F_GETLK64 command for which a lock was found, this field contains the process ID of the process holding the lock.

If l_len is positive, the affected area starts at l_start and ends at (l_start + l_len - 1). If l_len is negative, the area affected starts at (l_start + l_len) and ends at (l_start - 1). Locks may start and end beyond the current end of a file but must not be negative relative to the beginning of the file. Setting l_len to 0 sets a lock that can extend to the largest possible value of the file offset for that file. If such a lock also has l_start set to 0 and l_whence set to SEEK_SET, the whole file is locked.
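The range arithmetic in the preceding paragraph can be written out as a small worked example. The function lock_range is hypothetical and exists only to illustrate the rule; it takes the start position after l_whence has already been applied.

```c
#include <sys/types.h>

/* Computes the first and last byte covered by a lock whose absolute
 * start position is `start` and whose l_len is `len`.
 * Returns 0 for a bounded range, or 1 when len is 0, meaning the lock
 * extends from *first to the largest possible file offset
 * (*last is not written in that case). */
int lock_range(off_t start, off_t len, off_t *first, off_t *last)
{
    if (len > 0) {
        *first = start;
        *last = start + len - 1;
        return 0;
    }
    if (len < 0) {
        *first = start + len;
        *last = start - 1;
        return 0;
    }
    *first = start;
    return 1;
}
```

With start 100 and l_len 10 the lock covers bytes 100 through 109; with l_len -10 it covers bytes 90 through 99.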

There can be at most one type of lock set on each byte in the file. Before a successful return from an F_SETLK, F_SETLK64, F_SETLKW, or F_SETLKW64 request when the calling process has previously existing locks on bytes in the region specified by the request, the previous lock type for each byte in the specified region is replaced by the new lock type. As specified earlier, an F_SETLK or F_SETLK64 request fails, and an F_SETLKW or F_SETLKW64 request blocks, when another process has existing locks on bytes in the specified region and the type of any of those locks conflicts with the type specified in the request.

All locks associated with a file for a given process shall be removed when a file descriptor for that file is closed by that process or when the process holding that file descriptor terminates. Locks are not inherited by a child process.

A potential for deadlock occurs if a process controlling a locked region is put to sleep by attempting to lock another process's locked region. If the system detects that sleeping until a locked region is unlocked would cause a deadlock, fcntl() returns -1 with ERRNO set to EDEADLK.
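As a sketch of the locking commands, the following helpers take and release an exclusive lock on a whole file by setting l_whence to SEEK_SET and both l_start and l_len to 0, as described above. The helper names are hypothetical. F_SETLK is used, so the call fails instead of waiting if another process holds a conflicting lock.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Applies a lock of the given type (F_WRLCK, F_RDLCK, or F_UNLCK)
 * to the whole file without blocking. */
static int whole_file_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);   /* clear any platform-specific members */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;      /* l_start is relative to start of file */
    fl.l_start = 0;
    fl.l_len = 0;                /* 0: through the largest possible offset */
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 0 on success or -1 with errno set (EACCES or EAGAIN if the
 * file is locked by another process). fd must be open for writing. */
int lock_whole_file(int fd)
{
    return whole_file_lock(fd, F_WRLCK);
}

int unlock_whole_file(int fd)
{
    return whole_file_lock(fd, F_UNLCK);
}
```

Checking for both EACCES and EAGAIN after a failed F_SETLK keeps the caller independent of which value a given platform returns.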

5.11.7.1. AIX Prototype

#include <fcntl.h>

int fcntl(int FileDescriptor, int cmd, ...);


5.11.7.2. Linux Prototype

#include <unistd.h>
#include <fcntl.h>

int fcntl(int fd, int cmd);
int fcntl(int fd, int cmd, long arg);
int fcntl(int fd, int cmd, struct flock *lock);


5.11.7.3. Detail Comparison

**Compatible**

The user should take note of the different open modes and command values between AIX and Linux. In the preceding section, the command description is mainly Linux-specific. Refer to the "File Bits" section at the end of this chapter.

5.11.7.4. Return Value

For a successful call, the return value depends on the operation performed:

  • F_DUPFD. The new descriptor.

  • F_GETFD. Value of flag.

  • F_GETFL. Value of flags.

  • F_GETOWN. Value of descriptor owner.

  • F_GETSIG. Value of signal sent when read or write becomes possible, or 0 for traditional SIGIO behavior.

  • All other commands. 0.

On error, -1 is returned, and ERRNO is set appropriately.
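For F_DUPFD, the return value is itself the result, so it should be kept and checked against -1 and not simply treated as a status code. A minimal sketch follows; dup_at_least is a hypothetical helper name.

```c
#include <fcntl.h>

/* Duplicates fd onto the lowest free descriptor that is >= min_fd.
 * Returns the new descriptor, or -1 with errno set (for example EBADF
 * for a bad fd, EINVAL for a negative min_fd, or EMFILE when the
 * process has no descriptors left). */
int dup_at_least(int fd, int min_fd)
{
    int newfd = fcntl(fd, F_DUPFD, min_fd);

    if (newfd == -1)
        return -1;
    /* The duplicate starts with FD_CLOEXEC cleared, so it stays open
     * across exec() unless the caller sets the flag again. */
    return newfd;
}
```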

5.11.7.5. Errors
  • EACCES or EAGAIN. Operation is prohibited by locks held by other processes. Or, operation is prohibited because the file has been memory mapped by another process.

  • EBADF. fd is not an open file descriptor, or the command was F_SETLK or F_SETLKW and the file descriptor open mode does not match the type of lock requested.

  • EDEADLK. It was detected that the specified F_SETLKW command would cause a deadlock.

  • EFAULT. The lock is outside your accessible address space.

  • EINTR. For F_SETLKW, the command was interrupted by a signal. For F_GETLK and F_SETLK, the command was interrupted by a signal before the lock was checked or acquired, most likely when locking a remote file (for example, locking over NFS), but it can sometimes happen locally.

  • EINVAL. For F_DUPFD, arg is negative or is greater than the maximum allowable value. For F_SETSIG, arg is not an allowable signal number.

  • EMFILE. For F_DUPFD, the process already has the maximum number of file descriptors open.

  • ENOLCK. Too many segment locks are open, the lock table is full, or a remote locking protocol failed (for example, locking over NFS).

  • EPERM. Attempted to clear the O_APPEND flag on a file that has the append-only attribute set.

5.11.7.6. ERRNOs Not Supported in Linux
  • ESRCH. The value of the Command parameter is F_SETOWN, and the process ID specified as the Argument parameter is not in use.

  • EOVERFLOW. The Command parameter was F_GETLK and the block lock could not be represented in the flock structure.

  • ETIMEDOUT. The connection timed out.

5.11.8. llseek(), lseek(), lseek64()

Sometimes you do not want to start at the beginning of an open file to start reading or writing to it. llseek(), lseek(), and lseek64() reposition the read/write pointer in a file that is associated with a file descriptor (fd).

5.11.8.1. AIX Prototype

#include <sys/types.h>
#include <unistd.h>

offset_t llseek(int fd, offset_t offset, int whence);
off_t lseek(int fd, off_t offset, int whence);
off64_t lseek64(int fd, off64_t offset, int whence);


5.11.8.2. Linux Prototype

#include <sys/types.h>
#include <unistd.h>

off_t lseek(int fildes, off_t offset, int whence);

_syscall5(int, _llseek, uint, fd, ulong, hi, ulong, lo, loff_t *, res, uint, wh);

int _llseek(unsigned int fd, unsigned long offset_high,
            unsigned long offset_low, loff_t *result,
            unsigned int whence);


5.11.8.3. Detail Comparison

**Compatible**

Although llseek is implemented in Linux, if you are writing new code lseek should be used instead of llseek for greater portability between UNIX platforms.

5.11.8.4. Additional Info: AIX/Linux, lseek Only
  • fd. Specifies a file descriptor obtained from a successful open or fcntl subroutine.

  • offset. Specifies a value, in bytes, that is used in conjunction with the whence parameter to set the file pointer. A negative value causes seeking in the reverse direction.

  • whence. Specifies how to interpret the offset parameter by setting the file pointer associated with the fd parameter to one of the following variables:

    SEEK_SET

    Sets the file pointer to the value of the offset parameter.

    SEEK_CUR

    Sets the file pointer to its current location plus the value of the offset parameter.

    SEEK_END

    Sets the file pointer to the size of the file plus the value of the offset parameter.

5.11.8.5. Return Value

Upon successful completion, lseek returns the resulting offset location as measured in bytes from the beginning of the file. Otherwise, a value of (off_t)-1 is returned and ERRNO is set to indicate the error.
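The three whence values and the (off_t)-1 error check can be seen together in a sketch that measures a file's size and then restores the original position. It uses only lseek(), in line with the portability advice above. The names file_size and demo_file_size are hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the size of the file behind fd, leaving the file position
 * unchanged, or (off_t)-1 on error (for example ESPIPE for a pipe). */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);   /* remember current position */
    off_t end;

    if (here == (off_t)-1)
        return (off_t)-1;
    end = lseek(fd, 0, SEEK_END);          /* offset of end == file size */
    if (end == (off_t)-1)
        return (off_t)-1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return (off_t)-1;
    return end;
}

/* Writes five bytes to a temporary file and reports its size. */
off_t demo_file_size(void)
{
    FILE *f = tmpfile();
    off_t size;

    if (f == NULL)
        return (off_t)-1;
    if (write(fileno(f), "hello", 5) != 5) {
        fclose(f);
        return (off_t)-1;
    }
    size = file_size(fileno(f));
    fclose(f);
    return size;
}
```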

5.11.8.6. Errors
  • EBADF. fd is not an open file descriptor.

  • ESPIPE. fd is associated with a pipe, socket, or FIFO.

  • EINVAL. whence is not a proper value.

5.11.8.7. ERRNOs Not Supported in Linux
  • EOVERFLOW. The resulting offset is larger than can be returned properly.

Conforming To

SVr4, POSIX, BSD 4.3

5.11.9. uname()

Occasionally, shell programmers as well as application developers need to know information about the system they are running on at runtime. This information could be used to set specific operating system environment variables and could aid in developing install packages for specific systems. To get the name and information about the current kernel and the machine the kernel is running on, use the uname() function.

5.11.9.1. AIX Prototype

#include <sys/utsname.h>

int uname(struct utsname *Name);


5.11.9.2. Linux Prototype

#include <sys/utsname.h>

int uname(struct utsname *buf);


5.11.9.3. Detail Comparison

**Not Compatible**

AIX does not define a domainname member in the utsname structure. Linux defines domainname only if the program is compiled with _GNU_SOURCE, so code that reads this entry is not portable between the two platforms.
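Code that restricts itself to the members both platforms provide avoids the incompatibility. The sketch below assumes sysname, release, and machine are present on both systems and deliberately leaves domainname alone; describe_system is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Formats "sysname release (machine)" into out.
 * Returns 0 on success or -1 if uname() fails. */
int describe_system(char *out, size_t outlen)
{
    struct utsname u;

    if (uname(&u) < 0)           /* success is any nonnegative value */
        return -1;
    snprintf(out, outlen, "%s %s (%s)", u.sysname, u.release, u.machine);
    return 0;
}
```

An installer could compare u.sysname against "AIX" or "Linux" at runtime to choose platform-specific settings.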

5.11.10. syslog(), closelog(), openlog()

These functions provide messaging services to the system logger:

  • closelog() closes the descriptor being used to write to the system logger. The use of closelog() is optional.

  • openlog() opens a connection to the system logger for a program. The string pointed to by ident is prepended to every message and is typically set to the program name. The option argument specifies flags that control the operation of openlog() and subsequent calls to syslog(). The facility argument establishes a default to be used if none is specified in subsequent calls to syslog(). Values for option and facility are given below. The use of openlog() is optional; it is automatically called by syslog() if necessary, in which case ident defaults to NULL.

  • syslog() generates a log message, which is distributed by syslogd. The priority argument is formed by ORing the facility and the level values (explained below). The remaining arguments are a format, as in printf, and any arguments required by the format, except that the two-character sequence %m is replaced by the error message string strerror(errno). A trailing newline is added when needed.

5.11.10.1. AIX Prototype

#include <syslog.h> void openlog(const char *ident, int LogOption, int Facility); void syslog(int Priority, const char *Value,... ); int closelog(void); 


5.11.10.2. Linux Prototype

#include <syslog.h> void openlog( char *ident, int LogOption, int facility); void syslog( int priority, char *format, ...); void closelog( void ); 


5.11.10.3. Details Comparison

**Compatible**

The facilities will require additional code changes under Linux to make use of the following:

LOG_PERROR

LOG_AUTHPRIV

5.11.11. swapoff(), swapon()

These commands are usually used for device programming for enabling paging on a device. swapon sets the swap area to the file or block device specified by path. swapoff stops swapping to the file or block device specified by path.

5.11.11.1. AIX Prototype

#include <sys/vminfo.h> int swapon (char *PathName); int swapoff(char *PathName); 


5.11.11.2. Linux Prototype

#include <unistd.h> #include <asm/page.h> #include <sys/swap.h> int swapon(const char *path, int swapflags); int swapoff(const char *path); 


5.11.11.3. Details Comparison

**Compatible**

Functions are compatible, but the swapflags must be added for Linux.
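The extra argument can be hidden behind a thin wrapper. This is a sketch under two assumptions: that the compiler predefines __linux__ or _AIX, and that a swapflags value of 0 (no priority requested) is acceptable. The name enable_swap is hypothetical.

```c
#include <errno.h>
#if defined(__linux__)
#include <sys/swap.h>
#elif defined(_AIX)
#include <sys/vminfo.h>
#endif

/* Hypothetical wrapper: enable paging on path.
 * Returns 0 on success, -1 with errno set on failure. */
int enable_swap(const char *path)
{
#if defined(__linux__)
    return swapon(path, 0);         /* 0: no swap priority flags */
#elif defined(_AIX)
    return swapon((char *)path);    /* AIX prototype takes a non-const char * */
#else
    (void)path;
    errno = ENOSYS;                 /* neither platform: report "not supported" */
    return -1;
#endif
}
```

A process without super-user rights should see the call fail, typically with EPERM.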

5.11.11.4. Return Value

On success, 0 is returned. On error, -1 is returned, and ERRNO is set appropriately.

5.11.11.5. Errors

Many other errors can occur if path is not valid:

  • EPERM. The user is not the super user, or more than MAX_SWAPFILES (defined to be 8 in Linux 1.3.6) are in use.

  • EINVAL is returned if path exists but is neither a regular file nor a block device.

  • ENOENT is returned if path does not exist.

  • ENOMEM is returned if there is insufficient memory to start swapping.

5.11.11.6. ERRNO(s) Not Implemented in Linux
  • EINTR. A signal was received during processing of a request.

  • ENOTBLK. A block device is required.

  • ENOTDIR. A component of the PathName prefix is not a directory.

  • ENXIO. No such device address.

5.11.12. acct()

Use this command to switch process accounting on or off.

5.11.12.1. Linux/AIX Prototype

#include <unistd.h> int acct (char *Path); 


5.11.12.2. Detail Comparison

**Compatible**

5.11.12.3. Return Value

On success, 0 is returned. On error, -1 is returned, and ERRNO is set appropriately.

5.11.12.4. Errors
  • EACCES. Write permission is denied for the specified file.

  • EACCES. The argument filename is not a regular file.

  • EFAULT. filename points outside your accessible address space.

  • EIO. Error writing to the file filename.

  • EISDIR. filename is a directory.

  • ELOOP. Too many symbolic links were encountered in resolving filename.

  • ENAMETOOLONG. filename was too long.

  • ENOENT. The specified filename does not exist.

  • ENOMEM. Out of memory.

  • ENOSYS. BSD process accounting was not enabled when the operating system kernel was compiled. The kernel configuration parameter controlling this feature is CONFIG_BSD_PROCESS_ACCT.

  • ENOTDIR. A component used as a directory in filename is not in fact a directory.

  • EPERM. The calling process has no permission to enable process accounting.

  • EROFS. filename refers to a file on a read-only file system.

  • EUSERS. There are no more free file structures or we ran out of memory.

5.11.12.5. ERRNO(s) Not Implemented in Linux
  • EBUSY. An attempt is made to enable accounting when it is already enabled.

5.11.13. mmap(), mmap64(),[17] munmap()

[17] Used on 64-bit machines and for Large File Support.

These commands are memory-mapping functions. There comes a time when you want to read and write to and from files so that the information is shared between processes. Think of it this way: two processes both open the same file and both read and write from it, thus sharing the information. Wouldn't it be easier if you could just map a section of the file to memory and get a pointer to it? Then you could simply use pointer arithmetic to get (and set) data in the file.

Another great advantage to using memory-mapped files is to speed up performance. It would take a lot of I/O resources if more than one process needs to manipulate a large file. You would have to open, read and write to the file, and close it. If the file is already mapped in memory, access to the file is much quicker, which greatly improves I/O access.

mmap(), mmap64(), and munmap() enable you to map and unmap a file in memory to be shared by two or more processes for I/O processing, and they are really easy to use. A few simple calls, mixed with a few simple rules, and you are mapping like a mad person.

5.11.13.1. AIX Prototype

#include <sys/types.h> #include <sys/mman.h> void *mmap (void *addr, size_t len, int prot, int flags, int fildes, off_t off); void *mmap64 (void *addr, size_t len, int prot, int flags, int fildes, off_t off); int munmap (void *addr, size_t len); 


5.11.13.2. Linux Prototype

#include <unistd.h> #include <sys/mman.h> #ifdef _POSIX_MAPPED_FILES void * mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset); void * mmap64(void *start, size_t length, int prot, int flags, int fd, off_t off); int munmap(void *start, size_t length); #endif 


5.11.13.3. Detail Comparison

**Compatible**

Values for prot are identical to AIX. In AIX, mmap() and munmap() are system calls.

The mmap function asks to map length bytes starting at offset from the file (or other object) specified by the file descriptor fd into memory, preferably at address start. Unless MAP_FIXED is used, the start address (addr) is a hint only and is usually specified as 0. The actual place where the object is mapped is returned by mmap and is never 0.

The prot argument describes the desired memory protection (and must not conflict with the open mode of the file). It is either PROT_NONE or is the bitwise OR of one or more of the other PROT_* flags:

  • PROT_EXEC. Pages may be executed.

  • PROT_READ. Pages may be read.

  • PROT_WRITE. Pages may be written.

  • PROT_NONE. Pages may not be accessed.

The flags parameter specifies the type of the mapped object, mapping options, and whether modifications made to the mapped copy of the page are private to the process or are to be shared with other references. Bits (below) are used to tell the system how the file will be mapped:

  • MAP_FIXED

    If start address addr is anything other than 0, do not select a different address than the one specified. If the nonzero specified address cannot be used, mmap will fail. If MAP_FIXED is specified, start must be a multiple of the page size. Use of this option is discouraged since it may prevent a system from making the most effective use of its resources.

  • MAP_SHARED

    Share this mapping with all other processes that map this object. Storing to the region is equivalent to writing to the file. The file may not actually be updated until msync or munmap is called.

  • MAP_PRIVATE

    Create a private copy-on-write mapping. Stores to the region do not affect the original file. It is unspecified whether changes made to the file after the mmap call are visible in the mapped region.

You must specify exactly one of MAP_SHARED and MAP_PRIVATE.

The preceding three flags are described in POSIX.1b (formerly POSIX.4) and SUSv2. Linux also is aware of the following nonstandard flags:

  • MAP_DENYWRITE

    This flag is ignored. (Long ago, it signaled that attempts to write to the underlying file should fail with ETXTBSY, but this was a source of denial-of-service attacks.)

  • MAP_EXECUTABLE

    This flag is ignored.

  • MAP_NORESERVE

    (Used together with MAP_PRIVATE.) Do not reserve swap space pages for this mapping. When swap space is reserved, you have the guarantee that it is possible to modify this private copy-on-write region. When it is not reserved, you might get SIGSEGV upon a write when no memory is available.

  • MAP_LOCKED

    This flag is ignored.

  • MAP_GROWSDOWN

    Used for stacks. Indicates to the kernel VM system that the mapping should extend downward in memory.

  • MAP_ANONYMOUS

    The mapping is not backed by any file; the fd and offset arguments are ignored. This flag in conjunction with MAP_SHARED has been implemented since Linux 2.4.

  • MAP_ANON

    Alias for MAP_ANONYMOUS. Deprecated.

  • MAP_FILE

    Compatibility flag. Ignored.

Some systems document the additional flags MAP_AUTOGROW, MAP_AUTORESRV, MAP_COPY, and MAP_LOCAL.

fd should be a valid file descriptor, unless MAP_ANONYMOUS is set, in which case the argument is ignored.

offset should be a multiple of the page size as returned by getpagesize.

Memory mapped by mmap is preserved across fork(), with the same attributes.

A file is mapped in multiples of the page size. For a file that is not a multiple of the page size, the remaining memory is zeroed when mapped and writes to that region are not written out to the file. The effect of changing the size of the underlying file of a mapping on the pages that correspond to added or removed regions of the file is unspecified.

The munmap system call deletes the mappings for the specified address range and causes further references to addresses within the range to generate invalid memory references. The region is also automatically unmapped when the process is terminated. That said, closing the file descriptor does not unmap the region.

The address start must be a multiple of the page size. All pages containing a part of the indicated range are unmapped, and subsequent references to these pages will generate SIGSEGV. It is not an error if the indicated range does not contain any mapped pages.

For file-backed mappings, the st_atime field for the mapped file may be updated at any time between the mmap() and the corresponding unmapping; the first reference to a mapped page will update the field if necessary.

The st_ctime and st_mtime fields for a file mapped with PROT_WRITE and MAP_SHARED will be updated after a write to the mapped region, and before a subsequent msync(), with the MS_SYNC or MS_ASYNC flag, if one occurs.
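The rules above can be put together in a short sketch. It assumes a writable path chosen by the caller, uses sysconf(_SC_PAGESIZE) instead of getpagesize to obtain the page size, and the function name write_via_mapping is hypothetical. The file is first extended to one page so that stores through the mapping stay inside the file.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create path, size it to one page, and store msg at
 * its start through a shared mapping. Returns 0 on success, -1 on failure. */
int write_via_mapping(const char *path, const char *msg)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    size_t len = strlen(msg);
    int fd, rc = -1;
    void *p;

    if (pagesize <= 0 || len > (size_t)pagesize)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)pagesize) == 0) {
        /* addr 0 lets the system pick the address; offset 0 is page-aligned */
        p = mmap(0, (size_t)pagesize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memcpy(p, msg, len);                    /* store == write to the file */
            if (msync(p, (size_t)pagesize, MS_SYNC) == 0)
                rc = 0;
            if (munmap(p, (size_t)pagesize) == -1)
                rc = -1;
        }
    }
    close(fd);      /* closing fd by itself would not have unmapped the region */
    return rc;
}
```

A second process that maps the same file with MAP_SHARED would see the stored bytes, which is the sharing behavior described at the start of this section.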

5.11.13.4. Return Value

On success, mmap returns a pointer to the mapped area. On error, MAP_FAILED ((void *) -1) is returned, and ERRNO is set appropriately. On success, munmap returns 0; on failure it returns -1, and ERRNO is set (probably to EINVAL).

5.11.13.5. Errors
  • EBADF. fd is not a valid file descriptor (and MAP_ANONYMOUS was not set).

  • EACCES. The file descriptor (fd) refers to a nonregular file. Or MAP_PRIVATE was requested, but fd is not open for reading. Or MAP_SHARED was requested and PROT_WRITE is set, but fd is not open in read/write (O_RDWR) mode. Or PROT_WRITE is set, but the file is append-only.

  • EINVAL. Start, length, or offset are too large or are not aligned on a PAGESIZE boundary.

  • ETXTBSY. MAP_DENYWRITE was set, but the object specified by fd is open for writing.

  • EAGAIN. The file has been locked, or too much memory has been locked.

  • ENOMEM. No memory is available, or the process's maximum number of mappings would have been exceeded.

  • ENODEV. The underlying filesystem of the specified file does not support memory mapping.

Use of a mapped region can result in these signals:

  • SIGSEGV. Attempted write into a region specified to mmap as read-only.

  • SIGBUS. Attempted access to a portion of the buffer that does not correspond to the file (for example, beyond the end of the file, including the case where another process has truncated the file).

5.11.13.6. ERRNO(s) Not Implemented in Linux
  • EFBIG. The mapping requested extends beyond the maximum file size associated with fildes.

  • EMFILE. The application has requested SPEC1170-compliant behavior,[18] and the number of mapped regions would exceed an implementation-dependent limit (per process or per system).

    [18] SPEC1170-compliant means that addr is 0, so let the system allocate a region in memory and return the pointer. If not, use the value specified by addr.

  • ENXIO. The addresses specified by the range (off, off+len) are invalid for the fildes parameter.

  • EOVERFLOW. The mapping requested extends beyond the offset maximum for the file description associated with fildes.

Conforming To

SVr4, POSIX.1b (formerly POSIX.4), 4.4BSD, SUSv2. SVr4 documents additional error codes ENXIO and ENODEV. SUSv2 documents additional error codes EMFILE and EOVERFLOW.

5.11.14. pread(), pwrite()

These functions read from or write to a file descriptor at a given offset.

5.11.14.1. Linux Prototype

#define _XOPEN_SOURCE 500 #include <unistd.h> ssize_t pread(int fd, void *buf, size_t count, off_t offset); ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset); 


5.11.14.2. AIX Prototype

#include <unistd.h> ssize_t pread(int fildes, void *buf, size_t nbyte, off_t offset); ssize_t pwrite (int FileDescriptor, const void *Buffer, size_t NBytes, off_t Offset); 


5.11.14.3. Detail Comparison

**Compatible**

pread() reads up to count bytes from file descriptor fd at offset offset (from the start of the file) into the buffer starting at buf. The file offset is not changed.

pwrite() writes up to count bytes from the buffer starting at buf to the file descriptor fd at offset offset. The file offset is not changed.

The file referenced by fd must be capable of seeking.
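The fact that the file offset stays put can be checked directly. The sketch below is illustrative only: the function name patch_at_offset and the sample data are hypothetical, and _XOPEN_SOURCE is defined defensively for Linux systems that require it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 500   /* must precede every #include to take effect */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite 4 bytes at offset 2 of path, read them back
 * with pread(), and confirm that the file offset never moved.
 * Returns 0 if everything matched, -1 otherwise. */
int patch_at_offset(const char *path)
{
    char back[4];
    int rc = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;
    if (write(fd, "0123456789", 10) == 10) {
        off_t before = lseek(fd, 0, SEEK_CUR);      /* offset is 10 here */
        if (pwrite(fd, "ABCD", 4, 2) == 4 &&
            pread(fd, back, 4, 2) == 4 &&
            back[0] == 'A' && back[3] == 'D' &&
            lseek(fd, 0, SEEK_CUR) == before)       /* offset is still 10 */
            rc = 0;
    }
    close(fd);
    return rc;
}
```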

5.11.14.4. Additional Data: Linux-Specific

Functions are source-compatible, but note the #define _XOPEN_SOURCE 500 for Linux. _XOPEN_SOURCE 500 does not have to be explicitly defined by the user for all versions of Linux.

5.11.14.5. Return Value

On success, the number of bytes read or written is returned (0 indicates that nothing was written, in the case of pwrite, or end of file, in the case of pread), or -1 on error, in which case ERRNO is set to indicate the error.

5.11.14.6. Errors

pread can fail and set ERRNO to any error specified for read or lseek. pwrite can fail and set ERRNO to any error specified for write or lseek.

5.11.14.7. ERRNO(s) Not Implemented in Linux
  • EBADMSG. The file is a STREAM file that is set to control-normal mode, and the message waiting to be read includes a control part.

  • EDEADLK. A deadlock would occur if the calling process were to sleep until the region to be read was unlocked.

  • EOVERFLOW. An attempt was made to read from a regular file where NBytes was greater than 0 and the starting offset was before the end of file and was greater than or equal to the offset maximum established in the open file description associated with FileDescriptor.

  • ENXIO. A request was made of a nonexistent device, or the request was outside the capabilities of the device.

  • ESPIPE. fildes is associated with a pipe or FIFO.

  • ETIMEDOUT. The connection timed out.

Conforming To

Unix98

5.11.15. stat(), lstat(), fstat()

These functions return information about the specified file. You do not need any access rights to the file to get this information, but you need search rights to all directories named in the path leading to the file.

5.11.15.1. AIX Prototype

#include <sys/stat.h> int stat(const char * Path, struct stat * Buffer); int lstat (const char * Path, struct stat * Buffer); int fstat (int FileDescriptor, struct stat * Buffer); 


5.11.15.2. Linux Prototype

#include <sys/types.h> #include <sys/stat.h> #include <unistd.h> int stat(const char *file_name, struct stat * buf); int fstat(int filedes, struct stat * buf); int lstat(const char *file_name, struct stat * buf); 


5.11.15.3. Detail Comparison

**Compatible**

Functions are source-compatible. Code that references dev_t values may not work correctly unless the macros major(), minor(), and makedev() are used. These are found in sys/sysmacros.h.
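As a sketch of the recommended style, the following helper (the name device_numbers is hypothetical) extracts the device numbers through the macros instead of shifting or masking the dev_t value by hand.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor(), makedev() */

/* Hypothetical helper: fetch the device numbers of the filesystem that
 * holds path. Returns 0 on success, -1 if stat() fails (errno is set). */
int device_numbers(const char *path, unsigned long *maj, unsigned long *min)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    *maj = (unsigned long)major(sb.st_dev);   /* layout of dev_t stays opaque */
    *min = (unsigned long)minor(sb.st_dev);
    return 0;
}
```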

5.11.15.4. ERRNO(s) Not Implemented in AIX
  • ESTALE. The root or current directory of the process is located in a virtual filesystem that has been unmounted.

  • ETIMEDOUT. The connection timed out.

  • EIO. An input/output (I/O) error occurred while reading from the filesystem.

5.11.16. ptrace()

The ptrace system call provides a means by which a parent process may observe and control the execution of another process and examine and change its core image and registers. It is primarily used to implement breakpoint debugging and system call tracing.

5.11.16.1. AIX Prototype

#include <sys/reg.h> #include <sys/ptrace.h> #include <sys/ldr.h> int ptrace(int Request, int Identifier, int *Address, int Data, int *Buffer); 


5.11.16.2. AIX5L Prototype

#define _LINUX_SOURCE_COMPAT #include <sys/ptrace.h> long int ptrace(enum ptrace_request request, pid_t pid, void *addr, void *data); 


5.11.16.3. Linux Prototype

#include <sys/ptrace.h> long int ptrace(enum ptrace_request request, pid_t pid, void *addr, void *data); 


5.11.16.4. Detail Comparison

**Not Compatible**

Although the service the API provides is similar on AIX and Linux, the tracing service is implemented differently. In Linux, kernel-level thread debugging is not possible with ptrace as it is in AIX. In AIX, code built with the _LINUX_SOURCE_COMPAT directive is more compatible with the Linux ptrace.

The parent can initiate a trace by calling fork and having the resulting child do a PTRACE_TRACEME, followed (typically) by an exec. Alternatively, the parent may commence trace of an existing process using PTRACE_ATTACH.

While being traced, the child will stop each time a signal is delivered, even if the signal is being ignored. (The exception is SIGKILL, which has its usual effect.) The parent will be notified at its next wait and may inspect and modify the child process while it is stopped. The parent then causes the child to continue, optionally ignoring the delivered signal (or even delivering a different signal instead).

When the parent is finished tracing, it can terminate the child with PTRACE_KILL or cause it to continue executing in a normal, untraced mode via PTRACE_DETACH.

The value of request determines the action to be performed:

  • PTRACE_TRACEME

    Indicates that this process is to be traced by its parent. Any signal (except SIGKILL) delivered to this process will cause it to stop and its parent to be notified via wait. Also, all subsequent calls to exec by this process will cause a SIGTRAP to be sent to it, giving the parent a chance to gain control before the new program begins execution. A process probably should not make this request if its parent is not expecting to trace it. (pid, addr, and data are ignored.)

    The preceding request is used only by the child process; the rest are used only by the parent. In the following requests, pid specifies the child process to be acted on. For requests other than PTRACE_KILL, the child process must be stopped.

  • PTRACE_PEEKTEXT, PTRACE_PEEKDATA

    Read a word at the location addr in the child's memory, returning the word as the result of the ptrace call. Linux does not have separate text and data address spaces, so the two requests are currently equivalent. (The argument data is ignored.)

  • PTRACE_PEEKUSR

    Reads a word at offset addr in the child's USER area, which holds the registers and other information about the process (see sys/user.h). The word is returned as the result of the ptrace call. Typically, the offset must be word-aligned, although this might vary by architecture. (data is ignored.)

  • PTRACE_POKETEXT, PTRACE_POKEDATA

    Copy the word data to location addr in the child's memory. As above, the two requests are currently equivalent.

  • PTRACE_POKEUSR

    Copies the word data to offset addr in the child's user area. As above, the offset must typically be word-aligned. To maintain the integrity of the kernel, some modifications to the user area are disallowed.

  • PTRACE_GETREGS, PTRACE_GETFPREGS

    Copy the child's general purpose or floating-point registers, respectively, to location data in the parent. See sys/user.h for information on the format of this data. (addr is ignored.)

  • PTRACE_SETREGS, PTRACE_SETFPREGS

    Copy the child's general purpose or floating-point registers, respectively, from location data in the parent. As for PTRACE_POKEUSR, some general-purpose register modifications may be disallowed. (addr is ignored.)

  • PTRACE_CONT

    Restarts the stopped child process. If data is nonzero and not SIGSTOP, it is interpreted as a signal to be delivered to the child; otherwise, no signal is delivered. Thus, for example, the parent can control whether a signal sent to the child is delivered or not. (addr is ignored.)

  • PTRACE_SYSCALL, PTRACE_SINGLESTEP

    Restart the stopped child as for PTRACE_CONT, but arrange for the child to be stopped at the next entry to or exit from a system call, or after execution of a single instruction, respectively. (The child will also, as usual, be stopped upon receipt of a signal.) From the parent's perspective, the child will appear to have been stopped by receipt of a SIGTRAP. So, for PTRACE_SYSCALL, for example, the idea is to inspect the arguments to the system call at the first stop, and then do another PTRACE_SYSCALL and inspect the return value of the system call at the second stop. (addr is ignored.)

  • PTRACE_KILL

    Sends the child a SIGKILL to terminate it. (addr and data are ignored.)

  • PTRACE_ATTACH

    Attaches to the process specified in pid, making it a traced "child" of the current process; the behavior of the child is as if it had done a PTRACE_TRACEME. The current process actually becomes the parent of the child process for most purposes (for example, it will receive notification of child events and appears in ps output as the child's parent), but a getppid by the child will still return the pid of the original parent. The child is sent a SIGSTOP, but will not necessarily have stopped by the completion of this call; use wait to wait for the child to stop. (addr and data are ignored.)

  • PTRACE_DETACH

    Restarts the stopped child as for PTRACE_CONT, but first detaches from the process, undoing the reparenting effect of PTRACE_ATTACH and the effects of PTRACE_TRACEME. Although perhaps not intended, under Linux a traced child can be detached in this way regardless of which method was used to initiate tracing. (addr is ignored.)
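The fork, PTRACE_TRACEME, wait, and PTRACE_CONT sequence described above can be sketched as follows for Linux. The function name trace_child_once is hypothetical, and for brevity the child stops itself with raise(SIGSTOP) instead of calling exec. The sketch assumes the environment permits a process to trace its own child.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (Linux): fork a child that asks to be traced, observe
 * its first stop, resume it, and return its exit status (-1 on failure). */
int trace_child_once(int exit_code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {                                     /* child */
        if (ptrace(PTRACE_TRACEME, (pid_t)0, NULL, NULL) == -1)
            _exit(127);                                 /* tracing refused */
        raise(SIGSTOP);                                 /* hand control to the parent */
        _exit(exit_code);
    }

    if (waitpid(pid, &status, 0) == -1)                 /* parent: wait for the stop */
        return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
        return -1;                                      /* child exited instead */

    /* data == NULL: continue without delivering the pending SIGSTOP */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between the first waitpid and PTRACE_CONT the child is stopped, which is where a debugger would issue requests such as PTRACE_PEEKDATA or PTRACE_GETREGS.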

5.11.16.5. ERRNO(s) Not Implemented in Linux
  • ENOTSUP. The request is not supported.

  • EINVAL. The debugger and the traced process are the same; or the Identifier parameter does not identify the thread that caused the exception.

5.11.17. setgid(), setregid()

These functions set the group IDs of the calling process: setgid() sets the effective group ID, and setregid() sets the real and effective group IDs.

5.11.17.1. AIX/Linux Prototype

#include <unistd.h> int setgid(gid_t gid); int setregid(gid_t rgid, gid_t egid); 


5.11.17.2. Detail Comparison

**Compatible**

Functions are source-compatible, but in AIX setgid sets the effective group ID of the current process. In Linux if the caller is the super user, the real and saved group IDs are also set.

setregid sets the real and effective group IDs of the current process. Unprivileged users may only set the real group ID to the real group ID or the effective group ID, and may only set the effective group ID to the real group ID, the effective group ID, or the saved set-group-ID.
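Because the set of IDs that setgid changes is not the same on both systems, ported code can verify the outcome instead of assuming it. The following is a minimal sketch; the function name drop_group_privileges is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: give up any set-group-ID privilege by switching to
 * the real group ID. Returns 0 on success, -1 on failure. */
int drop_group_privileges(void)
{
    gid_t real = getgid();

    if (setgid(real) == -1)
        return -1;
    /* Verify the result: which IDs setgid() changes is platform-dependent. */
    if (getegid() != real || getgid() != real)
        return -1;
    return 0;
}
```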

5.11.17.3. Return Value

On success, 0 is returned. On error, -1 is returned, and ERRNO is set appropriately.

5.11.17.4. Errors
  • EPERM. The user is not the super user (does not have the CAP_SETGID capability), and gid does not match the effective group ID or saved set-group-ID of the calling process.

5.11.17.5. ERRNO(s) Not Implemented in Linux
  • EINVAL. Indicates that the value of the gid parameter is invalid.

5.11.18. sync()

Reading from a disk is very slow compared to accessing (real) memory. In addition, it is common to read the same part of a disk several times during relatively short periods of time. For example, one might first read an e-mail message, then read the letter into an editor when replying to it, and then make the mail program read it again when copying it to a folder. Or, consider how often the command ls might be run on a system with many users. By reading the information from disk only once and then keeping it in memory until no longer needed, one can speed up all but the first read. This is called disk buffering, and the memory used for this purpose is called the buffer cache. sync() commits buffer cache to disk.

5.11.18.1. AIX Prototype

#include <unistd.h> void sync(void); 


5.11.18.2. Linux Prototype

#include <unistd.h> int sync(void); 


5.11.18.3. Detail Comparison

Functions are not quite compatible. In AIX, sync() is a void system call. In Linux, sync() always returns 0. Any code that tests or assigns the return value of sync() will cause compile errors where the function is declared void, so portable code should call sync() without using its result.

5.11.19. wait3(), wait4()

Often in multiprocess programming you want to suspend execution of one process until an event of some sort happens either in the current process or in another process. The wait3 and wait4 functions control the running of processes.

The wait3 function suspends execution of the current process until a child has exited, or until a signal is delivered whose action is to terminate the current process or to call a signal handling function. If a child has already exited by the time of the call (a so-called zombie process), the function returns immediately. Any system resources used by the child are freed.

The wait4 function suspends execution of the current process until a child as specified by the pid argument has exited, or until a signal is delivered whose action is to terminate the current process or to call a signal handling function. If a child as requested by pid has already exited by the time of the call (a zombie process), the function returns immediately. Any system resources used by the child are freed.

5.11.19.1. AIX Prototype

#define _ALL_SOURCE #include <sys/types.h> #include <sys/resource.h> #include <sys/wait.h> pid_t wait3(int *StatusLocation, int Options, struct rusage *ResourceUsage); pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage); 


5.11.19.2. Linux Prototype

#define _USE_BSD #include <sys/types.h> #include <sys/time.h> #include <sys/resource.h> #include <sys/wait.h> pid_t wait3(int *status, int options, struct rusage *rusage); pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage); 


5.11.19.3. Detail Comparison

**Compatible**

The value of pid can be one of the following:

  • < -1

    Wait for any child process whose process group ID is equal to the absolute value of pid.

  • -1

    Wait for any child process; this is equivalent to calling wait3.

  • 0

    Wait for any child process whose process group ID is equal to that of the calling process.

  • > 0

    Wait for the child whose process ID is equal to the value of pid.

The value of options is a bitwise OR of zero or more of the following constants:

  • WNOHANG, which means to return immediately if no child is there to be waited for.

  • WUNTRACED, which means to also return for children that are stopped and whose status has not been reported.

If status is not NULL, wait3 or wait4 stores status information in the location pointed to by status.

This status can be evaluated with the following macros. (These macros take the stat buffer (an int) as an argument, not a pointer to the buffer!)

  • WIFEXITED(status) is nonzero if the child exited normally.

  • WEXITSTATUS(status) evaluates to the least significant 8 bits of the return code of the child that terminated, which may have been set as the argument to a call to exit() or as the argument for a return statement in the main program. This macro can only be evaluated if WIFEXITED returned nonzero.

  • WIFSIGNALED(status) returns true if the child process exited because of a signal that was not caught.

  • WTERMSIG(status) returns the number of the signal that caused the child process to terminate. This macro can only be evaluated if WIFSIGNALED returned nonzero.

  • WIFSTOPPED(status) returns true if the child process that caused the return is currently stopped; this is only possible if the call was done using WUNTRACED.

  • WSTOPSIG(status) returns the number of the signal that caused the child to stop. This macro can only be evaluated if WIFSTOPPED returned nonzero.

If rusage is not NULL, the struct rusage it points to (defined in sys/resource.h or bits/resource.h) will be filled with accounting information.
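The pid, options, status, and rusage arguments come together as in the sketch below. The function name run_child_and_reap is hypothetical, and the feature-test macros at the top are an assumption about what each platform needs to expose wait4.

```c
#if defined(_AIX)
#ifndef _ALL_SOURCE
#define _ALL_SOURCE
#endif
#elif !defined(_DEFAULT_SOURCE)
#define _DEFAULT_SOURCE 1   /* glibc; older releases used _BSD_SOURCE */
#endif
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with code, reap exactly that
 * child with wait4(), and return its exit status (-1 on failure).
 * usage may be NULL if accounting information is not wanted. */
int run_child_and_reap(int code, struct rusage *usage)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                        /* child */

    /* pid > 0: wait for this child only; options 0: block until it exits */
    if (wait4(pid, &status, 0, usage) != pid)
        return -1;
    if (WIFEXITED(status))                  /* macros take the int, not its address */
        return WEXITSTATUS(status);
    return -1;                              /* for example, terminated by a signal */
}
```

A production version would normally retry the wait4 call when it fails with EINTR.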

5.11.19.4. Return Value

These functions return the process ID of the child that exited, -1 on error (in particular, when no unwaited-for child processes of the specified kind exist), or 0 if WNOHANG was used and no child was available yet. In the latter two cases, ERRNO will be set appropriately.

5.11.19.5. Errors
  • ECHILD. No unwaited-for child process as specified exists.

  • EINTR. If WNOHANG was not set and an unblocked signal or a SIGCHLD was caught.

5.11.19.6. ERRNO(s) Not Implemented in Linux
  • EFAULT. The StatusLocation or ResourceUsage parameter points to a location outside of the address space of the process.

5.11.20. getcwd()

The getcwd() function copies the absolute path name of the current working directory to the array pointed to by buf, which is of length size. If the current absolute path name would require a buffer longer than size elements, NULL is returned, and ERRNO is set to ERANGE; an application should check for this error and allocate a larger buffer if necessary.

As an extension to the POSIX.1 standard, getcwd() allocates the buffer dynamically using malloc() if buf is NULL on call. In this case, the allocated buffer has the length size unless size is 0, in which case buf is allocated as big as necessary. It is possible (and, indeed, advisable) to free() the buffers if they have been obtained this way.

get_current_dir_name, which is only prototyped if __USE_GNU is defined, will malloc() an array big enough to hold the current directory name. If the environment variable PWD is set, and its value is correct, that value will be returned.

getwd, which is only prototyped if __USE_BSD is defined, does not malloc() any memory. The buf argument should point to an array at least PATH_MAX bytes long, which receives the absolute path name of the current working directory.

5.11.20.1. AIX Prototype

#include <unistd.h> char *getcwd (char *Buffer, size_t Size); char *getwd(char *buf); 


5.11.20.2. Linux Prototype

#include <unistd.h> char *getcwd(char *buf, size_t size); char *get_current_dir_name(void); char *getwd(char *buf); 


5.11.20.3. Detail Comparison

**Compatible**

AIX does not provide get_current_dir_name().

5.11.20.4. Errors
  • EACCES. Permission to read or search a component of the filename was denied.

  • EFAULT. buf points to a bad address.

  • EINVAL. The size argument is 0, and buf is not a null pointer.

  • ENOENT. The current working directory has been unlinked.

  • ERANGE. The size argument is less than the length of the working directory name. You need to allocate a bigger array (buf) and try again.
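The ERANGE retry advice can be sketched as a loop that grows the buffer until the path fits. The function name current_dir and the starting size of 64 bytes are hypothetical choices. The sketch deliberately avoids the getcwd(NULL, ...) extension so that it relies only on the standard behavior.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return the current working directory in a malloc()ed
 * buffer that the caller must free(), or NULL with errno set. */
char *current_dir(void)
{
    size_t size = 64;
    char *buf = NULL;

    for (;;) {
        char *bigger = realloc(buf, size);
        if (bigger == NULL) {
            free(buf);
            return NULL;                /* allocation failed */
        }
        buf = bigger;
        if (getcwd(buf, size) != NULL)
            return buf;                 /* the path fit */
        if (errno != ERANGE) {          /* EACCES, ENOENT, ...: give up */
            int saved = errno;
            free(buf);
            errno = saved;
            return NULL;
        }
        size *= 2;                      /* buffer too small: grow and retry */
    }
}
```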

5.11.20.5. ERRNO(s) Not Implemented in Linux
  • ENOMEM. Indicates that insufficient storage space is available.

5.11.21. mount(), umount(), vmount()

A file hierarchy is usually made available to the filesystem through the mount command. The umount command detaches the filesystem(s) mentioned from the file hierarchy. A filesystem is specified by giving the directory where it has been mounted. Giving the special device on which the filesystem lives may also work, but this is obsolete, mainly because it will fail in case this device was mounted on more than one directory.

Note that a filesystem cannot be unmounted when it is "busy": for example, when there are open files on it, or when some process has its working directory there, or when a swap file on it is in use. The offending process could even be umount.

5.11.21.1. AIX Prototype

    #include <sys/vmount.h>
    int vmount (struct vmount *VMount, int size);
    int mount(char *Device, char *Path, int Flags);
    int umount (char *Device);


5.11.21.2. Linux Prototype

    #include <sys/mount.h>
    int mount(const char *specialfile, const char *dir, const char *filesystemtype, unsigned long rwflag, const void *data);
    int umount(const char *specialfile);
    int umount(const char *dir);
    int umount2(const char *target, int flags);


5.11.21.3. Detail Comparison

**Not Compatible**

Semantics differ between AIX and Linux.

Specifically, AIX users must convert their application usage of this API to that of Linux. Older applications that run on earlier versions of AIX (4.3.3 or before) will have an easier conversion as they most likely would have used an older mount subroutine instead of vmount. But that comparison is only based on the equal number of arguments. The semantics still differ.

umount functions are compatible.

5.11.21.4. Additional Data: Linux-Specific

These functions are Linux-specific and should not be used in programs intended to be portable.

mount attaches the filesystem specified by source (which is often a device name, but can also be a directory name or a dummy) to the directory specified by target.

umount and umount2 remove the attachment of the (topmost) filesystem mounted on target.

Only the super user may mount and unmount filesystems. Since Linux 2.4, a single filesystem can be visible at multiple mount points, and multiple mounts can be stacked on the same mount point.

Values for the filesystemtype argument supported by the kernel are listed in /proc/filesystems (minix, ext2, msdos, proc, nfs, iso9660, and so on). More types may become available when the appropriate modules are loaded.

The mountflags argument may have the magic number 0xC0ED (MS_MGC_VAL) in the top 16 bits (this was required in kernel versions prior to 2.4 but is no longer required and is ignored if specified), and various mount flags (as defined in linux/fs.h for libc4 and libc5 and in sys/mount.h for glibc2) in the low-order 16 bits:

  • MS_BIND

    (Linux 2.4 onward) Perform a bind mount, making a file or a directory subtree visible at another point within a file system. Bind mounts may cross file system boundaries and span chroot jails. The filesystemtype, mountflags, and data arguments are ignored.

  • MS_MANDLOCK

    Permit mandatory locking on files in this filesystem. (Mandatory locking must still be enabled on a per-file basis, as described in fcntl.)

  • MS_MOVE

    Move a subtree. source specifies an existing mount point, and target specifies the new location. The move is atomic: At no point is the subtree unmounted. The filesystemtype, mountflags, and data arguments are ignored.

  • MS_NOATIME

    Do not update access times for (all types of) files on this filesystem.

  • MS_NODEV

    Do not allow access to devices (special files) on this filesystem.

  • MS_NODIRATIME

    Do not update access times for directories on this filesystem.

  • MS_NOEXEC

    Do not allow programs to be executed from this filesystem.

  • MS_NOSUID

    Do not honor set-UID and set-GID bits when executing programs from this filesystem.

  • MS_RDONLY

    Mount filesystem read-only.

  • MS_REMOUNT

    Remount an existing mount. This allows you to change the mountflags and data of an existing mount without having to unmount and remount the file system. source and target should be the same values specified in the initial mount() call; filesystemtype is ignored.

  • MS_SYNCHRONOUS

    Make writes on this filesystem synchronous (as though the O_SYNC flag to open were specified for all file opens to this filesystem).

From Linux 2.4 onward, the MS_NODEV, MS_NOEXEC, and MS_NOSUID flags are settable on a per-mount point basis.

The data argument is interpreted by the different file systems. Typically, it is a string of comma-separated options understood by this filesystem. See mount for details of the options available for each filesystem type.

Linux 2.1.116 added the umount2() system call, which, like umount(), unmounts a target, but allows additional flags controlling the behavior of the operation:

  • MNT_FORCE. Force unmount even if busy (since 2.1.116; only for NFS mounts).

  • MNT_DETACH. Perform a lazy unmount: make the mount point unavailable for new accesses, and actually perform the unmount when the mount point ceases to be busy (since 2.4.11).

5.11.21.5. Return Value

On success, 0 is returned. On error, -1 is returned, and ERRNO is set appropriately.

5.11.21.6. Errors

The error values given next result from filesystem-type-independent errors. Each filesystem type may have its own special errors and its own special behavior. See the kernel source code for details.

  • EPERM. The user is not the super user.

  • ENODEV. Filesystemtype is not configured in the kernel.

  • ENOTBLK. Source is not a block device (and a device was required).

  • EBUSY. Source is already mounted. Or, it cannot be remounted read-only, because it still holds files open for writing. Or, it cannot be mounted on target because target is still busy (it is the working directory of some task, the mount point of another device, has open files, and so on). Or, it could not be unmounted because it is busy.

  • EINVAL. Source had an invalid superblock. Or, a remount was attempted, whereas source was not already mounted on target. Or, a move was attempted, whereas source was not a mount point, or was /. Or, an umount was attempted, whereas target was not a mount point.

  • ENOTDIR. The second argument, or a prefix of the first argument, is not a directory.

  • EFAULT. One of the pointer arguments points outside the user address space.

  • ENOMEM. The kernel could not allocate a free page to copy filenames or data into.

  • ENAMETOOLONG. A path name was longer than MAXPATHLEN.

  • ENOENT. A path name was empty or had a nonexistent component.

  • ELOOP. Too many links encountered during pathname resolution. Or, a move was attempted, whereas target is a descendant of source.

  • EACCES. A component of a path was not searchable.

    Or, mounting a read-only filesystem was attempted without giving the MS_RDONLY flag.

    Or, the block device Source is located on a filesystem mounted with the MS_NODEV option.

  • ENXIO. The major number of the block device source is out of range.

  • EMFILE. (In case no block device is required.) Table of dummy devices is full.

Portability

These functions are Linux-specific and should not be used in programs intended to be portable.

5.11.22. readv(), writev()

These functions read data into, and write data from, multiple buffers.

readv reads data from file descriptor filedes and puts the result in the buffers described by vector. The number of buffers is specified by count. The buffers are filled in the order specified. It operates just like read except that data is put in vector rather than a contiguous buffer.

writev writes data to file descriptor filedes from the buffers described by vector. The number of buffers is specified by count. The buffers are used in the order specified. It operates just like write except that data is taken from vector rather than a contiguous buffer.

5.11.22.1. AIX Prototype

    #include <sys/uio.h>
    ssize_t readv (int FileDescriptor, const struct iovec *iov, int iovCount);
    ssize_t writev (int FileDescriptor, const struct iovec *iov, int iovCount);


5.11.22.2. Linux Prototype

    #include <sys/uio.h>
    int readv(int filedes, const struct iovec *vector, int count);
    int writev(int filedes, const struct iovec *vector, int count);


5.11.22.3. Detail Comparison

**Not Compatible**

Functions return different types. On some systems this may be as simple as casting the return value to an int.

Other than return values, semantics are the same.
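One way to sidestep the return-type difference is to store the result in an ssize_t on both systems, since that type can hold either declared return type. The sketch below gather-writes two strings into a pipe and scatter-reads them back; pipe_roundtrip is a hypothetical helper, and the strings are placeholders.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: gather-write two strings into a pipe, then
 * scatter-read them back. head must hold at least 7 bytes and tail at
 * least 5. Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(char *head, char *tail)
{
    char part1[] = "hello, ";
    char part2[] = "world";
    struct iovec out[2], in[2];
    ssize_t nwritten, nread = -1;   /* ssize_t holds either return type */
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    out[0].iov_base = part1;  out[0].iov_len = strlen(part1);
    out[1].iov_base = part2;  out[1].iov_len = strlen(part2);
    in[0].iov_base = head;    in[0].iov_len = strlen(part1);
    in[1].iov_base = tail;    in[1].iov_len = strlen(part2);

    /* Buffers are consumed in array order, as with a single write(). */
    nwritten = writev(fds[1], out, 2);
    if (nwritten == (ssize_t)(out[0].iov_len + out[1].iov_len))
        nread = readv(fds[0], in, 2);   /* buffers are filled in array order */

    close(fds[0]);
    close(fds[1]);
    return nread;
}
```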

5.11.22.4. Return Value

On success, readv returns the number of bytes read, and writev returns the number of bytes written. On error, -1 is returned, and ERRNO is set appropriately.

5.11.22.5. Errors
  • EINVAL. An invalid argument was given. For instance, count might be greater than MAX_IOVEC, or 0. filedes could also be attached to an object that is unsuitable for reading (for readv) or writing (for writev).

  • EFAULT. "Segmentation fault." Most likely vector or some of the iov_base pointers point to memory that is not properly allocated.

  • EBADF. The file descriptor filedes is not valid.

  • EINTR. The call was interrupted by a signal before any data was read/written.

  • EAGAIN. Nonblocking I/O has been selected using O_NONBLOCK, and no data was immediately available for reading. (Or the file descriptor filedes is for an object that is locked.)

  • EISDIR. filedes refers to a directory.

  • EOPNOTSUPP. filedes refers to a socket or device that does not support reading/writing.

  • ENOMEM. Insufficient kernel memory was available.

Other errors may occur, depending on the object connected to filedes.

5.11.22.6. ERRNO(s) Not Implemented in Linux
  • EBADMSG. The file is a STREAM file that is set to control-normal mode, and the message waiting to be read includes a control part.

  • EDEADLK. A deadlock would occur if the calling process were to sleep until the region to be read was unlocked.

  • EOVERFLOW. An attempt was made to read from a regular file where NBytes was greater than 0 and the starting offset was before the end of file and was greater than or equal to the offset maximum established in the open file description associated with the file descriptor.

  • ENXIO. A request was made of a nonexistent device or the request was outside the capabilities of the device.

  • ESPIPE. filedes is associated with a pipe or FIFO.

  • ETIMEDOUT. The connection timed out.

Conforming To

4.4BSD (the readv and writev functions first appeared in BSD 4.2), Unix98. Linux libc5 used size_t as the type of the count parameter.

5.11.23. select()

select enables synchronous I/O multiplexing on a file descriptor. It checks the specified file descriptors and message queues (fd_set) to see whether they are ready for reading (receiving) or writing (sending).

5.11.23.1. AIX Prototype

    #include <sys/time.h>
    #include <sys/select.h>
    #include <sys/types.h>
    int select (int Nfdsmsgs, struct sellist *ReadList, struct sellist *WriteList, struct sellist *ExceptList, struct timeval *TimeOut);


5.11.23.2. Linux Prototype

    #include <sys/time.h>
    #include <sys/types.h>
    #include <unistd.h>
    int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
    int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *timeout, const sigset_t *sigmask);


5.11.23.3. select(), pselect()

**Not Compatible**

Functions are source-compatible, but semantics differ.

In AIX, select is used for I/O multiplexing as well as for exception and event handling. This is not so in Linux. For event and exception notification in Linux, you must use both the select and sigprocmask system calls. Following is a brief example of how to handle the equivalent of AIX exception and event notification in Linux:

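    /* Install sigmask for the duration of the wait and save the old mask. */
    sigprocmask (SIG_SETMASK, &sigmask, &orig_sigmask);
    r = select (n, &rd, &wr, &er, 0);
    /* Restore the original signal mask. */
    sigprocmask (SIG_SETMASK, &orig_sigmask, 0);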


In AIX, users call select with all three sets empty, n zero, and a non-null timeout as a fairly portable way to sleep with subsecond precision.

On Linux, the function select modifies timeout to reflect the amount of time not slept, which is different from most other implementations. This causes problems both when porting to Linux from other platforms and vice versa because the timeout is unusable after the call to select in Linux. Therefore, in Linux, consider timeout to be undefined after select returns.
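Because the timeout (and the descriptor sets) must be treated as undefined after select returns, one defensive pattern is to rebuild both before every call. The following sketch assumes a single descriptor; wait_readable_select is a hypothetical helper name.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_select(int fd, long timeout_ms)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv;
        int r;

        /* Rebuild the set and the timeout before every call: Linux may
         * have modified them during the previous iteration. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        r = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (r == -1 && errno == EINTR)
            continue;   /* interrupted: retry, restarting the full timeout */
        if (r == -1)
            return -1;
        return r > 0 ? 1 : 0;
    }
}
```

Restarting with the full timeout after EINTR keeps the code identical on both systems, at the cost of a total wait that may exceed timeout_ms when signals arrive.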

The file sys/time.h contains compatibility defines for fd_set and the prototype for select(). The FD_* macros are also defined there. The sizes of the fd_set types differ, but code is source-compatible. In AIX, select is a system call.

In Linux, an fd_set is 128 bytes. In AIX, it is 4096 bytes. If your set is larger than 128 bytes in AIX, this will have to be shortened.

5.11.23.4. Return Value

On success, select and pselect return the number of descriptors contained in the descriptor sets, which may be zero if the timeout expires before anything interesting happens. On error, -1 is returned, and ERRNO is set appropriately; the sets and timeout become undefined, so do not rely on their contents after an error.

5.11.23.5. Errors
  • EBADF. An invalid file descriptor was given in one of the sets.

  • EINTR. A nonblocked signal was caught.

  • EINVAL. n is negative.

  • ENOMEM. select was unable to allocate memory for internal tables.

5.11.23.6. Example

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        fd_set rfds;
        struct timeval tv;
        int retval;

        /* Watch stdin (fd 0) to see when it has input. */
        FD_ZERO(&rfds);
        FD_SET(0, &rfds);

        /* Wait up to five seconds. */
        tv.tv_sec = 5;
        tv.tv_usec = 0;

        retval = select(1, &rfds, NULL, NULL, &tv);
        /* Don't rely on the value of tv now! */

        if (retval == -1)
            perror("select()");   /* error: rfds and tv are undefined */
        else if (retval)
            printf("Data is available now.\n");
            /* FD_ISSET(0, &rfds) will be true. */
        else
            printf("No data within five seconds.\n");

        return 0;
    }


5.11.23.7. ERRNO(s) Not Implemented in Linux
  • EAGAIN. Allocation of internal data structures was unsuccessful.

  • EFAULT. The ReadList, WriteList, ExceptList, or TimeOut parameter points to a location outside of the address space of the process.

  • ETIMEDOUT. The connection timed out.

Conforming To

4.4BSD (the select function first appeared in 4.2BSD). Generally portable to/from non-BSD systems supporting clones of the BSD socket layer (including System V variants). However, note that the System V variant typically sets the timeout variable before exit, but the BSD variant does not.

The pselect function is defined in IEEE Std 1003.1g-2000 (POSIX.1g) and part of POSIX 1003.1-2001. It is found in glibc 2.1 and later. glibc 2.0 has a function with this name, but it does not take a sigmask parameter.

5.11.24. reboot()

Use reboot to restart or halt a system, or to enable or disable the Ctrl-Alt-Del (CAD) keystroke.

5.11.24.1. AIX Prototype

    #include <sys/reboot.h>
    void reboot (int HowTo, void *Argument);

    #define _LINUX_SOURCE_COMPAT
    #include <sys/reboot.h>
    int reboot (int flag);


5.11.24.2. Linux Prototype

Under glibc, some of the constants involved have gotten symbolic names RB_*, and the library call is a one-argument wrapper around the three-argument system call:

    #include <unistd.h>
    #include <sys/reboot.h>
    int reboot (int flag);


For libc4 and libc5, the library call and the system call are identical, and since kernel version 2.1.30 there are symbolic names LINUX_REBOOT_* for the constants and a fourth argument to the call:

    #include <unistd.h>
    #include <linux/reboot.h>
    int reboot(int magic, int magic2, int flag, void *arg);


5.11.24.3. Detail Comparison

**Not Compatible**

In AIX, reboot is a system call.

This command is a platform-specific command and is generally not portable for obvious reasons. However, IBM has made this command portable from Linux to AIX by providing the following flags when _LINUX_SOURCE_COMPAT is defined (the list also shows the Linux to AIX mapping if enabled [only on AIX5L]):

    LINUX_REBOOT_CMD_RESTART    ->  RB_SOFTIPL
    LINUX_REBOOT_CMD_HALT       ->  RB_HALT_POWERED
    LINUX_REBOOT_CMD_POWER_OFF  ->  RB_HALT
    LINUX_REBOOT_CMD_RESTART2   ->  RB_POWIPL
    LINUX_REBOOT_CMD_CAD_ON     ->  return(ENOSYS)
    LINUX_REBOOT_CMD_CAD_OFF    ->  return(0)


AIX did not implement CAD (Ctrl-Alt-Del) for Linux compatibility.

5.11.24.4. Return Value

On success, the system reboots. On error, -1 is returned, and ERRNO is set appropriately.

5.11.24.5. Errors
  • EINVAL. Bad magic numbers or flag.

  • EPERM. A nonroot user attempts to call reboot.

5.11.24.6. ERRNO(s) Not Implemented in Linux
  • ENOSYS. Function not supported (LINUX_REBOOT_CMD_CAD_ON only)

5.11.25. chroot()

chroot() is a UNIX system call that is often used to provide an additional layer of security when untrusted programs are run. It is normally used to ensure that code run after the call can access only files at or below a given directory. Every process has a root directory, which is the starting point for path names that begin with /. Generally this is the root of the file hierarchy, but the chroot() system call can change it. When chroot() is called successfully, the calling process has its idea of the root directory changed to the directory given as the argument to chroot(). The root directory is inherited by all children of the current process.

Only the super user may change the root directory.

5.11.25.1. AIX/Linux Prototype

    #include <unistd.h>
    int chroot(const char *path);


5.11.25.2. Detail Comparison

**Compatible**

In AIX, this is a system call.
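Because the prototype is the same on both systems, the same confinement code can be used on each. The sketch below pairs chroot() with chdir(): on Linux, chroot() does not change the working directory, so a process that skips the chdir() can still reach files outside the new root through relative paths. enter_jail is a hypothetical helper name, and the sequence is one common idiom rather than the only correct one.

```c
#include <unistd.h>

/* Hypothetical helper: confine the calling process to `dir`.
 * Returns 0 on success or -1 with errno set. Requires super-user
 * privileges. */
int enter_jail(const char *dir)
{
    if (chdir(dir) == -1)    /* move inside the new root first */
        return -1;
    if (chroot(".") == -1)   /* make the current directory the root */
        return -1;
    return chdir("/");       /* normalize the working directory */
}
```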

5.11.25.3. Return Value

On success, 0 is returned. On error, -1 is returned, and ERRNO is set appropriately.

5.11.25.4. Errors

Depending on the filesystem, other errors can be returned. The more general errors are listed here:

  • EPERM. The effective UID is not 0.

  • EFAULT. path points outside your accessible address space.

  • ENAMETOOLONG. path is too long.

  • ENOENT. The file does not exist.

  • ENOMEM. Insufficient kernel memory was available.

  • ENOTDIR. A component of path is not a directory.

  • EACCES. Search permission is denied on a component of the path prefix.

  • ELOOP. Too many symbolic links were encountered in resolving path.

  • EIO. An I/O error occurred.

5.11.25.5. ERRNO(s) Not Implemented in Linux
  • ESTALE. The root or current directory of the process is located in a virtual file system that has been unmounted.

  • ETIMEDOUT. The connection timed out.

Conforming To

SVr4, SVID, 4.4BSD, X/OPEN. This function is not part of POSIX.1. SVr4 documents additional EINTR, ENOLINK, and EMULTIHOP error conditions. X/OPEN does not document EIO, ENOMEM, or EFAULT error conditions. This interface is marked as legacy by X/OPEN.

5.11.26. fstatfs(), statfs()

statfs returns information about a mounted filesystem. path is the path name of any file within the mounted filesystem. buf is a pointer to a statfs structure. Refer to the "Detail Comparison" section for the definition of the statfs structure.

fstatfs returns the same information about an open file referenced by descriptor fd.

5.11.26.1. AIX/Linux Prototype

    #include <sys/vfs.h>
    int statfs(const char *path, struct statfs *buf);
    int fstatfs(int fd, struct statfs *buf);


5.11.26.2. Detail Comparison

Functions are compatible. In AIX, the sys/vfs.h file includes sys/statfs.h. Although compatible, the definitions of struct statfs are not identical. Refer to the following struct to identify differences.

Linux

    struct statfs {
        long f_type;     /* type of filesystem (see below) */
        long f_bsize;    /* optimal transfer block size */
        long f_blocks;   /* total data blocks in file system */
        long f_bfree;    /* free blocks in fs */
        long f_bavail;   /* free blocks avail to non-superuser */
        long f_files;    /* total file nodes in file system */
        long f_ffree;    /* free file nodes in fs */
        fsid_t f_fsid;   /* file system id */
        long f_namelen;  /* maximum length of filenames */
        long f_spare[6]; /* spare for later */
    };

File system types:

    linux/affs_fs.h:
        AFFS_SUPER_MAGIC      0xADFF
    linux/efs_fs.h:
        EFS_SUPER_MAGIC       0x00414A53
    linux/ext_fs.h:
        EXT_SUPER_MAGIC       0x137D
    linux/ext2_fs.h:
        EXT2_OLD_SUPER_MAGIC  0xEF51
        EXT2_SUPER_MAGIC      0xEF53
    linux/hpfs_fs.h:
        HPFS_SUPER_MAGIC      0xF995E849
    linux/iso_fs.h:
        ISOFS_SUPER_MAGIC     0x9660
    linux/minix_fs.h:
        MINIX_SUPER_MAGIC     0x137F  /* orig. minix */
        MINIX_SUPER_MAGIC2    0x138F  /* 30 char minix */
        MINIX2_SUPER_MAGIC    0x2468  /* minix V2 */
        MINIX2_SUPER_MAGIC2   0x2478  /* minix V2, 30 char names */
    linux/msdos_fs.h:
        MSDOS_SUPER_MAGIC     0x4d44
    linux/ncp_fs.h:
        NCP_SUPER_MAGIC       0x564c
    linux/nfs_fs.h:
        NFS_SUPER_MAGIC       0x6969
    linux/proc_fs.h:
        PROC_SUPER_MAGIC      0x9fa0
    linux/smb_fs.h:
        SMB_SUPER_MAGIC       0x517B
    linux/sysv_fs.h:
        XENIX_SUPER_MAGIC     0x012FF7B4
        SYSV4_SUPER_MAGIC     0x012FF7B5
        SYSV2_SUPER_MAGIC     0x012FF7B6
        COH_SUPER_MAGIC       0x012FF7B7
    linux/ufs_fs.h:
        UFS_MAGIC             0x00011954
    linux/xfs_fs.h:
        XFS_SUPER_MAGIC       0x58465342
    linux/xia_fs.h:
        _XIAFS_SUPER_MAGIC    0x012FD16D


AIX

    struct statfs {
        int f_version;       /* version/type of statfs, 0 for now */
        int f_type;          /* type of info, zero for now */
        ulong_t f_bsize;     /* optimal file system block size */
        fsblkcnt_t f_blocks; /* total data blocks in file system */
        fsblkcnt_t f_bfree;  /* free block in fs */
        fsblkcnt_t f_bavail; /* free blocks avail to non-superuser */
        fsfilcnt_t f_files;  /* total file nodes in file system */
        fsfilcnt_t f_ffree;  /* free file nodes in fs */
    #if !defined(_KERNEL) && defined(__64BIT__)
        fsid64_t f_fsid;     /* file system id */
    #else
        fsid_t f_fsid;       /* file system id */
    #endif
        int f_vfstype;       /* what type of vfs this is */
        ulong_t f_fsize;     /* fundamental file system block size */
        int f_vfsnumber;     /* vfs identifier number */
        int f_vfsoff;        /* reserved, for vfs specific data offset */
        int f_vfslen;        /* reserved, for len of vfs specific data */
        int f_vfsvers;       /* reserved, for vers of vfs specific data */
        char f_fname[32];    /* file system name (usually mount pt.) */
        char f_fpack[32];    /* file system pack name */
        int f_name_max;      /* maximum component name length for posix */
    };


Fields that are undefined for a particular filesystem are set to 0.
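Code that touches only the fields both structures share (f_bsize, f_blocks, f_bfree, f_bavail, f_files, f_ffree) should compile on either system, provided it does not assume a particular field width. The sketch below widens the values before multiplying; available_bytes is a hypothetical helper, and treating the block counts as multiples of f_bsize follows the Linux interpretation, which should be verified on AIX, where f_fsize is also available.

```c
#include <sys/vfs.h>

/* Hypothetical helper: bytes available to a non-super-user on the
 * filesystem containing `path`. Returns 0 on success, -1 on error. */
int available_bytes(const char *path, unsigned long long *out)
{
    struct statfs sb;

    if (statfs(path, &sb) == -1)
        return -1;

    /* Widen before multiplying: the field types differ between AIX and
     * Linux, and the product can overflow a 32-bit long. */
    *out = (unsigned long long)sb.f_bavail * (unsigned long long)sb.f_bsize;
    return 0;
}
```

Fields such as f_namelen (Linux) and f_name_max (AIX) have no common name and need conditional compilation instead.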

5.11.26.3. Return Value

On success, 0 is returned. On error, -1 is returned, and ERRNO is set appropriately.

5.11.26.4. Errors

For statfs:

  • ENOTDIR. A component of the path prefix of path is not a directory.

  • ENAMETOOLONG. path is too long.

  • ENOENT. The file referred to by path does not exist.

  • EACCES. Search permission is denied for a component of the path prefix of path.

  • ELOOP. Too many symbolic links were encountered in translating path.

  • EFAULT. buf or path points to an invalid address.

  • EIO. An I/O error occurred while reading from or writing to the filesystem.

  • ENOMEM. Insufficient kernel memory was available.

  • ENOSYS. The filesystem path does not support statfs.

For fstatfs:

  • EBADF. fd is not a valid open file descriptor.

  • EFAULT. buf points to an invalid address.

  • EIO. An I/O error occurred while reading from or writing to the filesystem.

  • ENOSYS. The filesystem fd is open but does not support statfs.

5.11.26.5. ERRNO(s) Not Implemented in Linux
  • ESTALE. The root or current directory of the process is located in a virtual filesystem that has been unmounted.

  • ETIMEDOUT. The connection timed out.

  • EIO. An input/output (I/O) error occurred while reading from the filesystem.

Conforming To

The Linux statfs was inspired by the 4.4BSD statfs (but they do not use the same structure).

5.11.27. poll()

poll() (along with select()) indicates when an operation can be performed on an open file descriptor without blocking. For instance, a programmer can use these calls to know when there is data to be read on a socket. By delegating responsibility to select() and poll(), you do not have to constantly check whether there is data to be read. Instead, the operating system can put the process that calls select() or poll() to sleep and wake it when the event occurs or a specified timeout has elapsed. This can significantly increase the execution efficiency of a program.

5.11.27.1. AIX Prototype

    #include <sys/poll.h>
    int poll(void *ListPointer, unsigned long Nfdsmsgs, long Timeout);


5.11.27.2. Linux Prototype

    #include <sys/poll.h>
    int poll(struct pollfd *ufds, unsigned int nfds, int timeout);


5.11.27.3. Detail Comparison

**Not Compatible**

poll: wait for some event on a file descriptor.

Functions are source-compatible, but the pollfd structs differ.

For better performance and scalability with large numbers of descriptors, Linux introduced epoll in the 2.5 development series; it is available in Linux 2.6 and later.

AIX

    struct pollist {
        struct pollfd fds[3];
        struct pollmsg msgs[2];
    } list;


Linux

poll is a variation on the theme of select. It specifies an array of nfds structures of the following type (and a timeout in milliseconds):

    struct pollfd {
        int fd;        /* file descriptor */
        short events;  /* requested events */
        short revents; /* returned events */
    };


A negative timeout value means an infinite timeout. The field fd contains a file descriptor for an open file. The field events is an input parameter, a bitmask specifying the events the application is interested in. The field revents is an output parameter, filled by the kernel with the events that actually occurred, either of the type requested, or of one of the types POLLERR or POLLHUP or POLLNVAL. (These 3 bits are meaningless in the events field and are set in the revents field whenever the corresponding condition is true.) If none of the events requested (and no error) has occurred for any of the file descriptors, the kernel waits for timeout milliseconds for one of these events to occur. The following possible bits in these masks are defined in <sys/poll.h>:

    #define POLLIN   0x0001 /* There is data to read */
    #define POLLPRI  0x0002 /* There is urgent data to read */
    #define POLLOUT  0x0004 /* Writing now will not block */
    #define POLLERR  0x0008 /* Error condition */
    #define POLLHUP  0x0010 /* Hung up */
    #define POLLNVAL 0x0020 /* Invalid request: fd not open */


In asm/poll.h, too, the values POLLRDNORM, POLLRDBAND, POLLWRNORM, POLLWRBAND, and POLLMSG are defined.
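The events/revents contract described above can be sketched for a single descriptor as follows. wait_readable_poll is a hypothetical helper name, and the code uses the Linux form of the call with a struct pollfd array; an AIX port that also polls message queues would pass a pollist instead.

```c
#include <errno.h>
#include <sys/poll.h>

/* Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int r;

    pfd.fd = fd;
    pfd.events = POLLIN;    /* input: the event we are interested in */
    pfd.revents = 0;        /* output: filled in by the kernel */

    do {
        r = poll(&pfd, 1, timeout_ms);
    } while (r == -1 && errno == EINTR);

    if (r <= 0)
        return r;           /* 0 = timed out, -1 = error */

    /* revents may hold only POLLERR, POLLHUP, or POLLNVAL, which are
     * reported even though they were not requested. */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```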

5.11.27.4. Return Value

On success, a positive number is returned, where the number returned is the number of structures that have nonzero revents fields (in other words, those descriptors with events or errors reported). A value of 0 indicates that the call timed out and no file descriptors have been selected. On error, -1 is returned, and ERRNO is set appropriately.

5.11.27.5. Errors
  • EBADF. An invalid file descriptor was given in one of the sets.

  • ENOMEM. There was no space to allocate file descriptor tables.

  • EFAULT. The array given as argument was not contained in the calling program's address space.

  • EINTR. A signal occurred before any requested event.

5.11.27.6. ERRNO(s) Not Implemented in Linux
  • EAGAIN. Allocation of internal data structures was unsuccessful.

  • EINVAL. The number of pollfd structures specified by the Nfdsmsgs parameter is greater than the maximum number of open files, OPEN_MAX. This error is also returned if the number of pollmsg structures specified by the Nfdsmsgs parameter is greater than the maximum number of allowable message queues.

Conforming To

XPG4-UNIX

5.11.28. quotactl()

The quotactl() call manipulates disk quotas. cmd indicates a command to be applied to UID id or GID id. To set the type of quota, use the QCMD(cmd, type) macro. special is a pointer to a null-terminated string containing the path name of the block special device for the filesystem being manipulated. addr is the address of an optional, command-specific data structure that is copied in or out of the system. The interpretation of addr is given with each of the following commands.

5.11.28.1. AIX Prototype

    #include <jfs/quota.h>
    int quotactl (char *Path, int Cmd, int ID, char *Addr);


5.11.28.2. Linux Prototype

    #include <sys/types.h>
    #include <sys/quota.h>
    int quotactl(int cmd, const char *special, qid_t id, caddr_t addr);


5.11.28.3. Detail Comparison

**Not Compatible**

The header is in a different location for AIX and Linux. The arguments are in a different order. The Linux cmd values include all the AIX commands as a subset, plus Linux-specific ones. The remaining documentation is from the Linux man page.

  • Q_QUOTAON

    Turn on quotas for a filesystem. addr points to the path name of the file containing the quotas for the filesystem. The quota file must exist; it is normally created with the quotacheck program. This call is restricted to the super user.

  • Q_QUOTAOFF

    Turn off quotas for a filesystem. addr and id are ignored. This call is restricted to the super user.

  • Q_GETQUOTA

    Get disk quota limits and current usage for user or group id. addr is a pointer to a mem_dqblk structure (defined in linux/quota.h). Only the super user may get the quotas of a user other than himself.

  • Q_SETQUOTA

    Set disk quota limits and current usage for user or group id. addr is a pointer to a mem_dqblk structure (defined in linux/quota.h). This call is restricted to the super user.

  • Q_SETQLIM

    Set disk quota limits for user or group id. addr is a pointer to a mem_dqblk structure (defined in linux/quota.h). This call is restricted to the super user.

  • Q_SETUSE

    Set current usage for user or group id. addr is a pointer to a mem_dqblk structure (defined in linux/quota.h). This call is restricted to the super user.

  • Q_SYNC

    Update the on-disk copy of quota usages for a filesystem. If special is null, all filesystems with active quotas are synced. addr and id are ignored.

  • Q_GETSTATS

    Get statistics and other generic information about the quota subsystem. addr should be a pointer to the dqstats structure (defined in linux/quota.h) in which data should be stored. special and id are ignored.

The new quota format also allows following additional calls:

  • Q_GETINFO

    Get information (such as grace times) about quotafile. addr should be a pointer to the mem_dqinfo structure (defined in linux/quota.h). id is ignored.

  • Q_SETINFO

    Set information about quotafile. addr should be a pointer to the mem_dqinfo structure (defined in linux/quota.h). id is ignored. This operation is restricted to super user.

  • Q_SETGRACE

    Set grace times in information about quotafile. addr should be a pointer to the mem_dqinfo structure (defined in linux/quota.h). id is ignored. This operation is restricted to super user.

  • Q_SETFLAGS

    Set flags in information about quotafile. These flags are defined in linux/quota.h. Note that there are currently no defined flags. addr should be a pointer to mem_dqinfo structure (defined in linux/quota.h). id is ignored. This operation is restricted to super user.

For XFS filesystems making use of the XFS Quota Manager (XQM), the preceding commands are bypassed and the following commands are used:

  • Q_XQUOTAON

    Turn on quotas for an XFS filesystem. XFS provides the ability to turn on/off quota limit enforcement with quota accounting. Therefore, XFS expects the addr to be a pointer to an unsigned int that contains either the flags XFS_QUOTA_UDQ_ACCT and/or XFS_QUOTA_UDQ_ENFD (for user quota) or XFS_QUOTA_GDQ_ACCT and/or XFS_QUOTA_GDQ_ENFD (for group quota), as defined in linux/xqm.h. This call is restricted to super user.

  • Q_XQUOTAOFF

    Turn off quotas for an XFS filesystem. As in Q_QUOTAON, XFS filesystems expect a pointer to an unsigned int that specifies whether quota accounting and/or limit enforcement needs to be turned off. This call is restricted to the super user.

  • Q_XGETQUOTA

    Get disk quota limits and current usage for user id. addr is a pointer to an fs_disk_quota structure (defined in linux/xqm.h). Only the super user may get the quotas of a user other than himself.

  • Q_XSETQLIM

    Set disk quota limits for user id. addr is a pointer to an fs_disk_quota structure (defined in linux/xqm.h). This call is restricted to super user.

  • Q_XGETQSTAT

    Return an fs_quota_stat structure containing XFS filesystem-specific quota information. This is useful in finding out how much space is spent to store quota information and to get quota on/off status of a given local XFS filesystem.

  • Q_XQUOTARM

    Free the disk space taken by disk quotas. Quotas must have already been turned off.

There is no command equivalent to Q_SYNC for XFS because sync writes quota information to disk (in addition to the other filesystem metadata it writes out).

5.11.28.4. Return Values

quotactl() returns 0 on success and -1 on failure and sets ERRNO to indicate the error.

5.11.28.5. Errors
  • EFAULT. addr or special is invalid.

  • EINVAL. The kernel has not been compiled with the QUOTA option. cmd is invalid.

  • ENOENT. The file specified by special or addr does not exist.

  • ENOTBLK. special is not a block device.

  • EPERM. The call is privileged and the caller was not the super user.

  • ESRCH. No disc quota is found for the indicated user. Quotas have not been turned on for this filesystem.

  • EUSERS. The quota table is full.

If cmd is Q_QUOTAON, quotactl() may set ERRNO to the following:

  • EACCES

    The quota file pointed to by addr exists but is not a regular file. The quota file pointed to by addr exists but is not on the filesystem pointed to by special.

  • EBUSY

    Q_QUOTAON attempted even though another Q_QUOTAON has already taken place.




UNIX to Linux Porting: A Comprehensive Reference
ISBN: 0131871099
Year: 2004
Pages: 175
