
Buffered sends could also be used to avoid the problem illustrated in Program 8.3, but the user would then be responsible for allocating buffer space. Buffering also requires additional copying, which can impact performance.
Ready mode sends: Ready communication procedures imply a promise by the caller to the MPI library that the receive has been posted before the send is initiated. Under this assumption, the library is free to perform optimizations that would be difficult or impossible otherwise.
Synchronous mode sends: A synchronous send does not return until the corresponding receive has been posted. Blocking synchronous sends and receives can be used to implement a two-process barrier. They may also be used as a debugging aid. In a correct program, all standard mode sends can be replaced by synchronous mode sends without altering the behavior. On the other hand, if the standard mode sends in Program 8.3 were replaced by synchronous sends, the program would immediately deadlock regardless of buffer sizes or other considerations. This substitution can be used to quickly verify the absence of such incorrect constructs.
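The difference between the send modes can be illustrated with a small simulation. What follows is a minimal sketch in plain Python, not MPI code: the function `run`, the `("send", peer)` / `("recv", peer)` step notation, and the `buffer_slots` parameter are all hypothetical names introduced here. It also assumes that the problematic construct is the familiar one in which two processes each send before receiving; Program 8.3 itself is not reproduced in this section. Setting `buffer_slots` to 0 models synchronous sends, while a positive value models a system that can buffer that many messages per channel.

```python
def run(programs, buffer_slots):
    """Simulate blocking sends and receives between processes.

    programs: one list per rank of ("send", peer) or ("recv", peer) tuples.
    buffer_slots: messages the system may hold per (src, dst) channel;
                  0 models synchronous (rendezvous) sends.
    Returns "completed" or "deadlock".
    """
    pc = [0] * len(programs)   # index of the next step for each rank
    in_flight = {}             # (src, dst) -> number of buffered messages

    def current(rank):
        steps = programs[rank]
        return steps[pc[rank]] if pc[rank] < len(steps) else None

    while any(current(r) is not None for r in range(len(programs))):
        progressed = False
        for rank in range(len(programs)):
            step = current(rank)
            if step is None:
                continue
            op, peer = step
            buffered_out = in_flight.get((rank, peer), 0)
            if op == "recv" and in_flight.get((peer, rank), 0) > 0:
                # A buffered message is waiting: consume it.
                in_flight[(peer, rank)] -= 1
                pc[rank] += 1
                progressed = True
            elif (op == "send" and current(peer) == ("recv", rank)
                  and buffered_out == 0):
                # Rendezvous: the receive is posted, so both sides advance.
                pc[rank] += 1
                pc[peer] += 1
                progressed = True
            elif op == "send" and buffered_out < buffer_slots:
                # No receive posted yet, but buffer space is available.
                in_flight[(rank, peer)] = buffered_out + 1
                pc[rank] += 1
                progressed = True
        if not progressed:
            return "deadlock"   # every unfinished rank is blocked
    return "completed"


# Both ranks send first, then receive (assumed shape of the unsafe exchange).
unsafe = [[("send", 1), ("recv", 1)],
          [("send", 0), ("recv", 0)]]
# Rank 1 receives first, so every send finds a posted receive.
safe = [[("send", 1), ("recv", 1)],
        [("recv", 0), ("send", 0)]]

print(run(unsafe, buffer_slots=1))  # completed
print(run(unsafe, buffer_slots=0))  # deadlock
print(run(safe, buffer_slots=0))    # completed
```

In this model the unsafe exchange completes only when buffer space happens to be available, whereas the reordered exchange completes even with no buffering at all, which is the property that substituting synchronous sends is meant to expose.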
8.4.2 Virtual Topologies and Attribute Caching
In addition to defining contexts for communication based on groups of processes, communicators can also support virtual topologies, which allow applications to work in terms of logical Cartesian groups of processes, regardless of the underlying hardware network or architecture. Attribute caching allows libraries to attach arbitrary pieces of information to a communicator. This information might be used, for example, to optimize subsequent library calls.
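To make the idea of a logical Cartesian group concrete, the following is a minimal sketch in plain Python of the bookkeeping such a topology implies. It is not MPI code. The helper names `rank_to_coords`, `coords_to_rank`, and `shift` are hypothetical, and the sketch assumes the row-major rank ordering that MPI specifies for Cartesian topologies (the last coordinate varies fastest). In MPI itself the corresponding queries are provided by `MPI_Cart_coords`, `MPI_Cart_rank`, and `MPI_Cart_shift` on a communicator made with `MPI_Cart_create`.

```python
def rank_to_coords(rank, dims):
    """Convert a rank to grid coordinates, row-major (last axis fastest)."""
    coords = []
    for extent in reversed(dims):
        coords.append(rank % extent)
        rank //= extent
    return tuple(reversed(coords))


def coords_to_rank(coords, dims, periodic):
    """Convert coordinates to a rank; None if off a non-periodic edge."""
    rank = 0
    for c, extent, wraps in zip(coords, dims, periodic):
        if wraps:
            c %= extent              # wrap around a periodic dimension
        elif not 0 <= c < extent:
            return None              # no neighbor beyond this edge
        rank = rank * extent + c
    return rank


def shift(rank, dims, periodic, axis, disp):
    """Return (source, dest) ranks for a shift of `disp` along `axis`."""
    coords = list(rank_to_coords(rank, dims))

    def neighbor(delta):
        moved = coords.copy()
        moved[axis] += delta
        return coords_to_rank(moved, dims, periodic)

    return neighbor(-disp), neighbor(disp)


# A 2 x 3 process grid, periodic only in the second dimension.
dims, periodic = (2, 3), (False, True)
print(rank_to_coords(4, dims))                     # (1, 1)
print(shift(5, dims, periodic, axis=1, disp=1))    # (4, 3)
print(shift(4, dims, periodic, axis=0, disp=1))    # (1, None)
```

Where this sketch returns `None` for a missing neighbor, MPI reports `MPI_PROC_NULL`, so that sends and receives addressed to the missing neighbor become no-ops.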
8.4.3 Derived Data Types
In the examples above, we restricted ourselves to a few of MPI's primitive data types. In fact, MPI includes a powerful set of procedures for defining and manipulating complex data types. These procedures can be used to refer to non-contiguous data, e.g., a sub-block of a matrix, and to refer to data of inhomogeneous type, e.g., an integer count followed by a sequence of floating point values.
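The two cases mentioned above can be sketched in plain Python. This is an illustration of the data layouts only, not of the MPI interface: `vector_offsets` and `gather` are hypothetical helpers, the 4 x 5 matrix and the sample values are made up for the example, and the `(count, blocklength, stride)` description is borrowed from the arguments of `MPI_Type_vector`. The second half uses Python's standard `struct` module to stand in for a message holding an integer count followed by floating point values, which in MPI-1 would be described with `MPI_Type_struct`.

```python
import struct


def vector_offsets(count, blocklength, stride):
    """Offsets of `count` blocks of `blocklength` elements each, with
    consecutive blocks starting `stride` elements apart."""
    return [b * stride + i for b in range(count) for i in range(blocklength)]


def gather(buffer, start, offsets):
    """Collect the selected (non-contiguous) elements into a new list."""
    return [buffer[start + off] for off in offsets]


# A 4 x 5 matrix stored row by row in one flat list.
ROWS, COLS = 4, 5
matrix = list(range(ROWS * COLS))

# The 2 x 3 sub-block whose top-left corner is at row 1, column 2:
# two blocks of three elements, one full row (COLS elements) apart.
offsets = vector_offsets(count=2, blocklength=3, stride=COLS)
sub_block = gather(matrix, start=1 * COLS + 2, offsets=offsets)
print(sub_block)  # [7, 8, 9, 12, 13, 14]

# Inhomogeneous data: an integer count followed by that many doubles,
# packed into one byte string ("<" means little-endian, no padding).
values = [1.5, 2.5, 4.0]
message = struct.pack(f"<i{len(values)}d", len(values), *values)

# The receiver reads the count first, then the values that follow it.
(n,) = struct.unpack_from("<i", message, 0)
received = list(struct.unpack_from(f"<{n}d", message, 4))
print(n, received)  # 3 [1.5, 2.5, 4.0]
```

A derived data type lets the library work from such a description directly, so the application does not have to copy the selected elements into a contiguous staging buffer itself, as `gather` does here.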
Communication in parallel computations is frequently latency dominated. This means that the cost of sending a message is primarily in the startup overhead at both ends, and the actual transfer of data is relatively inexpensive. On Beowulf systems employing Fast Ethernet, messages below about 1.5 kbytes are generally latency dominated. Under such circumstances, it is advantageous to pack multiple

 



How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters (Scientific and Engineering Computation)
ISBN: 026269218X
Year: 1999
Pages: 134
