
We can derive the same result analytically by estimating the time to complete an iteration with N cells and P processors as:
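$$
t_{\text{iteration}} = \frac{N\,t_{\text{update}}}{P}\left(1 + \frac{2\,t_{\text{latency}}}{(N/P)\,t_{\text{update}}}\right)
$$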
The first term is just the time it would take on one processor, N t_update, divided by P; that is, it is the perfect speedup result. The last term, therefore, represents how much worse than perfect the actual implementation is. It is the ratio of the time spent by one processor in communication, 2 t_latency, to the time spent in CAupdate, (N/P) t_update. If this ratio is small, the overall speedup will be near perfect; if it is large, the overall speedup will be disappointing. The sample implementation performs about 4 million updates per second on a 200 MHz Pentium Pro processor, so t_update is about 0.25 μsec. On a Fast Ethernet network with t_latency of 200 μsec, the ratio is therefore about 1600/(N/P), and N/P would have to be larger than 1600 or so to obtain good speedups.
Of course, all these numbers are very rough estimates. Nevertheless, a quick analysis like this one can be extremely useful when designing, implementing, and debugging an MPI program. It allows one to estimate how times will scale with N, P, etc., and to determine how communication latency and bandwidth affect overall performance. One obtains a quantitative expectation of performance that can be compared against actual behavior. If the two differ, further analysis and investigation may uncover an unexpected source of overhead or an opportunity for improvement.
One must be careful about factors not accounted for by this analysis. Graphical output and user interfaces, in particular, can add significant overhead, especially if one tries to display the result of every iteration.
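The estimate above is easy to turn into a few lines of C for experimenting with different values of N and P. The following is only a rough sketch of the model described in the text, not part of the sample implementation: the function names (iteration_time, overhead_ratio, parallel_efficiency, speedup) are hypothetical, all times are assumed to be in seconds, and the model charges each iteration exactly two message latencies and nothing for bandwidth.

```c
/* Rough performance model for one iteration with n_cells cells on
 * n_procs processors.  All names here are illustrative assumptions;
 * times are in seconds. */

/* Computation time per processor plus two latency-bound messages. */
double iteration_time(double n_cells, double n_procs,
                      double t_update, double t_latency)
{
    return (n_cells / n_procs) * t_update + 2.0 * t_latency;
}

/* Ratio of communication time to computation time on one processor.
 * Small values mean near-perfect speedup. */
double overhead_ratio(double n_cells, double n_procs,
                      double t_update, double t_latency)
{
    return (2.0 * t_latency) / ((n_cells / n_procs) * t_update);
}

/* Fraction of perfect speedup actually achieved: 1 / (1 + ratio). */
double parallel_efficiency(double n_cells, double n_procs,
                           double t_update, double t_latency)
{
    return 1.0 / (1.0 + overhead_ratio(n_cells, n_procs,
                                       t_update, t_latency));
}

/* Estimated speedup relative to one processor, which takes
 * n_cells * t_update seconds per iteration. */
double speedup(double n_cells, double n_procs,
               double t_update, double t_latency)
{
    return (n_cells * t_update) /
           iteration_time(n_cells, n_procs, t_update, t_latency);
}
```

With the rough figures quoted above (t_update of 0.25e-6 and t_latency of 200e-6), a call such as parallel_efficiency(16000.0, 10.0, 0.25e-6, 200e-6) corresponds to N/P = 1600 and gives an efficiency of about one half, since communication and computation then take equal time. Raising N/P to 16000 brings the overhead ratio down to about 0.1. Like the analysis itself, these numbers are only rough expectations to compare against measured behavior.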
8.4 MPI Advanced Features
So far we have only covered the most basic procedures in the MPI library. MPI incorporates many of the best ideas that were implemented in the older research and proprietary systems that constitute its heritage. In this section we briefly review some of the more important advanced features. Programmers intending to use these features will need to consult the reference manual either online or in print form for details.
8.4.1 Blocking and Non-blocking Calls and Alternative Sending Modes
Blocking calls. Blocking communication calls, discussed in Section 8.2.1, always wait for the requested action (send or receive) to complete before returning.

 



How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters (Scientific and Engineering Computation)
ISBN: 026269218X
Year: 1999
Pages: 134
