2.3 Sometimes Hardware Fails and Software Quits

When multiple processors are cooperating to solve a problem, what happens if one or more of the processors fail? Should the program halt, or should the work be redistributed somehow? When multiple computers are involved in the solution, what happens if the communications link between two or more of the computers is temporarily interrupted? What if, instead of the link being interrupted, the traffic is so slow that the processes on each end of the connection time out? How should the software respond in these situations? If we have 50 computers cooperatively solving a problem and only two of the computers fail, should the other 48 pick up the work?

If in our electronic banking program we have $1,000 withdrawal and deposit tasks executing simultaneously and two of the tasks are deadlocked, should we shut down the server task? What do we do about the two tasks that are locked? What if the withdrawal and deposit tasks are working properly and for some reason the server task locks up? Should we terminate all the pending withdrawal and deposit tasks? What do we do about partial failures or partial executions? These considerations do not arise in single-computer sequential programs.

Sometimes the failure is the result of an administrative or security policy. For instance, if we have 1,000 routines working on some problem and several of the routines need write access to a file but do not have it, the result could be indefinite postponement, deadlock, or partial failure. What if some of the routines are blocked because they lack security access to the resources they need? Should the entire system be shut down in such cases?

How can the information or processing performed be useful if there are hardware interruptions, communications failures, and partial executions? Yet these situations represent normal processing within distributed and parallel environments.
In this book, we present several software architectures and programming techniques that can be used to manage these situations.
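
As a small illustration of one possible answer to the "should the other 48 pick up the work?" question, the sketch below splits a summation across several tasks and recomputes only the slices whose task failed, rather than halting the whole program. It is a minimal sketch under stated assumptions, not a technique taken from the text: the names sum_slice, resilient_sum, and Result are hypothetical, a failed processor is simulated by throwing an exception, and the tasks are launched with std::async from C++11, which is newer than the facilities this book relies on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical unit of work: sum the half-open slice [begin, end) of data.
// The fail flag simulates a processor that dies before finishing.
long sum_slice(const std::vector<int>& data, std::size_t begin,
               std::size_t end, bool fail)
{
    if (fail) throw std::runtime_error("simulated worker failure");
    return std::accumulate(data.begin() + static_cast<std::ptrdiff_t>(begin),
                           data.begin() + static_cast<std::ptrdiff_t>(end), 0L);
}

struct Result {
    long total;           // combined result over all slices
    std::size_t retried;  // number of slices that had to be redone
};

// Split data across `workers` tasks. A slice whose task fails is recomputed
// by the caller: one possible policy for handling partial failure.
Result resilient_sum(const std::vector<int>& data, std::size_t workers,
                     const std::vector<std::size_t>& failing_workers)
{
    assert(workers > 0 && "need at least one worker");

    struct Slice {
        std::size_t begin;
        std::size_t end;
        std::future<long> pending;
    };

    const std::size_t chunk = (data.size() + workers - 1) / workers;
    std::vector<Slice> slices;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(w * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        const bool fail = std::find(failing_workers.begin(),
                                    failing_workers.end(), w)
                          != failing_workers.end();
        slices.push_back(Slice{begin, end,
            std::async(std::launch::async, [&data, begin, end, fail] {
                return sum_slice(data, begin, end, fail);
            })});
    }

    Result result{0, 0};
    for (auto& s : slices) {
        try {
            // get() rethrows any exception raised inside the task.
            result.total += s.pending.get();
        } catch (const std::runtime_error&) {
            // Partial failure: redo only the lost slice instead of halting.
            result.total += sum_slice(data, s.begin, s.end, false);
            ++result.retried;
        }
    }
    return result;
}
```

Calling resilient_sum on a vector of 1,000 ones with 50 workers, two of which are marked as failing, still produces a total of 1000 and reports two retried slices. Recomputing on the caller is only one choice; a real system might instead hand the lost slices to surviving workers, or decide that the failure is serious enough to stop.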



Parallel and Distributed Programming Using C++
ISBN: 0131013769
Year: 2002
Pages: 133
