The MPP architecture provides much more parallelism than the SMP architecture. Moreover, unlike the other parallel architectures, the MPP architecture is scalable: the speedup it can provide is potentially unbounded. This is because the architecture has no inherent bottleneck that would limit the number of efficiently interacting processors.
Message passing is the dominant programming model for the MPP architecture. Because the MPP architecture is much further from the serial scalar architecture than the vector, superscalar, and even SMP architectures are, it is very difficult to automatically generate efficient message-passing code from serial source code written in C or Fortran 77. In fact, an optimizing C or Fortran 77 compiler for MPPs would have to solve the problem of automatically synthesizing an efficient message-passing program, using the serial source code as a specification of its functional semantics. This problem is still a research challenge. Therefore, no industrial optimizing C or Fortran 77 compiler for the MPP architecture is currently available.
The basic programming tools for MPPs are message-passing libraries and high-level parallel languages. Message-passing libraries directly implement the message-passing paradigm and allow programmers to explicitly write efficient parallel programs for MPPs. MPI is a standard message-passing interface that supports efficient and portable parallel programming for MPPs. Unlike the other popular message-passing library, PVM, MPI supports modular parallel programming and hence can be used for the development of parallel libraries.
MPI is a powerful programming tool that allows programmers to implement a wide range of parallel algorithms on MPPs in the form of highly efficient and portable message-passing applications. However, scientific programmers find the explicit message passing provided by MPI tedious and error-prone. Instead, they use data parallel programming languages, mainly HPF, to write programs for MPPs. When programming in HPF, the programmer specifies the strategy for parallelization and data partitioning at a higher level of abstraction, based on the single-threaded data parallel model with a global name space. The tedious low-level details of translating from an abstract global name space to the local memories of individual processors, and of managing explicit interprocessor communication, are left to the compiler.
Data parallel programs are easy to write and debug. At the same time, the data parallel programming model allows the programmer to express only a limited class of parallel algorithms. HPF 2.0 addresses this problem by extending the purely data parallel HPF 1.1 with some task parallel features. The resulting multiparadigm language is more complicated and not as easy to use as pure data parallel languages.
Data parallel languages such as HPF are difficult to compile. The efficiency of a data parallel program therefore depends strongly on the quality of the compiler, and it is hard to achieve top performance through data parallel programming.