6.2 The PVM Library for C++

PVM functionality is accessed from C++ through a collection of library routines provided by the PVM environment. The routines are typically divided into seven categories:

  • Process Management and Control

  • Message Packing and Sending

  • Message Unpacking and Receiving

  • Task Signaling

  • Message Buffer Management

  • Information and Utility Functions

  • Group Operations

The library routines are easy to integrate into the C++ environment, and the pvm_ prefix on each function helps to keep the namespace clear. To use the PVM library routines, your programs must include the pvm3.h header file and link with libpvm3. Programs 6.1 and 6.2 show how a simple PVM program works. The instructions for compiling and executing Program 6.1 are contained in Program Profile 6.1.

Program 6.1
 #include "pvm3.h" #include <iostream> #include <string.h> int main(int argc,char *argv[]) {    int RetCode,MessageId;    int PTid, Tid;    char Message[100];    float Result[1];    PTid = pvm_mytid();    RetCode = pvm_spawn("program6-2",NULL,0," ",1,&Tid);    if(RetCode == 1){       MessageId = 1;       strcpy(Message,"22");       pvm_initsend(PvmDataDefault);       pvm_pkstr(Message);       pvm_send(Tid,MessageId);       pvm_recv(Tid,MessageId);       pvm_upkfloat(Result,1,1);       cout << Result[0] << endl;       pvm_exit();       return(0);    }    else{           cerr << "Could not spawn task " << endl;           pvm_exit();           return(1);    } } 

Program Profile 6.1

Program Name

 program6-1.cc 

Description

Uses pvm_send to send a number to another executing PVM task (Program 6.2) and pvm_recv to receive a number back from that task.

Libraries Required

 libpvm3 

Headers Required

 <pvm3.h> <iostream> <string.h> 

Compile and Link Instructions

 c++ -o program6-1 -I $PVM_ROOT/include program6-1.cc -L $PVM_ROOT/lib/$PVM_ARCH -lpvm3

Test Environment

Solaris 8, SuSE Linux 7.1 (gcc 2.95.2), PVM 3.4.3

Execution Instructions

 ./program6-1 

Notes

pvmd must be running.


Program 6.1 calls eight commonly used PVM routines: pvm_mytid() , pvm_spawn() , pvm_initsend() , pvm_pkstr() , pvm_send() , pvm_recv() , pvm_upkfloat() , and pvm_exit() . The pvm_mytid() routine returns the task identifier of the calling process. The PVM system associates a task identifier with each process that it creates. The task identifier is used to send messages between tasks, to receive messages from other tasks, to signal tasks, to interrupt tasks, and so on. Any PVM task may communicate with any other PVM task as long as it has access to the task identifier of the task it wants to communicate with. The pvm_spawn() routine is used to start new PVM processes. Program 6.1 uses the pvm_spawn() routine to start a new process to execute Program 6.2. The task identifier for the new task is returned in the &Tid parameter of the pvm_spawn() call.

The PVM environment uses message buffers to pass data between tasks. Each task can have one or more message buffers, but only one buffer is considered the active message buffer. Prior to sending each message, the pvm_initsend() routine is called to prepare, or initialize, the active message buffer. The pvm_pkstr() routine is used to pack the string contained in the Message variable. This packing encodes the string for transport to another task in another process, possibly on another machine with a different machine architecture. The PVM environment handles the details of the architecture-to-architecture conversions. The PVM environment requires the use of a packing routine prior to sending and an unpacking routine during receiving to make the message readable by the receiver. However, there is an exception to this, which we will discuss later.

The pvm_send() and pvm_recv() routines are used to send and receive messages. The MessageId simply identifies which message the caller or sender is working with. Notice in Program 6.1 that the pvm_send() and pvm_recv() calls contain the task identifier of the task receiving the data and the task identifier of the task sending the data. The pvm_upkfloat() routine takes the message it retrieves from the active message buffer and unpacks it into an array of type float . Program 6.1 spawns a PVM task to execute Program 6.2.

Notice that Programs 6.1 and 6.2 both contain a call to the routine pvm_exit() . It is important that this function is called when the PVM processing for a task is finished. Although this routine does not kill or stop the process, it performs PVM cleanup for the task and disconnects the task from the PVM. Notice also that Programs 6.1 and 6.2 are self-contained, standalone programs, each containing its own main() function. Program Profile 6.2 has the implementation details for Program 6.2.

Program 6.2
 #include "pvm3.h" #include "stdlib.h" int main(int argc, char *argv[]) {    int MessageId, Ptid;    char Message[100];    float Num,Result;    Ptid = pvm_parent();    MessageId = 1;    pvm_recv(Ptid,MessageId);    pvm_upkstr(Message);    Num = atof(Message);    Result = Num / 7.0001;    pvm_initsend(PvmDataDefault);    pvm_pkfloat(&Result,1,1);    pvm_send(Ptid,MessageId);    pvm_exit();    return(0); } 

Program Profile 6.2

Program Name

 program6-2.cc 

Description

This program receives a number from its parent process and divides that number by 7. It sends the result to its parent process.

Libraries Required

 libpvm3 

Headers Required

 <pvm3.h> <stdlib.h> 

Compile and Link Instructions

 c++ -o program6-2 -I $PVM_ROOT/include program6-2.cc -L $PVM_ROOT/lib/$PVM_ARCH -lpvm3

Test Environment

SuSE Linux 7.1 (GNU C++ 2.95.2), Solaris 8 (Workshop 6), PVM 3.4.3

Execution Instructions

This program is spawned by Program 6.1.

Notes

pvmd must be running.


6.2.1 Compiling and Linking a C++/PVM Program

Version 3.4.x of the PVM environment packages the routines in a single library, libpvm3.a . To compile a PVM program, include the pvm3.h header file and link with libpvm3.a :

 $ c++ -o mypvm_program -I $PVM_ROOT/include mypvm_program.cc -L $PVM_ROOT/lib/$PVM_ARCH -lpvm3

The $PVM_ROOT environment variable points to the directory where PVM is installed. This command produces a binary called mypvm_program .

To execute Programs 6.1 and 6.2, you must have the PVM environment properly installed. Three basic methods can be used to execute a PVM program: as a standalone binary, using the PVM console, or using XPVM.

6.2.2 Executing a PVM Program as a Standalone

The pvmd program must be started and each host involved in the PVM must have the correctly compiled programs in the appropriate directory. The default directory for the compiled programs (binaries) is:

 $HOME/pvm3/bin/$PVM_ARCH 

where PVM_ARCH contains the name of the machine's architecture. See Table 6-2 and Items 1 and 2 from Section 6.2.3. The binaries should have the proper file permissions set to allow them to be accessed and executed. The pvmd program can be started as:

 pvmd & 

or:

 pvmd hostfile & 

where hostfile is a configuration file that contains special options to be passed to the pvmd program. See Item 5 from Section 6.2.3. After the pvmd program has been started on one of the computers involved in the PVM, a PVM program can be started simply by:

 $ MyPvmProgram

If this program spawns any other tasks, they will be started automatically.
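Putting these steps together, a standalone session might look like the following sketch. The paths assume the default binary directory given above, and the binaries are those from Program Profiles 6.1 and 6.2.

 $ mkdir -p $HOME/pvm3/bin/$PVM_ARCH
 $ cp program6-1 program6-2 $HOME/pvm3/bin/$PVM_ARCH
 $ pvmd &
 $ ./program6-1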

6.2.2.1 Starting PVM Programs Using the PVM Console

To execute the programs using the PVM console, first start the console by typing:

 $ pvm

and at the pvm> prompt, type the name of the program to be executed:

 pvm> spawn -> MyPvmProgram
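A slightly fuller console session might look like the following sketch; conf lists the hosts in the virtual machine and halt shuts the PVM down. The console's output is omitted here.

 $ pvm
 pvm> conf
 pvm> spawn -> MyPvmProgram
 pvm> halt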
6.2.2.2 Starting PVM Programs Using XPVM

Besides starting programs from the terminal-based PVM console, the XPVM graphical interface for X Windows can be used. Figure 6-2 shows what to type in the tasks dialog of an XPVM session.

Figure 6-2. The XPVM task dialog.


The PVM library does not force any particular structure on a C++ program. The first PVM routine called by a program enrolls that program into the PVM. It is good practice to always call pvm_exit() for every program that is part of the PVM. If this routine is not called for every PVM task, the system will hang. It is a good rule of thumb to call pvm_mytid() and pvm_parent() early in the processing of the task. Table 6-1 contains the library routines broken down into the seven commonly used categories.

Table 6-1. Seven Categories of PVM Library Routines

  • Process Management and Control: Routines used to manage and control PVM processes.

  • Message Packing and Sending: Routines used to pack messages into a send buffer and send messages from one PVM process to another.

  • Message Unpacking and Receiving: Routines used to receive messages and unpack the data from the active buffer.

  • Task Signaling: Routines used to signal and notify PVM processes about the occurrence of an event.

  • Message Buffer Management: Routines used to initialize, empty, dispose of, and otherwise manage the buffers used to receive and send messages between PVM processes.

  • Information and Utility Functions: Routines used to return information about a PVM process and perform other important tasks.

  • Group Operations: Routines used for joining, leaving, and otherwise managing processes in a group.
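Putting these rules of thumb together, a minimal task skeleton might look like the following. This is our sketch rather than code from the text:

 #include "pvm3.h"

 int main(int argc, char *argv[])
 {
    int MyTid = pvm_mytid();       // first PVM call; enrolls this task in the PVM
    int ParentTid = pvm_parent();  // returns PvmNoParent if not spawned by another task
    // ... initialize buffers, pack, send, receive, unpack ...
    pvm_exit();                    // PVM cleanup; does not terminate the process
    return(0);
 }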

6.2.3 A PVM Preliminary Requirements Checklist

In addition to obtaining and properly installing a PVM distribution, there are a few other minor considerations. When the PVM environment is implemented as a network of computers, the following items must be handled before your C++ program can interact with the PVM environment.

Item 1

The environment variables PVM_ROOT and PVM_ARCH should be set. PVM_ROOT should be set to the directory where PVM is installed.

Using the Bourne shell (bash):

 $ PVM_ROOT=/usr/lib/pvm3
 $ export PVM_ROOT

Using the C shell:

 setenv PVM_ROOT /usr/lib/pvm3

The PVM_ARCH environment variable identifies the architecture of the machine. Each machine involved in the PVM must be identified by architecture. For example, our UltraSparcs have the designation SUN4SOL2 and our Linux machines have the designation LINUX. Table 6-2 shows the most commonly used architectures for the PVM environment. Check your PVM distribution if an appropriate architecture for your machines is not listed in Table 6-2.

Table 6-2 shows each architecture name and the machine type associated with it. Set your PVM_ARCH environment variable to one of the names in Table 6-2.

Table 6-2. Most Commonly Used Architectures for the PVM Environment

  PVM_ARCH    Computer
  AFX8        Alliant FX/8
  ALPHA       DEC Alpha
  BAL         Sequent Balance
  BFLY        BBN Butterfly TC2000
  BSD386      80386/486 PC running BSD
  CM2         Thinking Machines CM-2
  CM5         Thinking Machines CM-5
  CNVX        Convex C-series
  CNVXN       Convex C-series
  CRAY        C-90, YMP; T3D port available
  CRAY2       Cray-2
  CRAYSMP     Cray S-MP
  DGAV        Data General Aviion
  E88K        Encore 88000
  HP300       HP-9000 Model 300
  HPPA        HP-9000 PA-RISC
  I860        Intel iPSC/860
  IPSC2       Intel iPSC/2 386 host
  KSR1        Kendall Square KSR-1
  LINUX       80386/486 PC running Linux
  MASPAR      Maspar
  MIPS        MIPS 4680
  NEXT        NeXT
  PGON        Intel Paragon
  PMAX        DECstation 3100, 5100
  RS6K        IBM RS/6000
  RT          IBM RT
  SGI         Silicon Graphics IRIS
  SGI5        Silicon Graphics IRIS
  SGIMP       SGI multiprocessor
  SUN3        Sun 3
  SUN4        Sun 4, SPARCstation
  SUN4SOL2    Sun 4, SPARCstation running Solaris 2
  SUNMP       SPARC multiprocessor
  SYMM        Sequent Symmetry
  TITN        Stardent Titan
  U370        IBM 370
  UVAX        DEC MicroVAX

For instance, to set PVM_ARCH to LINUX:

Using the Bourne shell (bash):

 $ PVM_ARCH=LINUX
 $ export PVM_ARCH

Using the C shell:

 setenv PVM_ARCH LINUX

Item 2

The binaries (executables) for any programs participating in the PVM must either be located on, or be accessible by, all machines involved in the PVM. In addition to availability, each program must be compiled for the architecture it will run on. This means that if we have UltraSparcs, PowerPCs, and Intel processors involved in the PVM, then we must have a version of the program compiled for each architecture, and each version must be located in a place that the PVM is aware of. The location is often $HOME/pvm3/bin . However, the location can be specified in a PVM configuration file, usually referred to as the hostfile, or .xpvm_hosts if the XPVM environment is used. The hostfile would contain an entry such as:

 ep=/usr/local/pvm3/bin 

This specifies that any user binaries needed by the PVM can be found in the /usr/local/pvm3/bin directory.

Item 3

The user initiating the PVM program must have network access to each machine involved in the PVM, typically rsh or ssh access. See the man pages for more details on the rsh and ssh programs. By default, the PVM accesses each machine using the login name of the user initiating the PVM program. If an account other than the initiating login account is required, an entry must be added to the hostfile or .xpvm_hosts . For example:

 lo=flashgordon 

Item 4

Create a .rhosts file on each host listing all the hosts you wish to use. These are the computers that have the potential to be involved in the PVM. Depending on the setting in the .xpvm_hosts file or the pvm_hosts file, these computers will automatically be added to the PVM when the pvmd is started. Computers listed in these files can also be dynamically added to the PVM at runtime.

Item 5

Create a $HOME/.xpvm_hosts and/or a $HOME/pvm_hosts file listing all the hosts you wish to use, with an & prepended to any host that should not be added automatically. Hosts listed without the & are added automatically. The pvm_hosts file is a user-created file and its name is arbitrary; however, .xpvm_hosts is the required name when using the XPVM environment. Figure 6-3 shows an example of a PVM host file. The same format is used for the PVM console hostfile and for .xpvm_hosts .

Figure 6-3. An example of a PVM host file.

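Since the figure is not reproduced here, the following is a hedged example of what such a host file might contain. The host names are hypothetical; the ep= and lo= options are those discussed in Items 2 and 3, and the & prefix is the one from Item 5.

 # example PVM host file (host names are hypothetical)
 host1
 host2  lo=flashgordon ep=/usr/local/pvm3/bin
 &host3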

The primary thing to keep in mind is the network access of the user running the PVM program. The owner of the PVM program should have account access to every computer in the pool of processors that will be executing parts of the program. This access will use the rsh , rlogin , or ssh commands. The program to be executed must be available on each host, and the PVM environment must be aware of what the hosts are and where the binaries are installed.

6.2.4 Combining the C++ Runtime Library and the PVM Library

Since access to the PVM is provided through a collection of library routines, a C++ program treats the PVM as any other library. Keep in mind that each PVM program is a standalone C++ program with its own main() function. This means that each PVM program has its own address space. When a PVM task is spawned, a new process is created, so each PVM program has its own process and process id. The PVM processes are visible to the ps utility. Although two or more PVM tasks may be working together to solve some problem, they have their own copies of the C++ runtime library: each program has its own iostream , template library, algorithms, and so on. The scope of a global C++ variable does not cross address spaces, so global variables in one PVM task are invisible to the other PVM tasks involved in the processing. Message passing is used to communicate between these separate tasks. Notice that this is in contrast to multithreaded programs, where threads share the same address space and may communicate through parameter passing and global variables.

If the PVM programs are executing on a single computer that has multiple processors, then the programs may share a file system and can use pipes, FIFOs, shared memory, and files as additional means of communication. While message passing is the premier method of communicating between PVM tasks, nothing prevents the use of shared file systems, clipboards, or even command-line arguments as supplemental methods of communication between tasks. The PVM library adds to rather than restricts the capabilities of the C++ runtime library.

6.2.5 Approaches to Using PVM Tasks

The work a C++ program performs is distributed among functions, objects, or combinations of functions and objects. The units of work in a program usually fall into logical categories: input/output, user interface, database processing, signal processing, error handling, numerical computation, and so on. We also try to keep user interface code separated from file processing code, and printing routine code separated from numerical computation code. Therefore, not only do we divide up the work our program does between functions or objects, we try to keep categories of functionality together. These logical groupings are organized into libraries, modules, object patterns, components, and frameworks. We maintain this type of organization when introducing PVM tasks into a C++ program. We can arrive at the WBS (Work Breakdown Structure) using either a bottom-up or top-down approach. In either case, the parallelism should fit naturally within the work that a function, module, or object has to do.

It is not a good idea to attempt to force parallelism in a program. Forced parallelism produces awkward program architectures that are hard to understand, hard to maintain, and hard to verify for correctness. So when a program uses PVM tasks, they should be a result of the natural division of work within the program. Each PVM task should be traceable to one of the categories of work within the program. For instance, if we have an application that has NLP (Natural Language Processing) and TTS (Text to Speech) processing as part of its user interface and inferencing as part of its data retrieval, then the parallelism that is natural within the NLP component should be represented as tasks within the NLP module or the object that is responsible for NLP. Likewise, the parallelism within the inferencing component should be represented as tasks within the data retrieval module or the object or framework that is responsible for data retrieval. That is, we identify PVM tasks where they logically fit within the work that the program is doing, as opposed to dividing the work the program does into a set of generic PVM tasks.

The notion of logic first, parallelism second, has several implications for C++ programs. It means that we might spawn PVM tasks from the main() function. We might spawn PVM tasks from subroutines called from main() or from other subroutines. We might spawn PVM tasks from within methods belonging to objects. Where we spawn the tasks depends on the concurrency requirements of the function, module, or object that is performing the work. The PVM tasks generally fall into two categories: SPMD (a derivative of SIMD) and MPMD (a derivative of MIMD). In the SPMD model, the tasks will execute the same set of instructions but on different pieces of data. In the MPMD model, each task executes different instructions on different data. Whether we are using the SPMD model or the MPMD model, the spawning of the task should be from the relevant areas of the program. Figure 6-4 shows some possible configurations for spawning PVM tasks.

Figure 6-4. Some possible configurations for spawning PVM tasks.


6.2.5.1 Using the SPMD (SIMD) Model with PVM and C++

In Figure 6-4, Case 1 represents the situation where the function main() spawns from 1 to N tasks where each task performs the same set of instructions but on different data sets. There are several options for implementing this scenario. Example 6.1 shows main() using the pvm_spawn routine.

Example 6.1 Calling the pvm_spawn routine from main() .
 int main(int argc, char *argv[])
 {
    int TaskId[10];
    int TaskId2[5];
    // 1st spawn: 10 tasks, no arguments, any host
    pvm_spawn("set_combination",NULL,0," ",10,TaskId);
    // 2nd spawn: 5 tasks, passing argv, any host
    pvm_spawn("set_combination",argv,0," ",5,TaskId2);
    //...
 }

In Example 6.1, the first spawn creates 10 tasks. Each task will execute the same set of instructions contained in the set_combination program. The TaskId array will contain the task identifiers for the PVM tasks if the spawn was successful. Once the program in Example 6.1 has the TaskIds , it can use the pvm_send() routine to send specific data for each task to work on (a sketch of this data distribution appears below). This is possible because the pvm_send() routine takes the task identifier of the receiving task. The second spawn in Example 6.1 creates five tasks, but in this case it passes each task information through the argv parameter. This is an additional method of passing information to tasks during startup, and it gives a child task a way to uniquely identify itself using the values it receives in argv . In Example 6.2, the main() function uses multiple calls to pvm_spawn() to create N tasks as opposed to a single call.
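Here is that sketch of the data distribution. It is our code, not the book's: distributeWork() , DataSet , and MessageId are hypothetical names, and we assume NumTasks is no larger than the 10 tasks spawned above.

 #include "pvm3.h"

 // Hypothetical helper: give each spawned task its own piece of data.
 // DataSet holds one int per task; TaskId comes from pvm_spawn().
 void distributeWork(int TaskId[], int NumTasks)
 {
    int MessageId = 1;
    int DataSet[10] = {1,2,3,4,5,6,7,8,9,10};   // assumed per-task inputs
    for(int i = 0; i < NumTasks; i++){
       pvm_initsend(PvmDataDefault);      // reset the active send buffer
       pvm_pkint(&DataSet[i],1,1);        // pack this task's value
       pvm_send(TaskId[i],MessageId);     // route it by task identifier
    }
 }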

Example 6.2 Using multiple calls to pvm_spawn from main() .
 int main(int argc, char *argv[])
 {
    int Task1;
    int Task2;
    int Task3;
    //...
    // flag 1 (PvmTaskHost) directs each task to a named host
    pvm_spawn("set_combination",NULL,1,"host1",1,&Task1);
    pvm_spawn("set_combination",argv,1,"host2",1,&Task2);
    pvm_spawn("set_combination",argv++,1,"host3",1,&Task3);
    //...
 }

The approach used in Example 6.2 can be used when you want the tasks to execute on specific computers. This is one of the advantages of the PVM environment: a program can take advantage of a particular resource on a particular computer, for example, a special math processor, a graphics processor, or MPP capabilities. Notice in Example 6.2 that each host executes the same set of instructions, but each host receives a different command-line argument.

Case 2 in Figure 6-4 represents the scenario where the main() function does not spawn the PVM tasks. In this scenario the PVM tasks are logically related to funcB() , and therefore funcB() spawns the tasks. The main() function and funcA() don't need to know anything about the PVM tasks, so there is no reason to put any of the PVM housekeeping code in those functions. Case 3 in Figure 6-4 represents the scenario where the main() function and other functions in the program have natural parallelism. In this case the other function is funcA() . Also, the PVM tasks spawned by main() and the PVM tasks spawned by funcA() execute different code. Although the tasks that main() spawns execute identical code and the tasks that funcA() spawns execute identical code, the two sets of tasks are different. This illustrates that a C++ program may use collections of tasks to solve different problems simultaneously; there is no reason that the program has to be restricted to one problem at a time. In Case 4 from Figure 6-4, the parallelism is contained within an object; therefore, one of the object's methods spawns the PVM tasks. Here, the logical place to initiate the parallelism is within a class as opposed to some free-floating function.

As in the other cases, the PVM tasks spawned in Case 4 all execute the same instructions but with different data. This SPMD (Single Program Multiple Data) method is a commonly used technique for parallelizing certain kinds of problem solving. The fact that C++ supports objects and generic programming using templates makes it a particularly powerful choice for this kind of programming. Objects and templates allow the C++ programmer to represent very general and flexible solutions to entire classes of problems with a single piece of code, and this single piece of code fits in nicely with the SPMD model of parallelism. The notion of a class extends the SPMD model so that an entire class of problems can be solved, and templates allow the class of problems to be solved for virtually any data type. So although each task in the SPMD model is executing the same piece of code, it might be for an object or any of its descendants, and it might be for different data types (different objects!).

For example, Example 6.3 uses four PVM tasks to generate four sets with C(24,9), C(24,12), C(7,4), and C(7,3) elements. Specifically, it enumerates the combinations of a set of 24 colors taken 9 and 12 at a time, and the combinations of a set of 7 floating point numbers taken 4 and 3 at a time. For an explanation of the notation C(n,r), see Sidebar 6.1.

Example 6.3 Creating combinations of sets.
 int main(int argc,char *argv[])
 {
    int RetCode,TaskId[4];
    RetCode = pvm_spawn("pvm_generic_combination",NULL,0," ",4,TaskId);
    if(RetCode == 4){
       colorCombinations(TaskId[0],9);
       colorCombinations(TaskId[1],12);
       numericCombinations(TaskId[2],4);
       numericCombinations(TaskId[3],3);
       saveResult(TaskId[0]);
       saveResult(TaskId[1]);
       saveResult(TaskId[2]);
       saveResult(TaskId[3]);
       pvm_exit();
    }
    else{
       cerr << "Error Spawning ChildProcess" << endl;
       pvm_exit();
    }
    return(0);
 }
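Example 6.3 treats anything other than four started tasks as failure. When pvm_spawn() partially succeeds, it returns the number of tasks actually started, and the slots of the tid array that could not be filled hold negative PVM error codes. A sketch of a more informative check (our code, not the book's):

 RetCode = pvm_spawn("pvm_generic_combination",NULL,0," ",4,TaskId);
 if(RetCode < 4){
    for(int i = 0; i < 4; i++){
       if(TaskId[i] < 0){     // negative entries are PVM error codes
          cerr << "task " << i << " not started, error " << TaskId[i] << endl;
       }
    }
 }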

Notice in Example 6.3 we spawn four PVM tasks:

 pvm_spawn("pvm_generic_combination",NULL,0," ",4,TaskId); 

Each task will execute the program named pvm_generic_combination . The NULL argument in our pvm_spawn call means that we are not passing any options via the argv[] parameter. The " " (the where argument) in our pvm_spawn call means we don't care which computer the tasks execute on. TaskId is an array of four integers; if the call is successful, it will contain the task identifiers for each of the PVM tasks spawned. Notice in Example 6.3 that we call colorCombinations() and numericCombinations() . These two functions assign the PVM tasks their work. Example 6.4 contains the function definition for colorCombinations() .

Example 6.4 Definition of the colorCombinations() function.
 void colorCombinations(int TaskId,int Choices)
 {
    int MessageId = 1;
    char *Buffer;
    int Size;
    int N;
    string Source("blue purple green red yellow orange silver gray ");
    Source.append("pink black white brown light_green aqua beige cyan ");
    Source.append("olive azure magenta plum orchid violet maroon lavender");
    Source.append("\n");
    Buffer = new char[(Source.size() + 100)];
    strcpy(Buffer,Source.c_str());
    N = pvm_initsend(PvmDataDefault);
    pvm_pkint(&Choices,1,1);              // first message: how many colors to choose
    pvm_send(TaskId,MessageId);
    N = pvm_initsend(PvmDataDefault);
    pvm_pkbyte(Buffer,strlen(Buffer),1);  // second message: the colors themselves
    pvm_send(TaskId,MessageId);
    delete [] Buffer;
 }

Notice in Example 6.3 there are two calls to colorCombinations() . Each call assigns a PVM task a different number of color combinations to enumerate: C(24,9) and C(24,12). The first PVM task will produce 1,307,504 color combinations and the second task will produce 2,704,156 color combinations. The program named in the pvm_spawn() call does all the work. Each color is represented by a string. Therefore, when the pvm_generic_combination program is producing combinations it does so using a set of strings as the input. This is in contrast to the numericCombinations() function shown in Example 6.5. The code in Example 6.3 makes two calls to the numericCombinations() function. The first generates C(7,4) combinations and the second generates C(7,3) combinations.

Example 6.5 Using PVM tasks to produce numeric combinations.
 void numericCombinations(int TaskId,int Choices)
 {
    int MessageId = 2;
    int N;
    double ImportantNumbers[7] = {3.00e+8,6.67e-11,1.99e+30,
                                  1.67e-27,6.023e+23,6.63e-34,
                                  3.14159265359};
    N = pvm_initsend(PvmDataDefault);
    pvm_pkint(&Choices,1,1);             // tell the task how many to choose
    pvm_send(TaskId,MessageId);
    N = pvm_initsend(PvmDataDefault);
    pvm_pkdouble(ImportantNumbers,7,1);  // pack all seven doubles
    pvm_send(TaskId,MessageId);
 }

In the numericCombinations() function in Example 6.5, the PVM task is sent an array of floating point numbers as opposed to an array of bytes representing strings. So the colorCombinations() function sends its data to the PVM task using:

 pvm_pkbyte(Buffer,strlen(Buffer),1);
 pvm_send(TaskId,MessageId);

The numericCombinations() function sends its data to the PVM task using:

 pvm_pkdouble(ImportantNumbers,7,1);
 pvm_send(TaskId,MessageId);

The colorCombinations() function in Example 6.4 builds a string of colors and then copies that string of colors into an array of char called Buffer . The array of char is then packed and sent to the PVM task using the pvm_pkbyte() and pvm_send() functions. The numericCombinations() function in Example 6.5 creates an array of doubles and sends it to the PVM task using the pvm_pkdouble() and pvm_send() functions. One function sends a character array; the other sends an array of doubles. In both cases the PVM tasks are executing the same program, pvm_generic_combination . This is where the advantage of using C++ templates and genericity comes in. The same tasks are able to do work not only with different data but on different data types without a code change. The template facility in C++ helps to make the SPMD model more flexible and efficient. The pvm_generic_combination program is almost unaware of what data types it will be working with. The use of C++ container classes allows it to generate combinations of any vector<T> of objects. The pvm_generic_combination program does know that it will be working with two data types. Example 6.6 shows a section of code from the pvm_generic_combination program.

Example 6.6 Using the MessageId tag to distinguish data types.
 pvm_bufinfo(N,&NumBytes,&MessageId,&Ptid);
 if(MessageId == 1){
    vector<string> Source;
    Buf = new char[NumBytes];
    pvm_upkbyte(Buf,NumBytes,1);
    strstream Buffer;
    Buffer << Buf << ends;
    while(Buffer.good())
    {
       Buffer >> Color;
       if(!Buffer.eof()){
          Source.push_back(Color);
       }
    }
    generateCombinations<string>(Source,Ptid,Value);
    delete [] Buf;
 }
 if(MessageId == 2){
    vector<double> Source;
    double *ImportantNumber;
    NumBytes = NumBytes / sizeof(double);
    ImportantNumber = new double[NumBytes];
    pvm_upkdouble(ImportantNumber,NumBytes,1);
    copy(ImportantNumber,ImportantNumber + NumBytes,
         inserter(Source,Source.begin()));
    generateCombinations<double>(Source,Ptid,Value);
    delete [] ImportantNumber;
 }

Here we use the MessageId tag to distinguish which data type we are working with; C++ then lets the declarations do the rest. If the MessageId tag contains a 1 , then we are working with strings. Therefore, we make the declaration:

 vector<string> Source; 

If the MessageId tag contains a 2 , then we know we are working with floating point numbers, and we make the declaration:

 vector<double> Source; 

Once we declare what type of data the vector Source will contain, the rest of the processing in pvm_generic_combination is generic. Notice in Example 6.6 that each if() statement calls the generateCombinations() function. This generateCombinations() function is a template function (a sketch of such a function appears below). This template architecture helps us to achieve the genericity that extends the SPMD and MPMD scenarios for our PVM programs. We will come back to the discussion of our pvm_generic_combination program after we present the basic mechanics of the PVM environment. It is important to note that C++ container classes, stream classes, and template algorithms add a flexibility to PVM programming that is not easily achieved in PVM programs written in other languages. This flexibility creates opportunities for sophisticated yet elegant parallel architectures.
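In the meantime, here is a minimal sketch of the kind of template function generateCombinations() could be. This is our assumption, not the book's code: it enumerates each C(n,r) combination of Source in lexicographic order and simply prints it, whereas the book's version also takes the parent's task id so results can be packed and sent back.

 #include <vector>
 #include <iostream>
 using namespace std;

 template <class T>
 void generateCombinations(const vector<T> &Source, int Choices)
 {
    int N = Source.size();
    if(Choices < 0 || Choices > N){ return; }           // nothing to enumerate
    vector<int> Index(Choices);
    for(int i = 0; i < Choices; i++){ Index[i] = i; }   // first combination
    while(true){
       for(int i = 0; i < Choices; i++){ cout << Source[Index[i]] << ' '; }
       cout << '\n';
       int i = Choices - 1;        // rightmost position that can still advance
       while(i >= 0 && Index[i] == N - Choices + i){ i--; }
       if(i < 0){ break; }         // all C(N,Choices) combinations produced
       Index[i]++;
       for(int j = i + 1; j < Choices; j++){ Index[j] = Index[j-1] + 1; }
    }
 }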

6.2.5.2 Using the MPMD (MIMD) Model with PVM and C++

Whereas the SPMD model uses the pvm_spawn() function to create some number of tasks executing the same program on potentially different data or resources, the MPMD model uses pvm_spawn() to create tasks that execute different programs, each with its own data sets. Example 6.7 shows how a single C++ program could implement an MPMD model of computation using PVM calls.

Example 6.7 Using PVM to implement the MPMD model of computation.
 int main(int argc, char *argv[])
 {
    int Task1[20];
    int Task2[50];
    int Task3[30];
    //...
    // 20 combination generators on a named host
    pvm_spawn("pvm_generic_combination",NULL,1,"host1",20,Task1);
    // 50 plan generators on any available hosts
    pvm_spawn("generate_plans",argv,0," ",50,Task2);
    // 30 filter tasks on another named host
    pvm_spawn("agent_filters",argv++,1,"host3",30,Task3);
    //...
 }

The code in Example 6.7 creates 100 tasks. The first 20 tasks generate combinations. The next 50 tasks generate plans from the combinations as the combinations are being created. The last 30 tasks filter the best plans from the set of plans being generated by the set of 50 tasks. This is in contrast to the SPMD model, where all of the programs spawned by the pvm_spawn() function were the same. Here, we have pvm_generic_combination , generate_plans , and agent_filters performing the work of the PVM tasks. They are all executing concurrently, and each has its own set of data, although the sets are transformations of one another: pvm_generic_combination transforms its input into something that generate_plans can use, and generate_plans transforms its input into something that agent_filters can use. Obviously, these tasks will send messages to each other. The messages represent input and control information passed between the processes. Also notice in Example 6.7 that we used the pvm_spawn() routine to allocate 20 pvm_generic_combination tasks on a computer named host1 . The generate_plans tasks were allocated to 50 anonymous processors, but each of the 50 tasks received the same command-line argument through the argv parameter. The agent_filters tasks were also directed to a particular computer, in this case host3 , and each task received the same command-line argument through the argv parameter. This emphasizes the flexibility and power of the PVM library. Figure 6-5 shows some options available for MPMD configurations using the PVM environment.

Figure 6-5. Some options available for MPMD configurations using the PVM environment.


We can take advantage of particular resources of particular computers if so desired, and we can use arbitrary, anonymous computers in other cases. In addition, we can assign different work to different tasks simultaneously. In Figure 6-5, Computer A is an MPP (Massively Parallel Processor) computer, and Computer B has a number of specialized numeric processors. Also notice that the PVM in Figure 6-5 consists of PowerPCs, Sparcs, Crays, and so on. In some cases we don't care what the specific capabilities of the computers in a PVM are, but in other cases we do. The pvm_spawn() routine allows the C++ programmer to use the anonymous approach by simply not specifying which computer to create the tasks on. On the other hand, if there is something special about some member of the PVM, then that feature can be exploited by naming the particular member in the pvm_spawn() call.
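Returning to Example 6.7, one detail it leaves open is how the pipeline stages learn each other's task identifiers. A hedged sketch of one possibility (our assumption, not the book's code): main() sends the downstream tid array to each upstream task, using a hypothetical SetupMsg tag.

 int SetupMsg = 99;                 // hypothetical message tag
 for(int i = 0; i < 20; i++){
    pvm_initsend(PvmDataDefault);
    pvm_pkint(Task2,50,1);          // tids of the 50 generate_plans tasks
    pvm_send(Task1[i],SetupMsg);    // tell each combination task where its output goes
 }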

Sidebar 6.1 Combination Notation

Suppose we wish to choose a team of eight programmers from a pool of 24 candidates. How many different teams of eight programmers could we come up with? One of the results that follows from the Fundamental Principle of Counting tells us there are 735,471 different eight-programmer teams that can be selected from a pool of 24. The notation C(n,r) is read as "the number of combinations of n things taken r at a time," or simply "n choose r." C(n,r) is calculated by the formula:

 C(n,r) = n! / (r!(n - r)!)
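As a check against the team example above, C(24,8) = 24! / (8! 16!) = 735,471, which matches the count given earlier.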


A combination is a set in which order does not matter: for example, {a,b,c} is considered the same as {b,a,c} or {c,b,a}. We are concerned only with which members are in the set, not with their order. Many parallel programs, search algorithms, heuristics, and artificial intelligence-based programs have to deal with large sets of combinations and their close relatives, permutations.



