10.2 Writing PVM Applications

10.2 Writing PVM Applications

The PVM system currently supports many languages. C, C++, and Fortran languages are supported in the standard distribution. Third-party groups have created freely available Java, Perl, Python, S, Matlab, TCL/TK, and IDL interfaces to PVM. All these are downloadable from the PVM Web site (www.csm.ornl.gov/pvm). PVM is designed so that an application can be composed of tasks written in any mixture of these languages and the tasks will still be able to exchange data and to synchronize with each other.

The general paradigm for application programming with PVM is as follows. You write one or more sequential programs that contain embedded calls to the PVM library. Each program corresponds to a task making up the application. These programs are compiled for each architecture in the host pool, and the resulting object files are placed at a location accessible from machines in the host pool. To execute an application, you typically start one copy of one task (typically the "manager" or "initiating" task) by hand from a machine within the host pool. This process subsequently spawns other PVM tasks, eventually resulting in a collection of active tasks that then compute on the cluster and exchange messages with each other to solve the problem.

The C and C++ language bindings for the PVM user interface library are implemented as functions, following the general conventions used by most C systems. To elaborate, function arguments are a combination of value parameters and pointers as appropriate, and function result values indicate the outcome of the call. In addition, macro definitions are used for system constants, and global variables such as errno and pvm_errno are the mechanism for discriminating between multiple possible outcomes. Application programs written in C and C++ access PVM library functions by linking against an archival library ('libpvm3.a') that is part of the standard distribution.

Fortran language bindings are implemented as subroutines rather than as functions. This approach was taken because some compilers on the supported architectures would not reliably interface Fortran functions with C functions. One immediate implication of this is that an additional argument is introduced into each PVM library call for status results to be returned to the invoking program. Moreover, library routines for the placement and retrieval of typed data in message buffers are unified, with an additional parameter indicating the datatype. Apart from these differences (and the standard naming prefixes pvm_ for C, and pvmf for Fortran), a one-to-one correspondence exists between the two language bindings. Fortran interfaces to PVM are implemented as library stubs that in turn invoke the corresponding C routines, after casting and/or dereferencing arguments as appropriate. Thus, Fortran applications are required to link against the stubs library ('libfpvm3.a') as well as the C library.

All PVM tasks are identified by an integer task identifier tid. Messages are sent to tids and received from tids. Since tids must be unique across the entire virtual machine, they are supplied by the local pvmd and are not user chosen. Although PVM encodes information into each tid to improve performance, the user is expected to treat the tids as opaque integer identifiers. PVM contains several routines that return tid values so that the user application can identify other tasks in the system.

As mentioned earlier, tasks interact through explicit message passing, identifying each other with a system-assigned, opaque tid.

Shown in Figure 10.2 is the body of the PVM program 'hello.c', a simple example that illustrates the basic concepts of PVM programming. This program is intended to be invoked manually; after printing its task id (obtained with pvm_mytid()), it initiates a copy of another program called 'hello_other.c' using the pvm_spawn() function. A successful spawn causes the program to execute a blocking receive using pvm_recv. After the message is received, it is unpacked into a format the receiving computer understands using pvm_upkstr. Then the program prints the message as well its task id. The final pvm_exit call dissociates the program from the PVM system.

start figure

 #include "pvm3.h" main() {         int cc, tid, msgtag;         char buf [100];         printf("i'm t%x\n", pvm_mytid());         cc = pvm_spawn("hello_other", (char**)0, 0, "", 1, &tid);         if (cc == 1) {                 msgtag = 1;                 pvm_recv(tid, msgtag);                 pvm_upkstr(buf);                 printf("from t%x: %s\n", tid, buf);         } else                 printf("can't start hello_other\n");         pvm_exit(); } 

end figure

Figure 10.2: PVM program 'hello.c'.

Figure 10.3 is a listing of the hello_other program. Its first PVM action is to obtain the task id of its parent using the pvm_parent call. This program then obtains its hostname and transmits it to the parent using the three-call sequence: pvm_initsend to initialize the (transparent) send buffer; pvm_pkstr to place a string in a strongly typed and architecture-independent manner into the send buffer; and pvm_send to transmit it to the destination process specified by ptid, "tagging" the message with the number 1.

start figure

 #include  "pvm3.h" main() {         int ptid, msgtag;         char buf[100];         ptid = pvm_parent();         strcpy(buf, "hello, world from  ");         gethostname(buf + strlen(buf), 64);         msgtag = 1;         pvm_initsend(PvmDataDefault);         pvm_pkstr(buf);         pvm_send(ptid, msgtag);         pvm_exit(); } 

end figure

Figure 10.3: PVM program 'hello_other.c'.

Message tags are user-defined identifiers put on a message by the sender so that the receiving task can selectively get a particular message from the many that may have arrived. The receiver does not have to, nor may it be able to, know the tag put on a message. It is possible in PVM to probe for what tags have arrived so far. It is also possible to ignore the tag and simply receive the messages in the order they arrive at the receiving task. Message tags will become necessary as we explore more complicated PVM examples.

The next example, 'forkjoin.c', demonstrates spawning a parallel application from one cluster node. We then show PVM used in a Fortran dot product program PSDOT.F and a matrix multiply example that demonstrates the use of groups. Lastly, we show an example of a master/worker PVM application that calculates heat diffusion through a wire.

10.2.1 fork/join

The fork/join example demonstrates how to spawn off PVM tasks and synchronize with them. The program spawns the number of tasks specified by the user during startup. The children then synchronize by sending a message to their parent task. The parent receives a message from each of the spawned tasks and prints out information about the message from the child tasks.

This program contains the code for both the parent and the child tasks. Let's examine it in more detail. The first action the program takes is to call pvm_mytid(). In fork/join we check the value of mytid; if it is negative, indicating an error, we call pvm_perror() and exit the program. The pvm_perror() call will print a message indicating what went wrong with the last PVM call. In this case the last call was pvm_mytid(), so pvm_perror() might print a message indicating that PVM hasn't been started on this machine. The argument to pvm_perror() is a string that will be prepended to any error message printed by pvm_perror(). In this case we pass argv[0], which is the name of the program as it was typed on the command-line. The pvm_perror() function is modeled after the Unix perror() function.

Assuming we obtained a valid result for mytid, we now call pvm_parent(). The pvm_parent() function will return the tid of the task that spawned the calling task. Since we run the initial forkjoin program from a command prompt, this initial task will not have a parent; it will not have been spawned by some other PVM task but will have been started manually by the user. For the initial fork/join task the result of pvm_parent() will not be any particular task id but an error code, PvmNoParent. Thus we can distinguish the parent fork/join task from the children by checking whether the result of the pvm_parent() call is equal to PvmNoParent. If this task is the parent, then it must spawn the children. If it is not the parent, then it must send a message to the parent.

Let's examine the code executed by the parent task. The number of tasks is taken from the command-line as argv[1]. If the number of tasks is not legal, then we exit the program, calling pvm_exit() and then returning. The call to pvm_exit() is important because it tells PVM this program will no longer be using any of the PVM facilities. (In this case the task exits and PVM will deduce that the dead task no longer needs its services. Regardless, it is good style to exit cleanly.) If the number of tasks is valid, fork/join will then attempt to spawn the children.

The pvm_spawn() call tells PVM to start ntask tasks named argv[0]. The second parameter is the argument list given to the spawned tasks. In this case we don't care to give the children any particular command-line arguments, so this value is null. The third parameter to spawn, PvmTaskDefault, is a flag telling PVM to spawn the tasks in the default method. The default method is to distribute the tasks round robin to all the cluster nodes in the virtual machine. Had we been interested in placing the children on a specific machine or a machine of a particular architecture, we would have used PvmTaskHost or PvmTaskArch for this flag and specified the host or architecture as the fourth parameter. Since we don't care where the tasks execute, we use PvmTaskDefault for the flag and null for the fourth parameter. Finally, ntask tells spawn how many tasks to start, and the integer array child will hold the task ids of the newly spawned children. The return value of pvm_spawn() indicates how many tasks were successfully spawned. If info is not equal to ntask, then some error occurred during the spawn. In case of an error, the error code is placed in the task id array, child, instead of the actual task id; forkjoin loops over this array and prints the task ids or any error codes. If no tasks were successfully spawned, then the program exits.

For each child task, the parent receives a message and prints out information about that message. The pvm_recv() call receives a message from any task as long as the tag for that message is JOINTAG. The return value of pvm_recv() is an integer indicating a message buffer. This integer can be used to find out information about message buffers. The subsequent call to pvm_bufinfo() does just this; it gets the length, tag, and task id of the sending process for the message indicated by buf. In forkjoin the messages sent by the children contain a single integer value, the task id of the child task. The pvm_upkint() call unpacks the integer from the message into the mydata variable. As a sanity check, forkjoin tests the value of mydata and the task id returned by pvm_bufinfo(). If the values differ, the program has a bug, and an error message is printed. Finally, the information about the message is printed, and the parent program exits.

The last segment of code in forkjoin will be executed by the child tasks. Before data is placed in a message buffer, the buffer must be initialized by calling pvm_initsend(). The parameter PvmDataDefault indicates that PVM should do whatever data conversion is needed to ensure that the data arrives in the correct format on the destination processor. In some cases this may result in unnecessary data conversions. If you are sure no data conversion will be needed because the destination machine uses the same data format, then you can use PvmDataRaw as a parameter to pvm_initsend(). The pvm_pkint() call places a single integer, mytid, into the message buffer. It is important to make sure the corresponding unpack call exactly matches the pack call. Packing an integer and unpacking it as a float is an error. There should be a one-to-one correspondence between pack and unpack calls. Finally, the message is sent to the parent task using a message tag of JOINTAG.

       pvm_perror("calling pvm_initsend"); pvm_exit(); return -1;       }    info = pvm_pkint(&mytid, 1, 1);    if (info < 0) {       pvm_perror("calling pvm_pkint"); pvm_exit(); return -1;       }    info = pvm_send(myparent, JOINTAG);    if (info < 0) {       pvm_perror("calling pvm_send"); pvm_exit(); return -1;       }    pvm_exit();    return 0; } 

Figure 10.4 shows the output of running fork/join. Notice that the order the messages were received is nondeterministic. Since the main loop of the parent processes messages on a first-come first-served basis, the order of the prints are determined simply by the time it takes messages to travel from the child tasks to the parent.

start figure

 /*     Fork Join Example     Demonstrates how to spawn processes and exchange messages */     /* defines and prototypes for the PVM library */     #include <pvm3.h>     /* Maximum number of children this program will spawn */     #define MAXNCHILD  20     /* Tag to use for the joing message */     #define JOINTAG    11     int     main(int argc, char* argv[])     {         /* number of tasks to spawn, use 3 as the default */         int ntask = 3;         /* return code from pvm calls */         int info;         /* my task id */         int mytid;         /* my parents task id */         int myparent;         /* children task id array */         int child[MAXNCHILD];         int i, mydata, buf, len, tag, tid;         /* find out my task id number */         mytid = pvm_mytid();         /* check for error */         if (mytid < 0) {             /* print out the error */             pvm_perror(argv[0]);             /* exit the program */             return -1;             }         /* find my parent's task id number */         myparent = pvm_parent();         /* exit if there is some error other than PvmNoParent */         if ((myparent < 0) && (myparent != PvmNoParent)              && (myparent != PvmParentNotSet)) {             pvm_perror(argv[0]);             pvm_exit ();             return -1;             }     /* if i don't have a parent then i am the parent */     if (myparent == PvmNoParent || myparent == PvmParentNotSet) {         /*  find out how many tasks to spawn */         if (argc == 2) ntask = atoi(argv[l]) ;         /* make sure ntask is legal */         if ((ntask < 1) || (ntask > MAXNCHILD)) { pvm_exit(); return 0; }         /* spawn the child tasks */         info = pvm_spawn(argv[0], (char**)0, PvmTaskDefault, (char*)0,             ntask, child);         /* print out the task ids */         for (i = 0; i < ntask; i++)             if (child[i] < 0) /* print the error code in decimal*/                 printf(" %d", child[i]);             else /* print the task id in hex */                 printf("t%x\t", child[i]);         putchar('\n');         /* make sure spawn succeeded */         if (info == 0) { pvm_exit(); return -1; }         /* only expect responses from those spawned correctly */         ntask = info;         for (i = 0; i < ntask; i++) {             /* recv a message from any child process */             buf = pvm_recv(-1, JOINTAG);             if (buf < 0) pvm_perror("calling recv");             info = pvm_bufinfo(buf, &len, &tag, &tid);             if (info < 0) pvm_perror("calling pvm_bufinfo");             info = pvm_upkint(&mydata, 1, 1);             if (info < 0) pvm_perror("calling pvm_upkint");             if (mydata != tid) printf("This should not happen!\n");             printf("Length %d, Tag %d, Tid t%x\n", len, tag, tid);             }         pvm_exit();         return 0;         }     /* i'm a child */     info = pvm_initsend(PvmDataDefault);     if (info < 0) {         % forkjoin         t10001c t40149  tc0037         Length 4, Tag 11, Tid t40149         Length 4, Tag 11, Tid tc0037         Length 4, Tag 11, Tid t10001c         % forkjoin 4         t10001e t10001d t4014b  tc0038         Length 4, Tag 11, Tid t4014b         Length 4, Tag 11, Tid tc0038         Length 4, Tag 11, Tid t10001d         Length 4, Tag 11, Tid t10001e 

end figure

Figure 10.4: Output of fork/join program.

10.2.2 Dot Product

Here we show a simple Fortran program, PSDOT, for computing a dot product. The program computes the dot product of two arrays, X and Y. First PSDOT calls PVMFMYTID() and PVMFPARENT(). The PVMFPARENT call will return PVMNOPARENT if the task wasn't spawned by another PVM task. If this is the case, then PSDOT task is the master and must spawn the other worker copies of PSDOT.PSDOT then asks the user for the number of processes to use and the length of vectors to compute. Each spawned process will receive n/nproc elements of X and Y, where n is the length of the vectors and nproc is the number of processes being used in the computation. If nproc does not divide n evenly, then the master will compute the dot product on the extra elements. The subroutine SGENMAT randomly generates values for X and Y. PSDOT then spawns nproc-1 copies of itself and sends each new task a part of the X and Y arrays. The message contains the length of the subarrays in the message and the subarrays themselves. After the master spawns the worker processes and sends out the subvectors, the master then computes the dot product on its portion of X and Y. The master process then receives the other local dot products from the worker processes. Notice that the PVMFRECV call uses a wild card (-1) for the task id parameter. This indicates that a message from any task will satisfy the receive. Using the wild card in this manner results in a race condition. In this case the race condition does not cause a problem because addition is commutative; in other words, it doesn't matter in which order we add up the partial sums from the workers. However, unless one is certain that the race will not affect the program adversely, race conditions should be avoided.

Once the master receives all the local dot products and sums them into a global dot product, it then calculates the entire dot product locally. These two results are then subtracted, and the difference between the two values is printed. A small difference can be expected because of the variation in floating-point roundoff errors.

If the PSDOT program is a worker, then it receives a message from the master process containing subarrays of X and Y. It calculates the dot product of these subarrays and sends the result back to the master process. In the interests of brevity we do not include the SGENMAT and SDOT subroutines.

      PROGRAM PSDOT * * PSDOT performs a parallel inner (or dot) product, where the vectors * X and Y start out on a master node, which then sets up the virtual * machine, farms out the data and work, and sums up the local pieces * to get a global inner product. * *    .. External Subroutines ..      EXTERNAL PVMFMYTID, PVMFPARENT, PVMFSPAWN, PVMFEXIT, PVMFINITSEND      EXTERNAL PVMFPACK, PVMFSEND, PVMFRECV, PVMFUNPACK, SGENMAT * *    .. External Functions ..      INTEGER ISAMAX      REAL SDOT      EXTERNAL ISAMAX, SDOT * *    .. Intrinsic Functions ..      INTRINSIC MOD * *    .. Parameters ..      INTEGER MAXN      PARAMETER ( MAXN = 8000 )      INCLUDE 'fpvm3.h' * *    .. Scalars ..      INTEGER N, LN, MYTID, NPROCS, IBUF, IERR      INTEGER I, J, K      REAL LOOT, GDOT * *    .. Arrays ..      INTEGER TIDS(0:63)      REAL X(MAXN), Y(MAXN) * *    Enroll in PVM and get my and the master process' task ID number *      CALL PVMFMYTID( MYTID )      CALL PVMFPARENT( TIDS(0) ) * *    If I need to spawn other processes (I am master process) *      IF ( TIDS(0) EQ. PVMNOPARENT ) THEN * *       Get starting information *         WRITE(*,*) 'How many processes should participate (1-64)?'         READ(*,*) NPROCS         WRITE(*,2000) MAXN         READ(*,*) N         TIDS(0) = MYTID         IF ( N GT. MAXN ) THEN            WRITE(*,*) 'N too large.  Increase parameter MAXN to run'//      $                'this case.'            STOP         END IF * *       LN is the number of elements of the dot product to do *       locally.  Everyone has the same number, with the master *       getting any left over elements.  J stores the number of *       elements rest of procs do. *         J = N / NPROCS         LN = J + MOD(N, NPROCS)         I = LN + 1 * *       Randomly generate X and Y *       Note: SGENMAT() routine is not provided here *         CALL SGENMAT( N, 1, X, N, MYTID, NPROCS, MAXN, J )         CALL SGENMAT( N, 1, Y, N, I, N, LN, NPROCS ) * *       Loop over all worker processes *         DO 10 K = 1, NPROCS-1 * *          Spawn process and check for error *            CALL PVMFSPAWN( 'psdot', 0, 'anywhere', 1, TIDS(K), IERR )            IF (IERR .NE. 1) THEN               WRITE(*,*) 'ERROR, could not spawn process #',K,      $                   '. Dying . . .'               CALL PVMFEXIT( IERR )               STOP            END IF * *          Send out startup info *            CALL PVMFINITSEND( PVMDEFAULT, IBUF )            CALL PVMFPACK( INTEGER4, J, 1, 1, IERR )            CALL PVMFPACK( REAL4, X(I), J, 1, IERR )            CALL PVMFPACK( REAL4, Y(I), J, 1, IERR )            CALL PVMFSEND( TIDS(K), 0, IERR )            I = I + J    10   CONTINUE * *       Figure master's part of dot product *       SDOT() is part of the BLAS Library (compile with -lblas) *         GDOT = SDOT( LN, X, 1, Y, 1 ) * *       Receive the local dot products, and *       add to get the global dot product *         DO 20 K = 1, NPROCS-1            CALL PVMFRECV( -1, 1, IBUF )            CALL PVMFUNPACK( REAL4, LDOT, 1, 1, IERR )            GDOT = GDOT + LDOT    20   CONTINUE * *       Print out result *         WRITE(*,*) ' '         WRITE(*,*) '<x,y> = ',GDOT * *       Do sequential dot product and subtract from *       distributed dot product to get desired error estimate *         LDOT = SDOT( N, X, 1, Y, 1 )         WRITE(*,*) '<x,y> : sequential dot product.  <x,y>^ : '//     $              'distributed dot product.'         WRITE(*,*) '| <x,y> - <x,y>^ | = ' ,ABS(GDOT - LDOT)         WRITE(*,*) 'Run completed.' * *    If I am a worker process (i.e. spawned by master process) *      ELSE * *       Receive startup info *         CALL PVMFRECV( TIDS(0), 0, IBUF )         CALL PVMFUNPACK( INTEGER4, LN, 1, 1, IERR )         CALL PVMFUNPACK( REAL4, X, LN, 1, IERR )         CALL PVMFUNPACK( REAL4, Y, LN, 1, IERR ) * *       Figure local dot product and send it in to master *         LDOT = SDOT( LN, X, 1, Y, 1 )         CALL PVMFINITSEND( PVMDEFAULT, IBUF )         CALL PVMFPACK( REAL4, LDOT, 1, 1, IERR )         CALL PVMFSEND( TIDS(0), 1, IERR )     END IF *      CALL PVMFEXIT( 0 ) * 1000 FORMAT(I10,' Successfully spawned process #' ,I2,', TID =',I10) 2000 FORMAT('Enter the length of vectors to multiply (1 -',I7,'):')      STOP * *    End program PSDOT *      END 

10.2.3 Matrix Multiply

In this example we program a matrix multiply algorithm described by Fox et al. in [39]. The mmult program can be found at the end of this section. The mmult program will calculate C = AB where C, A, and B are all square matrices. For simplicity we assume that m m tasks are used to calculate the solution. Each task calculates a subblock of the resulting matrix C. The block size and the value of m are given as a command-line argument to the program. The matrices A and B are also stored as blocks distributed over the m2 tasks. Before delving into the details of the program, let us first describe the algorithm at a high level.

In our grid of m x m tasks, each task (tij, where 0 i, j < m), initially contains blocks Cij, Aij, and Bij. In the first step of the algorithm the tasks on the diagonal (tij where i = j) send their block Aii to all the other tasks in row i. After the transmission of Aii, all tasks calculate Aii Bij and add the result into Cij. In the next step, the column blocks of B are rotated. That is, tij sends its block of B to t(i-1)j. (Task t0j sends its B block to t(m-1)j.) The tasks now return to the first step, Ai(i+1) is multicast to all other tasks in row i, and the algorithm continues. After m iterations, the C matrix contains A B, and the B matrix has been rotated back into place.

Let us now go over the matrix multiply as it is programmed in PVM. In PVM there is no restriction on which tasks may communicate with which other tasks. However, for this program we would like to think of the tasks as a two-dimensional conceptual torus. In order to enumerate the tasks, each task joins the group mmult. Group ids are used to map tasks to our torus. The first task to join a group is given the group id of zero. In the mmult program, the task with group id zero spawns the other tasks and sends the parameters for the matrix multiply to those tasks. The parameters are m and bklsize, the square root of the number of blocks and the size of a block, respectively. After all the tasks have been spawned and the parameters transmitted, pvm_barrier() is called to make sure all the tasks have joined the group. If the barrier is not performed, later calls to pvm_gettid() might fail because a task may not have yet joined the group.

After the barrier, the task ids for the other tasks are stored in the row in the array myrow. Specifically, the program calculates group ids for all the tasks in the row, and we ask PVM for the task id for the corresponding group id. Next the program allocates the blocks for the matrices using malloc(). (In an actual application program we would expect that the matrices would already be allocated.) Then the program calculates the row and column of the block of C it will be computing; this calculation is based on the value of the group id. The group ids range from 0 to m - 1 inclusive. Thus, the integer division of (mygid/m) will give the task's row and (mygid mod m) will give the column if we assume a row major mapping of group ids to tasks. Using a similar mapping, we calculate the group id of the task directly above and below in the torus and store their task ids in up and down, respectively.

Next the blocks are initialized by calling InitBlock(). This function simply initializes A to random values, B to the identity matrix, and C to zeros. This will allow us to verify the computation at the end of the program by checking that A = C.

Finally we enter the main loop to calculate the matrix multiply. First the tasks on the diagonal multicast their block of A to the other tasks in their row. Note that the array myrow actually contains the task id of the task doing the multicast. Recall that pvm_mcast() will send to all the tasks in the tasks array except the calling task. This approach works well in the case of mmult because we don't want to have to needlessly handle the extra message coming into the multicasting task with an extra pvm_recv(). Both the multicasting task and the tasks receiving the block calculate the AB for the diagonal block and the block of B residing in the task.

After the subblocks have been multiplied and added into the C block, we now shift the B blocks vertically. Specifically, the block of B is packed into a message and sent to the up task id; then a new B block is received from the down task id.

Note that we use different message tags for sending the A blocks and the B blocks as well as for different iterations of the loop. We also fully specify the task ids when doing a pvm_recv(). It's tempting to use wild cards for the fields of pvm_recv(); however, such use can be dangerous. For instance, had we incorrectly calculated the value for up and used a wild card for the pvm_recv() instead of down, we would be sending messages to the wrong tasks without knowing it. In this example we fully specify messages, thereby reducing the possibility of receiving a message from the wrong task or the wrong phase of the algorithm.

Once the computation is complete, we check to see that A = C, just to verify that the matrix multiply correctly calculated the values of C. This step would not be done in a matrix-multiply library routine, for example.

You do not have to call pvm_lvgroup() because PVM will automatically detect that the task has exited and will remove it from the group. It is good form, however, to leave the group before calling pvm_exit(). The reset command from the PVM console will reset all the PVM groups. The pvm_gstat command will print the status of any groups that currently exist.

 /*     Matrix Multiply */ /* defines and prototypes for the PVM library */ #include <pvm3.h> #include <stdio.h> /* Maximum number of children this program will spawn */ #define MAXNTIDS    100 #define MAXROW      10 /* Message tags */ #define ATAG        2 #define BTAG        3 #define DIMTAG      5 void InitBlock(float *a, float *b, float *c, int blk, int row, int col) {     int len, ind;     int i,j;     srand(pvm_mytid());     len = blk*blk;     for (ind = 0; ind < len; ind++)         { a[ind] = (float)(rand()%1000)/100.0; c[ind] = 0.0; }     for (i = 0; i < blk; i++) {         for (j = 0; j < blk; j++) {             if (row == col)                 b[j*blk+i] = (i==j)? 1.0 : 0.0;             else                 b[j*blk+i] = 0.0;             }         } } void BlockMult(float* c, float* a, float* b, int blk) {     int i,j,k;     for (i = 0; i < blk; i++)         for (j = 0; j < blk; j ++)             for (k = 0; k < blk; k++)                 c[i*blk+j] += (a[i*blk+k] * b[k*blk+j]); } int main(int argc, char* argv[]) {     /* number of tasks to spawn, use 3 as the default */     int ntask = 2;     /* return code from pvm calls */     int info;     /* my task and group id */     int mytid, mygid;     /* children task id array */     int child[MAXNTIDS-1];     int i, m, blksize;     /* array of the tids in my row */     int myrow[MAXROW];     float *a, *b, *c, *atmp;     int row, col, up, down;     /* find out my task id number */     mytid = pvm_mytid();     pvm_setopt(PvmRoute, PvmRouteDirect);     /* check for error */     if (mytid < 0) {         /* print out the error */         pvm_perror(argv[0]);         /* exit the program */         return -1;         }     /* join the mmult group */     mygid = pvm_joingroup("mmult");     if (mygid < 0) {         pvm_perror(argv[0]); pvm_exit(); return -1;         }     /* if my group id is 0 then I must spawn the other tasks */ if (mygid == 0) {     /* find out how many tasks to spawn */     if (argc == 3) {         m = atoi(argv[1]);         blksize = atoi(argv[2]);         }     if (argc < 3) {         fprintf(stderr, "usage: mmult m blk\n");         pvm_lvgroup("mmult"); pvm_exit(); return -1;         }     /* make sure ntask is legal */     ntask = m*m;     if ((ntask < 1) || (ntask >= MAXNTIDS)) {         fprintf(stderr, "ntask = %d not valid.\n", ntask);         pvm_lvgroup("mmult"); pvm_exit(); return -1;         }     /* if there is more than one task spawn them*/     if (ntask > 1) {         /* spawn the child tasks */         info = pvm_spawn("mmult", (char**)0, PvmTaskDefault, (char*)0,         ntask-1, child);         /* make sure spawn succeeded */         if (info != ntask-1) {             pvm_lvgroup("mmult"); pvm_exit(); return -1;             }         /* send the matrix dimension */         pvm_initsend(PvmDataDefault);         pvm_pkint(&m, 1, 1);         pvm_pkint(&blksize, 1, 1);         pvm_mcast(child, ntask-1, DIMTAG);         }     } else {     /* recv the matrix dimension */     pvm_recv(pvm_gettid("mmult", 0), DIMTAG);     pvm_upkint(&m, 1, 1);     pvm_upkint(&blksize, 1, 1);     ntask = m*m;     } /* make sure all tasks have joined the group */ info = pvm_barrier("mmult",ntask); if (info < 0) pvm_perror(argv[0]); /* find the tids in my row */ for (i = 0; i < m; i++)     myrow[i] = pvm_gettid("mmult", (mygid/m)*m + i); /* allocate the memory for the local blocks */ a = (float*)malloc(sizeof(float)*blksize*blksize); b = (float*)malloc(sizeof(float)*blksize*blksize); c = (float*)malloc(sizeof(float)*blksize*blksize); atmp = (float*)malloc(sizeof(float)*blksize*blksize); /* check for valid pointers */ if (!(a && b && c && atmp)) {     fprintf(stderr, "%s: out of memory!\n", argv[0]);     free(a); free(b); free(c); free(atmp);     pvm_lvgroup("mmult"); pvm_exit(); return -1;     } /* find my block's row and column */ row = mygid/m; col = mygid % m; /* calculate the neighbor's above and below */ up = pvm_gettid("mmult", ((row)?(row-1):(m-1))*m+col); down = pvm_gettid("mmult", ((row == (m-1))?col:(row+1)*m+col)); /* initialize the blocks */ InitBlock(a, b, c, blksize, row, col); /* do the matrix multiply */ for (i = 0; i < m; i++) {     /* mcast the block of matrix A */     if (col == (row + i)%m) {         pvm_initsend(PvmDataDefault);         pvm_pkfloat(a, blksize*blksize, 1);         pvm_mcast(myrow, m, (i+1)*ATAG);         BlockMult(c,a,b,blksize);         }     else {         pvm_recv(pvm_gettid("mmult", row*m + (row +i)%m), (i+1)*ATAG);         pvm_upkfloat(atmp, blksize*blksize, 1);         BlockMult(c,atmp,b,blksize);         }         /* rotate the columns of B */         pvm_initsend(PvmDataDefault);         pvm_pkfloat(b, blksize*blksize, 1);         pvm_send(up, (i+1)*BTAG);         pvm_recv(down, (i+1)*BTAG);         pvm_upkfloat(b, blksize*blksize, 1);         }     /* check it */     for (i = 0 ; i < blksize*blksize; i++)         if (a[i] != c[i])             printf("Error a[%d] (%g) != c[%d] (%g) \n", i, a[i],i,c[i]);     printf("Done.\n");     free(a); free(b); free(c); free(atmp);     pvm_lvgroup("mmult");     pvm_exit();     return 0; } 

10.2.4 One-Dimensional Heat Equation

Here we present a PVM program that calculates heat diffusion through a substrate, in this case a wire. Consider the one-dimensional heat equation on a thin wire,


and a discretization of the form


giving the explicit formula


The initial and boundary conditions are

A(t, 0) = 0, A(t, 1) = 0 for all t

A(0, x) = sin(πx) for 0 x 1.

The pseudocode for this computation is as follows:

     for i = 1:tsteps-1;         t = t+dt;         a(i+1,1)=0;         a(i+1,n+2)=0;         for j = 2:n+1;             a(i+1,j)=a(i,j) + mu*(a(i,j+1)-2*a(i,j)+a(i,j-1));         end;     end 

For this example we use a master/worker programming model. The master, 'heat.c', spawns five copies of the program heatslv. The workers compute the heat diffusion for subsections of the wire in parallel. At each time step the workers exchange boundary information, in this case the temperature of the wire at the boundaries between processors.

Let's take a closer look at the code. In 'heat.c' the array solution will hold the solution for the heat diffusion equation at each time step. First the heatslv tasks are spawned. Next, the initial dataset is computed. Notice that the ends of the wires are given initial temperature values of zero.

The main part of the program is then executed four times, each with a different value for Δt. A timer is used to compute the elapsed time of each compute phase. The initial datasets are sent to the heatslv tasks. The left and right neighbor task ids are sent along with the initial dataset. The heatslv tasks use these to communicate boundary information. Alternatively, we could have used the PVM group calls to map tasks to segments of the wire. By using that approach we would have avoided explicitly communicating the task ids to the slave processes.

After sending the initial data, the master process waits for results. When the results arrive, they are integrated into the solution matrix, the elapsed time is calculated, and the solution is written to the output file.

Once the data for all four phases have been computed and stored, the master program prints out the elapsed times and kills the slave processes.

 /* heat.c     Use PVM to solve a simple heat diffusion differential equation,     using 1 master program and 5 slaves.     The master program sets up the data, communicates it to the slaves     and waits for the results to be sent from the slaves.     Produces xgraph ready files of the results. */ #include "pvm3.h" #include <stdio.h> #include <math.h> #include <time.h> #define SLAVENAME "heatslv" #define NPROC 5 #define TIMESTEP 100 #define PLOTINC 10 #define SIZE 1000 int num_data = SIZE/NPROC; main() {   int mytid, task_ids[NPROC], i, j;     int left, right, k, 1;     int step = TIMESTEP;     int info;     double init[SIZE], solution[TIMESTEP][SIZE];     double result[TIMESTEP*SIZE/NPROC], deltax2;     FILE *filenum;     char *filename [4] [7] ;     double deltat[4];     time_t t0;     int etime [4] ;     filename[0][0] = "graph1";     filename[1][0] = "graph2";     filename[2][0] = "graph3";     filename[3][0] = "graph4";     deltat[0] = 5.0e-1;     deltat[1] = 5.0e-3;     deltat[2] = 5.0e-6;     deltat[3] = 5.0e-9; /* enroll in pvm */     mytid = pvm_mytid(); /* spawn the slave tasks */     info = pvm_spawn(SLAVENAME,(char **)0,PvmTaskDefault,"",         NPROC,task_ids); /* create the initial data set */     for (i = 0; i < SIZE; i++)         init[i] = sin(M_PI * ( (double)i / (double)(SIZE-1) ));     init[0] = 0.0;     init[SIZE-1] = 0.0; /* run the problem 4 times for different values of delta t */     for (1 = 0; 1 < 4; 1++) {         deltax2 = (deltat[1]/pow(1.0/(double)SIZE,2.0));         /* start timing for this run */         time(&t0);         etime[1] = t0; /* send the initial data to the slaves. */ /* include neighbor info for exchanging boundary data */     for (i = 0; i < NPROC; i++) {         pvm_initsend(PvmDataDefault);         left = (i == 0) ? 0 : task_ids[i-1];         pvm_pkint(&left, 1, 1);         right = (i == (NPROC-1)) ? 0 : task_ids[i+1];         pvm_pkint(&right, 1, 1);         pvm_pkint(&step, 1, 1);         pvm_pkdouble(&deltax2, 1, 1);         pvm_pkint(&num_data, 1, 1);         pvm_pkdouble(&init[num_data*i], num_data, 1);         pvm_send(task_ids[i], 4);         } /* wait for the results */         for (i = 0; i < NPROC; i++) {             pvm_recv(task_ids[i], 7);             pvm_upkdouble(&result[0], num_data*TIMESTEP, 1); /* update the solution */         for (j = 0; j < TIMESTEP; j++)             for (k = 0; k < num_data; k++)                 solution[j][num_data*i+k] = result[wh(j,k)];         } /* stop timing */         time(&t0);         etime[1] = t0 - etime[l]; /* produce the output */         filenum = fopen(filename[1][0], "w");         fprintf(filenum,"TitleText: Wire Heat over Delta Time: %e\n",             deltat[1]);         fprintf(filenum,"XUnitText: Distance\nYUnitText: Heat\n");         for (i = 0; i < TIMESTEP; i = i + PLOTINC) {             fprintf(filenum,"\"Time index: %d\n",i);             for (j = 0; j < SIZE; j++)                 fprintf(filenum,"%d %e\n",j, solution[i][j]);             fprintf(filenum,"\n");             }         fclose (filenum);     } /* print the timing information */     printf("Problem size: %d\n",SIZE);     for (i = 0; i < 4; i++)         printf("Time for run %d: %d sec\n",i,etime[i]); /* kill the slave processes */     for (i = 0; i < NPROC; i++) pvm_kill(task_ids[i]);     pvm_exit(); } int wh(x, y) int x, y; {     return(x*num_data+y); } 

The heatslv programs do the actual computation of the heat diffusion through the wire. The worker program consists of an infinite loop that receives an initial dataset, iteratively computes a solution based on this dataset (exchanging boundary information with neighbors on each iteration), and sends the resulting partial solution back to the master process. As an alternative to using an infinite loop in the worker tasks, we could send a special message to the worker ordering it to exit. Instead, we simply use the infinite loop in the worker tasks and kill them off from the master program. A third option would be to have the workers execute only once, exiting after processing a single dataset from the master. This would require placing the master's spawn call inside the main for loop of 'heat.c'. While this option would work, it would needlessly add overhead to the overall computation.

For each time step and before each compute phase, the boundary values of the temperature matrix are exchanged. The left-hand boundary elements are first sent to the left neighbor task and received from the right neighbor task. Symmetrically, the right-hand boundary elements are sent to the right neighbor and then received from the left neighbor. The task ids for the neighbors are checked to make sure no attempt is made to send or receive messages to nonexistent tasks.

 /* heatslv.c     The slaves receive the initial data from the host,     exchange boundary information with neighbors,     and calculate the heat change in the wire.     This is done for a number of iterations, sent by the master. */ #include "pvm3.h" #include <stdio.h> int num_data; main() {     int mytid, left, right, i, j, master;     int timestep;     double *init, *A;     double leftdata, rightdata, delta, leftside, rightside; /* enroll in pvm */     mytid = pvm_mytid();     master = pvm_parent(); /* receive my data from the master program */   while(1) {     pvm_recv(master, 4);     pvm_upkint(&left, 1, 1);     pvm_upkint(&right, 1, 1);     pvm_upkint(&timestep, 1, 1);     pvm_upkdouble(&delta, 1, 1);     pvm_upkint(&num_data, 1, 1);     init = (double *) malloc(num_data*sizeof(double));     pvm_upkdouble(init, num_data, 1); /* copy the initial data into my working array */     A = (double *) malloc(num_data * timestep * sizeof(double));     for (i = 0; i < num_data; i++) A[i] = init[i]; /* perform the calculation */   for (i = 0; i < timestep-1; i++) {     /* trade boundary info with my neighbors */     /* send left, receive right */     if (left != 0) {         pvm_initsend(PvmDataDefault);         pvm_pkdouble(&A[wh(i,0)],1,1);         pvm_send(left, 5);         }     if (right != 0) {         pvm_recv(right, 5);         pvm_upkdouble(&rightdata, 1, 1);     /* send right, receive left */         pvm_initsend(PvmDataDefault);         pvm_pkdouble(&A[wh(i,num_data-1)],1,1);         pvm_send(right, 6);         }     if (left != 0) {         pvm_recv(left, 6);         pvm_upkdouble(&leftdata,1,1);         } /* do the calculations for this iteration */     for (j = 0; j < num_data; j++) {         leftside = (j == 0) ? leftdata : A[wh(i,j-1)];         rightside = (j == (num_data-1)) ? rightdata : A[wh(i,j+1)];         if ((j==0)&&(left==0))             A[wh(i+1,j)] = 0.0;         else if ((j==(num_data-1))&&(right==0))             A[wh(i+1,j)] = 0.0;         else             A[wh(i+1,j)]=                 A[wh(i,j)]+delta*(rightside-2*A[wh(i,j)]+leftside);         }   } /* send the results back to the master program */     pvm_initsend(PvmDataDefault);     pvm_pkdouble(&A[0],num_data*timestep,1);     pvm_send(master,7);   } /* just for good measure */   pvm_exit(); } int wh(x, y) int x, y; {     return(x*num_data+y); } 

In this section we have given a variety of example programs written in both Fortran and C. These examples demonstrate various ways of writing PVM programs. Some divide the application into two separate programs, while others use a single program with conditionals to handle spawning and computing phases. These examples show different styles of communication, both among worker tasks and between worker and master tasks. In some cases messages are used for synchronization, and in others the master processes simply kill off the workers when they are no longer needed.

Beowulf Cluster Computing With Linux 2003
Beowulf Cluster Computing With Linux 2003
Year: 2005
Pages: 198

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net