Table 1 reports some software cost measures from our experiments, which we review to underline the qualities of the structured approach: fast code development, code portability, and performance portability.
Development Costs and Code Expressiveness
When restructuring existing sequential code into a parallel application, most of the work is devoted to making the code modular. The amount of sequential code needed to develop the building blocks of the structured parallel applications is reported in Table 1 as modularization, separately from the parallel code proper. Once modularization has been accomplished, several prototypes of different parallel structures are usually developed and evaluated. The skeleton description of a parallel structure is shorter, quicker to write, and far more readable than its MPI equivalent. As a test, starting from the same sequential modules, we developed an MPI version of C4.5. Although it exploits simpler solutions than the skeleton program (a plain Master-Slave scheme, with no pipelined communications), the MPI code is longer, more complex, and more error-prone than the structured version. Despite the additional programming effort, its speed-up results showed no significant gain over the skeleton version.
The speed-up and scale-up results of the applications we have shown are not all breakthroughs, but they are comparable to those of similar solutions implemented with unstructured parallel programming (e.g., MPI). The Partitioned Apriori is fully scalable with respect to database size, like count-distribution implementations. The C4.5 prototype behaves better than other pure task-parallel implementations, but it still suffers from the limits of this parallelization scheme, because support for external objects is not yet complete. We know of no other results on spatial clustering that use our approach of parallelizing cluster expansion.
Code and Performance Portability
Skeleton code is by definition portable over all the architectures that support the programming environment. Since the SkIE two-level parallel compiler uses standard compilation tools to build the final application, the intermediate code and the run-time support of the language can exploit all the advantages of parallel communication libraries. We can enhance the parallel support with architecture-specific facilities when the performance gain is worthwhile, but as long as the intermediate code complies with industry standards, the applications remain portable to a broad set of architectures. The SMP and T3E tests of the ARM prototype were performed this way, with no extra development time, by compiling the MPI and C++ code produced by SkIE on the target machine. These results also show a good degree of performance portability.