Section 3.3. Programming for Parallelism | Practical FPGA Programming in C

3.3. Programming for Parallelism

As we implied in the preface to this book, there is a fundamental problem with attempting to program general-purpose hardware (or "non-von Neumann machines," if you will) using the C language: how to express parallelism. Programming a parallel system requires two things: support for concurrency in the language being used, and a programmer who understands how to manage multiple quasi-independent computational elements. The standard C language contains no such concurrency features. VHDL and Verilog, on the other hand, are intended for describing highly parallel systems of connected hardware components and are designed for exactly this purpose, albeit at a rather low level of abstraction.

The closest thing to a truly parallel programming model in the context of C-language programming is support for threads, which is not a standard feature of C but is popular and readily available in the form of add-on, operating system-specific libraries. Another, less common option for this type of programming is the Message Passing Interface (MPI), a standardized message-passing library interface with C bindings. MPI is intended for larger supercomputing applications implemented on clusters of standard desktop computers and on other heavy-duty platforms.
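To make the thread-library approach concrete, the following is a minimal sketch that uses POSIX threads (one such add-on library) to sum an array in two halves. The names slice_t, sum_slice, and parallel_sum are hypothetical and chosen only for this illustration; note that all of the concurrency comes from library calls rather than from the C language itself.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work descriptor: one contiguous slice of an array. */
typedef struct {
    const int *data;
    size_t     count;
    long       sum;    /* result written by the worker thread */
} slice_t;

/* Thread entry point: sums the elements of one slice. */
static void *sum_slice(void *arg)
{
    slice_t *s = arg;
    s->sum = 0;
    for (size_t i = 0; i < s->count; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Sums an array using two threads, one per half.
   Returns 0 on success, -1 if a thread could not be created. */
int parallel_sum(const int *data, size_t n, long *result)
{
    slice_t lo = { data, n / 2, 0 };
    slice_t hi = { data + n / 2, n - n / 2, 0 };
    pthread_t t_lo, t_hi;

    if (pthread_create(&t_lo, NULL, sum_slice, &lo) != 0)
        return -1;
    if (pthread_create(&t_hi, NULL, sum_slice, &hi) != 0) {
        pthread_join(t_lo, NULL);   /* do not leak the first thread */
        return -1;
    }

    /* Wait for both workers before combining their partial sums. */
    pthread_join(t_lo, NULL);
    pthread_join(t_hi, NULL);
    *result = lo.sum + hi.sum;
    return 0;
}
```

Calling parallel_sum on the eight values 1 through 8 should store 36 in the result. The programmer, not the compiler, decides how the work is divided and when the threads synchronize, which is exactly the burden described above.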

So if the C language (and, by extension, any other programming language developed for von Neumann machines) is not appropriate for programming general-purpose hardware, and if the languages specifically designed for such hardware (for example, HDLs) are too low-level, what is the answer? As it turns out, a compromise solution is best. On the hardware side, we need to assemble all those undifferentiated hardware elements (the programmable gates and flip-flops that make up an FPGA) into some kind of abstract structure appropriate for higher-level programming. Fortunately, compiler tools can create this structure automatically, using knowledge of the application and of the available low-level FPGA hardware. On the software side, we need to extend the language of choice (which for the purposes of this discussion will be C) to support programming the abstract parallel processing machine that we have just assembled, as illustrated in Figure 3-1.

Figure 3-1. Multinode parallel machine model and multiprocess programming model.

To summarize: to make sense of programming FPGA-based hardware (as opposed to designing it from the ground up), we need to create an abstract machine model and choose a software programming model appropriate for that abstract machine.

Parallel programming researchers have generally found that creating an abstract, multinode machine (sometimes called a multicomputer) consisting of multiple, semi-autonomous processing nodes (often called processing elements [PEs]) is a good way to express a platform for coarse-grained parallel processing. By targeting this model of an abstract machine (whether or not the underlying hardware actually implements such a system), it is relatively easy to construct a usable programming model. The communicating sequential processes (CSP) programming model, in which independent processes interact only by passing messages over channels, has been well studied and can be used to apply formal methods for FPGA design, as well as to build actual applications.
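As a rough illustration of the CSP style on a conventional machine, the sketch below connects three processes (a producer, a squaring stage, and a consumer) with bounded streams built on POSIX threads. Everything named here (stream_t, stream_write, stream_read, stream_close, run_pipeline, and the depth of 4) is an assumption made for this example and is not the API of any particular FPGA tool. The processes share no data; they interact only through the streams.

```c
#include <pthread.h>
#include <stdbool.h>

#define STREAM_DEPTH 4               /* assumed buffer depth */

/* Hypothetical bounded FIFO stream connecting two processes. */
typedef struct {
    int  buf[STREAM_DEPTH];
    int  head, tail, count;
    bool closed;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
} stream_t;

void stream_init(stream_t *s)
{
    s->head = s->tail = s->count = 0;
    s->closed = false;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->not_empty, NULL);
    pthread_cond_init(&s->not_full, NULL);
}

/* Blocks while the stream is full. */
void stream_write(stream_t *s, int value)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == STREAM_DEPTH)
        pthread_cond_wait(&s->not_full, &s->lock);
    s->buf[s->tail] = value;
    s->tail = (s->tail + 1) % STREAM_DEPTH;
    s->count++;
    pthread_cond_signal(&s->not_empty);
    pthread_mutex_unlock(&s->lock);
}

/* Signals end of data to the reader. */
void stream_close(stream_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->closed = true;
    pthread_cond_broadcast(&s->not_empty);
    pthread_mutex_unlock(&s->lock);
}

/* Blocks while the stream is empty; returns false once it is
   closed and fully drained. */
bool stream_read(stream_t *s, int *value)
{
    bool ok = false;
    pthread_mutex_lock(&s->lock);
    while (s->count == 0 && !s->closed)
        pthread_cond_wait(&s->not_empty, &s->lock);
    if (s->count > 0) {
        *value = s->buf[s->head];
        s->head = (s->head + 1) % STREAM_DEPTH;
        s->count--;
        pthread_cond_signal(&s->not_full);
        ok = true;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}

/* Process 1: writes the integers 1..10, then closes its output. */
static void *producer(void *arg)
{
    stream_t *out = arg;
    for (int i = 1; i <= 10; i++)
        stream_write(out, i);
    stream_close(out);
    return NULL;
}

typedef struct { stream_t *in, *out; } ports_t;

/* Process 2: squares each input value and forwards it. */
static void *squarer(void *arg)
{
    ports_t *p = arg;
    int v;
    while (stream_read(p->in, &v))
        stream_write(p->out, v * v);
    stream_close(p->out);
    return NULL;
}

/* Builds the pipeline and runs process 3 (the consumer) in the
   calling thread. Error handling is omitted for brevity. */
long run_pipeline(void)
{
    stream_t a, b;
    ports_t ports = { &a, &b };
    pthread_t t_prod, t_sq;
    long total = 0;
    int v;

    stream_init(&a);
    stream_init(&b);
    pthread_create(&t_prod, NULL, producer, &a);
    pthread_create(&t_sq, NULL, squarer, &ports);
    while (stream_read(&b, &v))
        total += v;
    pthread_join(t_prod, NULL);
    pthread_join(t_sq, NULL);
    return total;
}
```

Here run_pipeline returns the sum of the squares of 1 through 10. Because each process only reads from and writes to its streams, the same structure could in principle be mapped onto separate processing elements, which is the appeal of the model for the abstract machine described above.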