Section 6.5. Debugging the Generated Hardware

Hardware that has been generated from C code should (in a perfect world) behave exactly as it does during software simulation, such as when running under the control of a C debugger. In practice, however, subtle errors in the C code (such as relying on variables being initialized or making incorrect assumptions about process synchronization) can produce an application that operates perfectly during software simulation but fails in the actual hardware. To help guard against this, hardware debugging techniques and hardware simulators can be an important part of your design efforts.
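As a small illustration of the first kind of error, the sketch below initializes its accumulator explicitly. The function name sum_samples and its parameters are hypothetical and are not taken from the FIR example. On a desktop machine an uninitialized local may happen to start at zero, which can hide the mistake during software simulation; in generated hardware the corresponding register may hold an arbitrary or stale value, so that luck should not be assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the book's FIR source): sums n samples. */
int32_t sum_samples(const int32_t *samples, size_t n)
{
    /* Explicit initialization. Writing "int32_t accum;" here and then
       accumulating into it would read an indeterminate value. */
    int32_t accum = 0;
    size_t i;

    assert(samples != NULL || n == 0);

    for (i = 0; i < n; i++) {
        accum += samples[i];
    }
    return accum;
}
```

Calling sum_samples on an array holding 1, 2, 3, and 4 returns 10 whether the code runs once or many times, which is the kind of repeatability you want before moving to hardware simulation.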


Although debugging automatically generated HDL may seem daunting (particularly if you are a software engineer), it is actually not as bad as you might think. You will find that the generated outputs will be quite dense with intermediate, low-level signals (perhaps hundreds or even thousands of them, most of which will be optimized away in the hardware synthesis process). Fortunately, you will also find that the variables used in your C file are still, for the most part, intact and have their names preserved so they can be monitored during debugging.

To help in analyzing control flow and cycle-by-cycle synchronization issues, it's useful to know that the hardware generator implements each process in your application as a separate state machine, with symbolic state names that can be referenced back to specific blocks and stages of C code in the original application. Parallel operations are found within the state machine and/or within concurrent statements found elsewhere in the generated HDL module.

The following excerpts from the generated FIR filter hardware description help illustrate this point. First, notice that in the declarations section for the lower-level HDL file we have the following declaration:

 type stateType is
     (init,b0s0,b0s1,b0s2,b0s3,b0s4,b0s5,b0s6,b0s7,b0s8,b0s9,b0s10,
      b0s11,b0s12,b0s13,b0s14,b0s15,b0s16,b0s17,b0s18,b0s19,b0s20,
      b0s21,b0s22,b0s23,b0s24,b0s25,b0s26,b0s27,b0s28,b0s29,b0s30,
      b0s31,b0s32,b0s33,b0s34,b0s35,b0s36,b0s37,b0s38,b0s39,b0s40,
      b0s41,b0s42,b0s43,b0s44,b0s45,b0s46,b0s47,b0s48,b0s49,b0s50,
      b0s51,b0s52,b0s53,b0s54,b0s55,b0s56,b0s57,b0s58,b0s59,b0s60,
      b0s61,b0s62,b0s63,b0s64,b0s65,b0s66,b0s67,b0s68,b0s69,b0s70,
      b0s71,b0s72,b0s73,b0s74,b0s75,b0s76,b0s77,b0s78,b0s79,b0s80,
      b0s81,b0s82,b0s83,b0s84,b0s85,b0s86,b0s87,b0s88,b0s89,b0s90,
      b0s91,b0s92,b0s93,b0s94,b0s95,b0s96,b0s97,b0s98,b0s99,b0s100,
      b0s101,b0s102,b1s0,b1s1,b2s0,finished);
 signal thisState, nextState : stateType;

The generated type stateType symbolically represents all the blocks and stages of the generated process. In the case of the FIR filter there are quite a few of these states in the machine (107 of them, to be exact) that represent two major blocks of functionality in the expanded code. One of these states (the first one, b0s0) is shown here, along with the clock logic that drives the machine:

 if (clk'event and clk='1') then
     case thisState is
         when b0s0 =>
             if (stateEn = '1') then
                 r_tap <= ni4126_tap;
             end if;

Comments found elsewhere in the generated HDL help identify which specific block and cycle a given operation is associated with. For example, the following concurrent multiply and accumulate operations are associated with stage one of block one, as indicated by the comment line preceding them:

 -- b1s1
 ni4130_nSample <= r_filter_in;
 ni4131_firbuffer_50 <= ni4130_nSample;
 ni4132_accum <= X"00000000";
 ni4133_tap <= X"00000000";
 ni4134_accum <= add(ni4132_accum, mul(r_firbuffer_0, r_coef_0));
 ni4135_tap <= X"00000001";
 ni4136_accum <= add(ni4134_accum, mul(r_firbuffer_1, r_coef_1));
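For orientation, the unrolled assignments above are roughly what a C loop of the following shape would produce once the loop has been unrolled and the arrays scalarized. This is only a sketch and not the book's actual FIR source: the function name fir_step, the TAPS value of 51 (inferred solely from the firbuffer_50 name), and the 32-bit integer types are all assumptions, and the shifting of the sample buffer is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAPS 51  /* assumption: inferred from the firbuffer_50 signal name */

/* Hypothetical sketch of one filter step: store the new sample in the
   last buffer slot, then multiply and accumulate across all taps. */
int32_t fir_step(int32_t firbuffer[TAPS], const int32_t coef[TAPS],
                 int32_t nSample)
{
    int32_t accum;
    int tap;

    assert(firbuffer != NULL && coef != NULL);

    firbuffer[TAPS - 1] = nSample;  /* compare: ni4131_firbuffer_50 <= ni4130_nSample */
    accum = 0;                      /* compare: ni4132_accum <= X"00000000" */
    for (tap = 0; tap < TAPS; tap++) {
        /* Each iteration corresponds to one add(..., mul(...)) assignment
           after unrolling. */
        accum += firbuffer[tap] * coef[tap];
    }
    return accum;
}
```

Reading the HDL with this shape in mind, each intermediate signal such as ni4134_accum or ni4136_accum can be thought of as the value of accum after one more iteration of the loop.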

Of the 107 states in the machine, those blocks identified by the b0 state name prefix represent the initialization section of the FIR filter, which consisted of two unrolled loops in the original C code. There are many stages in this block, but because this is only initialization code, the overhead of all those cycles is of little importance.
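Because the state names follow a regular b<block>s<stage> pattern, they can be split mechanically when you are post-processing simulator output or your own debug logs. The helper below is a minimal sketch; parse_state_name is a hypothetical name and is not part of the Impulse tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: splits a state name such as "b0s17" into its
   block and stage numbers. Returns 1 on success and 0 for names that do
   not fit the pattern (for example "init" or "finished"). The outputs
   are written only on success. */
int parse_state_name(const char *name, int *block, int *stage)
{
    int b, s;
    char trailing;

    assert(name != NULL && block != NULL && stage != NULL);

    /* The final %c detects stray characters after the stage number, so a
       well-formed name converts exactly two fields. */
    if (sscanf(name, "b%ds%d%c", &b, &s, &trailing) != 2 || b < 0 || s < 0) {
        return 0;
    }
    *block = b;
    *stage = s;
    return 1;
}
```

With this helper, the name b0s102 yields block 0 and stage 102, while init and finished are rejected, so a log filter could, for instance, skip every state in block 0 and report only the states of the processing loop.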

The key routine of the FIR filter, the inner code loop that actually processes the data stream, is represented by the states prefixed by b1 and b2, of which there are only two (b1s0 and b2s0) when pipelining has not been enabled and only one (b1s0) when pipelining has been enabled through the use of the PIPELINE pragma. You can use these symbolic states as an aid to hardware debugging with an HDL simulator, as shown in Figure 6-8.

Figure 6-8. Debugging the hardware state machine using a VHDL simulator.

Figures 6-9 and 6-10 show another hardware debugging session, again using the FIR application as an example. The expanded source code of the original example, in which the specific blocks and stages of the code have been identified both graphically and in an expanded source listing, can be related to specific lines of the generated HDL without too much difficulty. Notice in the example shown that the variable firbuffer_50, which corresponds to one element of the scalarized firbuffer array, is easily identified in the HDL code during source-level hardware simulation and debugging. Comments embedded in the HDL code also help identify the specific blocks and stages of the original C code that correspond to the HDL statements being executed.

Figure 6-9. Viewing the expanded C code in the Impulse C Stage Master optimizer.

Figure 6-10. Debugging the same sequence of code in a hardware VHDL simulator.

The goal of performing hardware simulations at this level (after compilation from C) is to verify correct cycle-by-cycle behavior. An example of this kind of debugging is stepping through the design one clock cycle at a time (or through some defined number of cycles) to zero in on a specific problem area, as defined by both space (the area of code) and time (the clock cycle in which a problem manifests itself). The Impulse design flow offers three fundamental ways to perform cycle-accurate hardware simulations of this type:

  • The generated HDL files may be combined with an HDL test bench (as was done earlier in this chapter) for use in an HDL simulator, as you've just seen. This method generally requires at least a basic understanding of hardware design tools and requires access to an HDL simulator. The advantage of this method is that you have access to the many debugging and visualization features provided in hardware simulators.

  • The generated HDL files may be translated (through the use of a utility provided by Impulse) to a cycle-accurate C-language model. This model may then be linked into the original application (replacing the original "untimed" C code) so that cycle-accurate behaviors may be observed in the context of a C debugger. An advantage of this method is that your original C-language test bench can be used to drive the simulation.

  • The synthesized design may be simulated at the level of the FPGA netlist, including incorporating post-route timing details through the use of timing-annotated HDL. This is the least desirable method if the goal is to verify function because the simulation can be extremely time-consuming and the generated netlist may be very difficult to relate back to the original C code. This is the only effective way, however, of debugging problems related to clock skew or to actual gate delays that are impacting your maximum operating frequency. (Such timing-related debugging is beyond the scope of this book, but many excellent books, application notes, and tools are available to help.)

Which of these methods you choose will depend on the nature of the problem you are attempting to debug, on your expertise as a hardware designer, and on your access to HDL simulation tools.
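To make the idea of stepping by clock cycle concrete, the following sketch models a tiny state machine in C and advances it one rising clock edge at a time. It is not the output of the Impulse translation utility; the type and function names, the four states, and the transitions between them are all hypothetical and chosen only to echo the thisState, nextState, and stateEn signals seen in the generated HDL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states, loosely named after the generated HDL states. */
typedef enum { ST_INIT, ST_B1S0, ST_B2S0, ST_FINISHED } state_t;

typedef struct {
    state_t this_state;   /* plays the role of thisState */
    unsigned long cycle;  /* clock edges seen since reset */
} fsm_t;

void fsm_reset(fsm_t *m)
{
    assert(m != NULL);
    m->this_state = ST_INIT;
    m->cycle = 0;
}

/* Models one rising clock edge. The next state is computed first and is
   latched only when state_en is nonzero, so a stalled machine holds its
   state while the cycle count still advances. */
void fsm_clock(fsm_t *m, int state_en, int more_data)
{
    state_t next;

    assert(m != NULL);
    next = m->this_state;
    switch (m->this_state) {
    case ST_INIT:     next = ST_B1S0; break;
    case ST_B1S0:     next = more_data ? ST_B2S0 : ST_FINISHED; break;
    case ST_B2S0:     next = ST_B1S0; break;
    case ST_FINISHED: break;
    }
    if (state_en) {
        m->this_state = next;
    }
    m->cycle++;
}

/* Clocks the machine until it reaches target or max_cycles edges have
   been applied. Returns the number of edges applied, or -1 if the target
   state was not reached. */
long fsm_run_until(fsm_t *m, state_t target, long max_cycles, int more_data)
{
    long n;

    assert(m != NULL);
    for (n = 0; n < max_cycles; n++) {
        if (m->this_state == target) {
            return n;
        }
        fsm_clock(m, 1, more_data);
    }
    return (m->this_state == target) ? n : -1;
}
```

Running to a named state and then clocking one edge at a time mirrors the way you would narrow a problem down by both code area and clock cycle in an HDL simulator, but here the stepping happens under an ordinary C debugger.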

    Practical FPGA Programming in C
    ISBN: 0131543180
    Year: 2005
    Pages: 208
