Compile your source code to assembly code.
The machine language that a compiler generates from your source code is the ultimate definition of how the compiler interpreted your program. Some compilers provide an option to emit assembly language output, which shows that binary machine language in human-readable form. Such output is sometimes useful when diagnosing a defect. Reading assembly code is tedious and error-prone, however, so it should be among the last tactics you use to understand how the compiler interpreted your program.
How did the language translator interpret the program I compiled?
What hidden assumptions does my program make that the compiler must make specific?
Many people don’t know the assembly language of the machine they work on.
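As a minimal sketch of the basic tactic, the function below hides an assumption that only the generated code makes visible. The name below_length and the file name below.c are hypothetical, chosen for illustration. With GCC or Clang, the -S option (for example, cc -S below.c) writes the assembly to below.s instead of producing an object file.

```c
/* Hypothetical example: reports whether `limit` is below `len`.
 * Because `len` is unsigned, the usual arithmetic conversions turn
 * `limit` into an unsigned value before the comparison, so a negative
 * `limit` becomes a very large number and the test fails. */
int below_length(int limit, unsigned int len)
{
    return limit < len;
}
```

In the assembly output for an x86 target, you would typically expect an unsigned condition code on the comparison rather than a signed one. That is the compiler making the hidden conversion specific: below_length(-1, 10u) returns 0, not 1.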
If the translator you’re using applies a preprocessor to your code, it’s sometimes necessary to look at the output of this preprocessing. This approach to getting translator feedback applies to C, C++, and PL/I, among other languages. To make this method effective, use selective preprocessing, which can reduce a huge quantity of code to a manageable size. There are several ways to apply selective preprocessing. You can expand only user macros, which eliminates many definitions introduced by system header files. You can also expand just a single header file or a single definition.
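The following sketch shows the kind of defect that preprocessor output exposes. The macro SQUARE and the function area_plus_border are hypothetical names. With GCC or Clang, the -E option (for example, cc -E area.c, where area.c is an assumed file name) stops after preprocessing and prints the expanded source. Because this file includes no system headers, the output stays small, which is the point of selective preprocessing.

```c
/* Hypothetical user macro with a precedence defect:
 * the parameter and the whole body lack parentheses. */
#define SQUARE(x) x * x

int area_plus_border(int side)
{
    /* After preprocessing, the return statement reads:
     *     return side + 1 * side + 1;
     * so the result is 2 * side + 1, not (side + 1) squared. */
    return SQUARE(side + 1);
}
```

For side equal to 3 the function returns 7 rather than the intended 16. The expanded text makes the cause obvious in a way the macro call does not.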
Generate the compiler’s intermediate representation after semantic analysis but before code generation. The intermediate representation is, in effect, another high-level language. This variation depends, of course, on having a compiler that will provide such output and on having some knowledge of how compilers work internally.
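To give a feel for what an intermediate representation looks like, the sketch below pairs a function with a hand-written approximation of a three-address form, in which every operation gets its own temporary. The lowered version is an illustration written for this text, not actual compiler output, and the names scale, scale_lowered, and t1 through t3 are hypothetical. Real dumps, such as those GCC writes with -fdump-tree-gimple or Clang writes with -S -emit-llvm, use their own notation and differ in detail.

```c
/* Original form, as the programmer wrote it. */
int scale(int a, int b, int c)
{
    return a * b + c * 2;
}

/* Hand-written approximation of a three-address form:
 * one operation per statement, each result in a named temporary. */
int scale_lowered(int a, int b, int c)
{
    int t1 = a * b;     /* first multiplication */
    int t2 = c * 2;     /* second multiplication */
    int t3 = t1 + t2;   /* addition of the two products */
    return t3;
}
```

Both functions compute the same value. The second form simply makes each intermediate result and its type visible, which is what you read an intermediate representation for.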
Some compilers translate into a lower-level language. These days, the most typical target for such compilers is C, since a C compiler is available for every processor that matters. The most likely candidates for this implementation technique are very-high-level languages (VHLLs) such as APL, Lisp, ML, Scheme, and SETL. When you use this approach, you can see storage allocation, implicit conversions and copies, runtime routine dispatching, and the like, all of which are particularly important in VHLLs.
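The sketch below suggests what the C generated for a one-line VHLL statement such as c = a + b on vectors might look like. Everything here is hypothetical: the descriptor type rt_vector and the routines rt_alloc and rt_add are invented for illustration and do not come from any real compiler or runtime. The point is only that the storage allocation and the element-by-element copy, which are invisible in the source statement, appear as ordinary C.

```c
#include <stdlib.h>

/* Hypothetical runtime descriptor for a vector value. */
typedef struct {
    size_t  length;
    double *data;
} rt_vector;

/* Hypothetical runtime routine: the allocation that the
 * source language performs implicitly for every new value. */
static rt_vector rt_alloc(size_t length)
{
    rt_vector v;
    v.length = length;
    v.data = malloc(length * sizeof *v.data);
    if (v.data == NULL)
        v.length = 0;   /* treat allocation failure as an empty vector */
    return v;
}

/* Hypothetical expansion of the source statement c = a + b.
 * The caller owns the result and must free its data. */
rt_vector rt_add(const rt_vector *a, const rt_vector *b)
{
    size_t n = a->length < b->length ? a->length : b->length;
    rt_vector c = rt_alloc(n);
    for (size_t i = 0; i < c.length; i++)
        c.data[i] = a->data[i] + b->data[i];
    return c;
}
```

Reading code like this shows that each vector expression allocates a fresh result, which is the kind of cost that is easy to overlook in the original one-line statement.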
Source-to-source translation is done by tools that make explicit all the defaults and shorthands that compilers may support. The obvious advantage to this approach is that you don’t need to learn another language. It is most helpful for languages that allow the programmer to use shortcuts, such as C, Fortran, and PL/I.
When you do source-to-source translation, you can normally see the following:
All attributes of every variable
Scope of all control-flow statements
Order of evaluation of expressions
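The three items above can be sketched in C. The first function relies on the language’s defaults and shorthands. The second is a hand-written guess at what a source-to-source translator might emit, with full type names, braces around every controlled statement, and parentheses and casts that spell out how each expression is grouped and converted. The names tally and tally_explicit are hypothetical, and no particular tool is assumed.

```c
/* Shorthand form: relies on integer promotions, operator
 * precedence, and brace-less control-flow statements. */
long tally(short count, short step)
{
    long sum = 0;
    short i;
    for (i = 0; i < count; i++)
        if (i & 1)
            sum += i * step + 1;
    return sum;
}

/* Explicit form (hand-written illustration): every type is spelled
 * out, every scope is braced, every conversion and grouping is shown. */
signed long int tally_explicit(signed short int count, signed short int step)
{
    signed long int sum = 0L;
    signed short int i;
    for (i = (signed short int)0; (int)i < (int)count;
         i = (signed short int)((int)i + 1)) {
        if (((int)i & 1) != 0) {
            sum = sum + (signed long int)((((int)i) * ((int)step)) + 1);
        }
    }
    return sum;
}
```

The two functions return the same results. The explicit form shows, for instance, that the multiplication is carried out in int and only then widened to long, which matters if the product can overflow.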
A special form of this approach is to use a source-to-source translator that performs analysis for the purpose of introducing parallelism.
Use this tactic when one of the following conditions is true:
If you’re using a compiler that generates assembly code, use the basic tactic.
If you’re using a compiler that executes a preprocessor first, use refined tactic 1.
If you have access to an open source or research compiler that will dump its intermediate representation for debugging purposes, use refined tactic 2.
If you’re using a very-high-level language, and you have access to a compiler for that language that generates a lower-level language, use the related tactics.
C++: This tactic is useful for exposing conversions and allocations.
Java: Java compilers generate platform-independent bytecode in .class files. Numerous tools, such as the JDK’s javap disassembler, will convert .class files into a readable form.
C: This technique works best with C, because there is such a close relationship between C statements and generated assembly code.
Fortran: Fortran 95 has an extensive runtime library, and many operations may be translated into calls to that library rather than into inline assembly code.