Code Navigation


There are a few basic ways to traverse functions and modules in source code, defined by where you start, what your goal is, and how you follow the code. Borrowing some language from other disciplines, code navigation can be described in terms of external flow sensitivity and tracing direction.

External Flow Sensitivity

When you review an entire application, you need to find ways to decompose it into more manageable parts. One of the easiest is to isolate the application code's external flow, which refers to how execution proceeds from function to function, but not inside a function. Sensitivity to external flow comes in two kinds: control-flow sensitive and data-flow sensitive. A brief example should help illustrate what this concept means:

    int bob(int c)
    {
        if (c == 4)
            fred(c);
        if (c == 72)
            jim();
        for (; c; c--)
            updateglobalstate();
    }


First, look at this example while ignoring external control flow and data flow. This means you simply read the code from top to bottom; you don't branch out to any function calls. You might note that the code uses some sentinel values to call fred() and jim() and seems to trust its input c. However, all your analysis should be isolated to this function.

Consider the same example from a control-flow sensitive viewpoint. In this case, you start reading this function and see the call to fred(). Because you haven't seen fred() before, you pull it up and start checking it out. Then you trace into the call to jim() and do the same for the call to updateglobalstate(). Of course, each of these functions might call other unfamiliar functions, so your control-flow sensitive approach requires evaluating each one. This approach could conceivably involve reading dozens of functions before you finish this simple code path.

Now say you follow only the data flow corresponding to the data in the c variable and ignore any control flow that doesn't affect this data directly. With this approach, you trace through to the call to fred() because it passes the c variable. However, this analysis simply ignores jim() because it doesn't affect the data.

Finally, if you were following both control flow and data flow, you'd have some idea of what the value of c might be coming into this function. You might have a certain value in mind or a possible set of values. For example, if you knew that c couldn't be 4, you wouldn't bother reading fred(). If you suspected that c could be 72, however, you'd need to trace into jim().
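The effect of knowing c's value can be sketched by stubbing out the callees and recording which ones a given input actually reaches. This is only an illustration under stated assumptions: the stub bodies, the trace buffer, the helper names note() and trace_bob(), the loop decrement, and the return value are all hypothetical and are not part of the original example.

```c
#include <string.h>

/* Hypothetical trace buffer: records which callees one run reaches. */
static char trace[64];
static int updates;                  /* times updateglobalstate() ran */

/* Append a callee name to the trace, comma separated. */
static void note(const char *name)
{
    if (trace[0] != '\0')
        strncat(trace, ",", sizeof(trace) - strlen(trace) - 1);
    strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
}

/* Stub callees; the real bodies are not shown in the text. */
static void fred(int c) { (void)c; note("fred"); }
static void jim(void) { note("jim"); }
static void updateglobalstate(void) { updates++; }

/* Same shape as the example. The c-- step and the return value are
 * assumptions. Like the original, it trusts c: only non-negative
 * values are handled sensibly. */
static int bob(int c)
{
    if (c == 4)
        fred(c);
    if (c == 72)
        jim();
    for (; c; c--)
        updateglobalstate();
    return 0;
}

/* Reset the recorders, run bob(c), and return the callee trace. */
static const char *trace_bob(int c)
{
    trace[0] = '\0';
    updates = 0;
    bob(c);
    return trace;
}
```

Calling trace_bob(4) yields "fred", trace_bob(72) yields "jim", and any other small positive value yields an empty trace. That empty trace is the reviewer's shortcut in concrete form: if c can be neither 4 nor 72, neither callee needs to be read for this path.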

If you haven't done much code review, you would probably guess that the most useful method combines control-flow sensitive and data-flow sensitive approaches because you'd be closely following what could happen as the program runs. It might surprise you to know that many experienced auditors rely primarily on techniques that aren't control-flow or data-flow sensitive. They do so to reduce the number of mental context switches they deal with and make the most effective use of their time. Generally, it's more effective to review functions in isolation and trace the code flow only when absolutely necessary.

Note

Flow analysis is an important concept in compiler design, and the distinctions drawn here between control flow and data flow have been simplified for the purposes of this discussion. Real compiler theory is far more complex and should only be attempted by card-carrying computer scientists.


Tracing Direction

When tracing code, you can follow one of two paths: forward-tracing, usually done to evaluate code functionality, and back-tracing, usually done to evaluate code reachability.

Forward-tracing can be done using any of the four types of flow sensitivity outlined previously. Forward traces that incorporate control flow and/or data flow start at entry points, trust boundaries, or the first line of key algorithms. Forward traces that ignore control flow and data flow start at the first line of a file or the top of a module implementation. All four techniques are essential core processes for analyzing code.

Back-tracing usually starts at a piece of code identified as a candidate point, which represents a potential vulnerability in the system. Examples include issuing dynamic SQL statements, using unbounded string copies, or accessing dynamically generated file paths. Candidate points are usually identified through some form of automated analysis or by going through the code with the grep utility to find known vulnerable patterns. After identifying candidate points, the reviewer traces from them back to the application's entry points.
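A grep-style search for candidate points can be sketched as a plain substring scan over source lines. The pattern list and the function names candidate_in_line() and count_candidates() below are illustrative assumptions, not a recommended rule set, and like grep with fixed strings the scan also flags matches inside comments and string literals.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative patterns only: a few unbounded string operations.
 * A real review would use a much longer, project-specific list. */
static const char *const risky_calls[] = {
    "strcpy(", "strcat(", "sprintf("
};

/* Return the first pattern found in the line, or NULL if none match. */
static const char *candidate_in_line(const char *line)
{
    size_t n = sizeof(risky_calls) / sizeof(risky_calls[0]);
    for (size_t i = 0; i < n; i++)
        if (strstr(line, risky_calls[i]) != NULL)
            return risky_calls[i];
    return NULL;
}

/* Count how many of the given source lines contain a candidate point. */
static size_t count_candidates(const char *const *lines, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (candidate_in_line(lines[i]) != NULL)
            hits++;
    return hits;
}
```

Each flagged line is only a starting point. The reviewer still has to trace back from it to an entry point to decide whether attacker-controlled data can reach it.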

The advantage of back-tracing is that it involves fewer code paths than forward-tracing. The disadvantage is that it's only as strong as your selection of candidate points, so you run the risk of overlooking exploitable pathways because you didn't consider valid candidate points. You also tend to miss logic-related vulnerabilities entirely because they rarely map cleanly to algorithmically detectable candidate points.
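The difference in how much code each direction touches can be sketched on a toy call graph. Everything here is hypothetical: the function names, the edges, and the helper trace_from() are invented for illustration, with BUILD_QUERY standing in for a candidate point such as a dynamic SQL statement.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical functions in a small application. */
enum { MAIN, PARSE_REQUEST, HANDLE_LOGIN, BUILD_QUERY,
       RENDER_PAGE, LOG_EVENT, NFUNCS };

/* calls[a][b] is true when function a calls function b. */
static const bool calls[NFUNCS][NFUNCS] = {
    [MAIN]          = { [PARSE_REQUEST] = true, [RENDER_PAGE] = true },
    [PARSE_REQUEST] = { [HANDLE_LOGIN] = true, [LOG_EVENT] = true },
    [HANDLE_LOGIN]  = { [BUILD_QUERY] = true },
    [RENDER_PAGE]   = { [LOG_EVENT] = true },
};

/* Mark every function reachable from start and return how many were
 * marked (start included). Follows callers when backward is true and
 * callees otherwise. Each function is pushed at most once, so a stack
 * of NFUNCS entries is enough. */
static size_t trace_from(int start, bool backward, bool seen[NFUNCS])
{
    int stack[NFUNCS];
    size_t top = 0, count = 0;

    for (int i = 0; i < NFUNCS; i++)
        seen[i] = false;
    seen[start] = true;
    stack[top++] = start;

    while (top > 0) {
        int f = stack[--top];
        count++;
        for (int g = 0; g < NFUNCS; g++) {
            bool edge = backward ? calls[g][f] : calls[f][g];
            if (edge && !seen[g]) {
                seen[g] = true;
                stack[top++] = g;
            }
        }
    }
    return count;
}
```

In this graph a forward trace from MAIN visits all six functions, while a back-trace from BUILD_QUERY visits only four and never touches RENDER_PAGE or LOG_EVENT. The same property is the weakness: anything wrong in the two unvisited functions goes unseen unless another candidate point leads there.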




The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities
ISBN: 0321444426
Year: 2004
Pages: 194
