Chapter 12. A Taxonomy of Coding Errors[1]

[1] Parts of this chapter appeared in original form in Proceedings of the NIST Workshop on Software Security Assurance Tools, Techniques, and Metrics coauthored with Katrina Tsipenyuk and Brian Chess [Tsipenyuk, Chess, and McGraw 2005].

A horse! A horse! My kingdom for a horse!

King Richard the Third (William Shakespeare)

The purpose of a taxonomy like this one is to help software developers and security practitioners understand common coding mistakes that affect security. The goal is to help developers avoid making these mistakes and to identify security problems more readily. A taxonomy like this one is most usefully applied in an automated tool that can spot problems either in real time (as a developer types into an editor) or at compile time (see Chapter 4). When put to work in a tool, a set of security rules organized according to this taxonomy is a powerful teaching mechanism. Because developers today are by and large unaware of the security problems that they can (unknowingly) introduce into code, publication of a taxonomy like this should provide real, tangible benefits to the software security community.

This approach represents a striking alternative to taxonomies of attack patterns (see Exploiting Software [Hoglund and McGraw 2004]) or simple-minded collections of specific vulnerabilities (e.g., MITRE's CVE <http://www.cve.mitre.org/>). Attack-based approaches are based on knowing your enemy and assessing the possibility of similar attacks. They represent the black hat side of the software security equation. A taxonomy of coding errors is, strangely, more positive in nature. This kind of thing is most useful to the white hat side of the software security world. In the end, both kinds of approaches are valid and necessary.

The goal of this taxonomy is to educate and inform software developers so that they better understand the way their work affects the security of the systems they build. Developers who know this stuff (or at least use a tool that knows this stuff) will be better prepared to build security in than those who don't.

Though this taxonomy is incomplete and imperfect, it provides an important start. One of the problems of all categorization schemes like this is that they don't leave room for new (often surprising) kinds of vulnerabilities. Nor do they take into account higher-level concerns such as the architectural flaws and associated risks described in Chapter 5.[2] Even when it comes to simple security-related coding issues themselves, this taxonomy is not perfect. Coding problems in embedded control software and common bugs in high-assurance software developed using formal methods are poorly represented here, for example.

[2] This should really come as no surprise. Static analysis for architectural flaws would require a formal architectural description so that pattern matching could occur. No such architectural description exists. (And before you object, UML doesn't cut it.)

The bulk of this taxonomy is influenced by the kinds of security coding problems often found in large enterprise software projects. Of course, only coding problems are represented since the purpose of this taxonomy is to feed a static analysis engine with knowledge. The taxonomy as it stands is neither comprehensive nor theoretically complete. Instead it is practical and based on real-world experience. The focus is on collecting common errors and explaining them in such a way that they make sense to programmers.

The taxonomy is expected to evolve as time goes by and as coding issues (e.g., platform, language of choice, and so on) change. This version of the taxonomy places more emphasis on concrete and specific problems than on abstract or theoretical ones. In some sense, the taxonomy may err on the side of omitting "big-picture" errors in favor of covering specific and widespread errors.

The taxonomy is made up of two distinct kinds of sets (which we're stealing from biology). What is called a phylum is a type or particular kind of coding error; for example, Illegal Pointer Value is a phylum. What is called a kingdom is a collection of phyla that share a common theme. That is, kingdoms are sets of phyla; for example, Input Validation and Representation is a kingdom. Both kingdoms and phyla naturally emerge from a soup of coding rules relevant to enterprise software. For this reason, the taxonomy is likely to be incomplete and may be missing certain coding errors.
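The two-level structure described above can be sketched as a small data model. This is only an illustration, not the representation used by any particular tool: the class names Phylum and Kingdom and the helper kingdom_of are hypothetical, and placing Illegal Pointer Value inside Input Validation and Representation is an assumption made for the example, since this section names both but does not state the membership.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class Phylum:
    """A specific kind of coding error (hypothetical class name)."""
    name: str


@dataclass
class Kingdom:
    """A named set of phyla that share a common theme (hypothetical class name)."""
    name: str
    phyla: Set[Phylum] = field(default_factory=set)

    def add(self, phylum: Phylum) -> None:
        self.phyla.add(phylum)


def kingdom_of(phylum_name: str, kingdoms: List[Kingdom]) -> Optional[str]:
    """Return the name of the kingdom that contains the phylum, or None."""
    for kingdom in kingdoms:
        if any(p.name == phylum_name for p in kingdom.phyla):
            return kingdom.name
    return None


# Assumed membership, for illustration only.
input_validation = Kingdom("Input Validation and Representation")
input_validation.add(Phylum("Illegal Pointer Value"))
taxonomy = [input_validation]
```

A rule in a static analysis tool could then be tagged with a phylum name, and a lookup such as kingdom_of("Illegal Pointer Value", taxonomy) would place the finding in its broader category when it is reported to a developer.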

In some cases, it is easier and more effective to talk about a category of errors than it is to talk about any particular attack. Though categories are certainly related to attacks, they are not the same as attack patterns.




Software Security: Building Security In
ISBN: 0321356705
Year: 2004
Pages: 154
Authors: Gary McGraw
