Common Architectural Failures

As we saw in the previous example, things tend to fail in similar ways. After you've been looking at advisories every day for a few years, you start to notice patterns and then go after them in your own research. It's useful to stop and consider those patterns for a moment, since they might provide ideas for your future research.

Problems Happen at Boundaries

While this isn't universally true, it's generally the case that security problems occur when there's a transition of some kind: from one process to another, from one technology to another, or from one interface to another. The following are a few examples of these.

A Process Calling into an External Process on the Same Host

Good examples of this problem are the Oracle issue 57 described above and the Named Pipe Hijacking issue found by Andreas Junestam and described in Microsoft Bulletin MS03-031. To see some interesting privilege elevation issues, use HandleEx (from Sysinternals) to take a look at the permissions assigned to global objects (like shared memory sections) in Windows. Many applications don't guard against local attacks.

In Unix you'll find a whole family of problems relating to the parsing of command-line options when a process calls out to some other process to perform some kind of function. Once again, instrumentation is helpful if you're looking in this area. In this case, ltrace is probably the best way to go.

A Process Calling into an External, Dynamically Loaded Library

Again, Oracle and SQL Server provide multiple examples of this problem: the original extproc bug found by David Litchfield (Oracle alert 29) is one, and the many extended stored procedure overflows found in SQL Server are another.

Also, there are a very large number of problems in ISAPI filters in Microsoft IIS, including a Commerce Server component, the ISM.DLL filter, the SQLXML filter, the .printer ISAPI filter, and many more. One of the reasons these sorts of problems occur is that although people heavily audit core behaviors of a network daemon, they tend to overlook extensibility support.

IIS isn't alone in this. Take a look at the Apache mod_ssl off-by-one bug, as well as problems in mod_mylo, mod_cookies, mod_frontpage, mod_ntlm, mod_auth_any, mod_access_referer, mod_jk, mod_php, and mod_dav.

If you're auditing an unknown system, a soft spot can normally be found in this kind of functional area.

A Process Calling into a Function on a Remote Host

This is also a minefield, although people tend to be more aware of the risks. The DCOM-RPC bug (MS03-026) shows that this kind of problem is still around. Most RPC bugs fit into this category, like the Sun UDP RPC DoS, the Locator Service overflow, the multiple MS Exchange overflows found by Dave Aitel, and the old favorite statd format string bug found by Daniel Jacobowitz.

Problems Happen When Data Is Translated

When data is transformed from one form to another, it's often possible to bypass checks. This is actually a very fundamental problem relating to translation between grammars. The reason this kind of problem (often called a canonicalization bug) is so prevalent is that it is exceptionally difficult to create a system in which programmable interfaces become less grammatically complex as you descend deeper into the call tree.

Formally, we could put it like this: Function f() implements a set of behaviors F. f() implements these behaviors by calling a function g(), passing it some of the input to f(). g() implements the set of behaviors G. Unfortunately, set G contains behaviors that are undesirable to expose via f(). We call these bad behaviors Gbad. Therefore, f() must implement some mechanism to ensure that F contains none of Gbad. The only way that the implementer of f() can do this is to fully understand all of G and validate the inputs to f() to ensure that no combination of inputs results in any member of Gbad.

This is a problem for two reasons:

Things almost always get more complex the lower you go down the call tree, so f() deals with too many cases.
g() has the same problem, as do h() , i() , j() , and so on down the call tree.
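
The f()/g() relationship can be made concrete with a toy example. The following is a minimal sketch, not taken from any real product: the command syntax understood by g(), the function names, and the allow-list are all assumptions invented for illustration. The point it shows is that f_naive() intends to expose only the "read" behavior, yet an input containing g()'s command separator reaches a member of Gbad.

```python
# Hypothetical toy grammar: g() runs ';'-separated "verb:arg" commands.
# From f()'s point of view, "delete" is a member of Gbad.

def g(script: str) -> list[str]:
    """Lower-level interpreter with a richer grammar than f() intends."""
    actions = []
    for command in script.split(";"):
        verb, _, arg = command.partition(":")
        if verb not in ("read", "delete"):
            raise ValueError(f"unknown verb: {verb!r}")
        actions.append(f"{verb} {arg}")
    return actions


def f_naive(name: str) -> list[str]:
    """Means to expose only 'read', but passes its input to g() unchecked."""
    return g("read:" + name)


def f_checked(name: str) -> list[str]:
    """Accepts only characters that belong to f()'s own, simpler grammar."""
    if not name or not all(ch.isalnum() or ch in "._-" for ch in name):
        raise ValueError("name contains characters outside f()'s grammar")
    return g("read:" + name)


if __name__ == "__main__":
    # The ';' is meaningless to f_naive() but meaningful to g().
    print(f_naive("report.txt;delete:report.txt"))
    print(f_checked("report.txt"))
```

Note that f_checked() works by restricting its input to what f() itself understands, rather than by trying to enumerate everything g() might do with it.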

For example, take the Win32 file system functions. You might have a program that accepts a filename. As far as it understands the concept of filenames, it assumes the following:

A filename may have an extension at the end. Extensions are normally, but not always, three characters long, and are denoted by the final period (.) character in the filename.
A filename may be a fully qualified path. If so, it starts with a drive letter, which is followed by a colon (:) character.
A filename may be a relative path. If so, it will contain backslash (\) characters.
Each backslash character signals a transition into a child directory.

This can be thought of as constituting a grammar for filenames as far as the program is concerned. Unfortunately, the grammar implemented by the underlying file system functions (like the Win32 API CreateFile) includes many other potentially dangerous constructs such as the following (this is not an exhaustive list):

A filename can begin with a double-backslash sequence. If this is the case, the first directory name signifies a host on the network and the second an SMB share name. The FileSystem API will attempt to connect to this share using the (sniffable) credentials of the current user.
A filename can also begin with a \\?\ sequence, which denotes that it is a Unicode file path and is able to exceed the normal length limits imposed by the FileSystem API.
A filename can begin \\?\UNC , which will also trigger the Microsoft Share connection behavior described above.
A filename can begin \\.\PHYSICALDRIVE<n> , where <n> is the zero-based index of the physical drive to open. This will open the physical drive for raw access.
A filename can begin \\.\pipe\<pipename> . The named pipe <pipename> will be opened.
A filename can include a colon (:) character (after the initial drive letter sequence). This denotes an alternate data stream in the NTFS file system, which is treated effectively as a distinct file, but which is not listed separately in directory listings. The :$DATA file stream is reserved for the normal contents of the file.
A filename can include (as a directory name) a ".." or "." sequence. The former signals a transition to the parent directory; the latter signifies that no directory transition should be performed.
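
One way to get a feel for the gap between the two grammars is to write down the constructs listed above as a simple classifier. The sketch below is illustrative only: the function name, the labels, and the prefix table are assumptions, it performs pure string inspection (it never touches the file system), and it is nowhere near a complete model of how Win32 parses paths. The generic device-namespace entry is an addition that does not appear in the list above.

```python
import re

# Prefixes taken from the list above; the order matters (longest first).
# The labels are hypothetical names chosen for this sketch.
PREFIXES = [
    ("\\\\?\\UNC\\", "extended-length UNC path (network share)"),
    ("\\\\?\\", "extended-length path"),
    ("\\\\.\\PHYSICALDRIVE", "raw physical drive"),
    ("\\\\.\\PIPE\\", "named pipe"),
    ("\\\\.\\", "other device namespace path"),
    ("\\\\", "UNC path (network share)"),
]


def risky_constructs(filename: str) -> list[str]:
    """Label constructs that fall outside the simple filename grammar."""
    # Assumption: forward slashes are treated like backslashes here,
    # since they are commonly accepted as separators as well.
    name = filename.replace("/", "\\")
    findings = []
    rest = name
    for prefix, label in PREFIXES:
        if name.upper().startswith(prefix):
            findings.append(label)
            rest = name[len(prefix):]
            break
    # Skip a leading drive letter such as "C:" before looking for stream colons.
    if re.match(r"[A-Za-z]:", rest):
        rest = rest[2:]
    if ":" in rest:
        findings.append("alternate data stream")
    if any(part in (".", "..") for part in rest.split("\\")):
        findings.append("relative directory component")
    return findings


if __name__ == "__main__":
    for sample in (r"C:\docs\report.txt",
                   r"\\fileserver\share\x.txt",
                   r"\\.\PhysicalDrive0",
                   r"C:\web\page.asp::$DATA",
                   r"uploads\..\..\boot.ini"):
        print(sample, "->", risky_constructs(sample))
```

A deny-list like this is useful for exploring behavior, but as a defense it inherits the problem described in this section: it only covers the constructs its author already knows about.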

Many other equally bizarre behaviors are possible. The point is, unless you're careful about input validation, you'll end up introducing problems, because the underlying API is likely to implement behaviors that you're unaware of. Therefore, from the attacker's perspective, it makes sense to understand these underlying behaviors and try to get at them through the defending input validation mechanisms.

Some of the real-world bugs that happen because of this sort of problem (not all shellcode, unfortunately) include the IIS Unicode bug, the IIS double-decode bug, the CDONTS.NewMail SMTP injection problem, PHP's http://filename behavior (you can open a file based on a URL), and the Macromedia Apache source code disclosure vulnerability (if you add an encoded space to the end of a URL, you get the source code). There are many more. Almost every source code disclosure bug fits into the input validation category.
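
The double-decode class of bug is easy to reproduce in miniature. The following sketch is a simplified illustration, not a reconstruction of the IIS code: the handler names and the sample path are hypothetical. It shows a check applied after one round of URL decoding, followed by a second decode that changes what the checked string means.

```python
from urllib.parse import unquote


def naive_handler(raw_path: str) -> str:
    """Decode once, validate, then (mistakenly) decode again."""
    decoded = unquote(raw_path)
    if ".." in decoded:
        raise ValueError("directory traversal rejected")
    # The second decode happens after the check, so the check
    # was performed on a different string than the one used.
    return unquote(decoded)


def safer_handler(raw_path: str) -> str:
    """Decode once, validate, and use exactly the string that was validated."""
    decoded = unquote(raw_path)
    if ".." in decoded:
        raise ValueError("directory traversal rejected")
    return decoded


if __name__ == "__main__":
    # "%252e" decodes to "%2e", which decodes to "."
    attack = "/scripts/%252e%252e/%252e%252e/winnt/system32/cmd.exe"
    print(naive_handler(attack))   # the ".." sequences reappear after the check
    print(safer_handler(attack))   # "%2e%2e" stays literal text
```

The general rule this illustrates is that validation should be applied to the final form of the data, the form that the lower layer will actually interpret.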

If you think about it, input validation is actually the reason why overflows are so harmful. The input to a function is interpreted in some underlying context. In the case of a stack overflow, the data that overflows the buffer is treated as a portion of a stack frame comprising data, Virtual Pointers (VPTRs), saved return addresses, exception handler addresses, and so on. What you might call a phrase in one grammar is interpreted as a phrase in a different one.

You could summarize almost all attacks as attempts to construct phrases that are valid in multiple grammars. There are some interesting defensive implications to this, in the fields of information theory and coding theory, because if you can ensure that two grammars have no phrases in common, you might (possibly) be able to ensure that no attack is possible based on a translation between the two.

The idea of interpretive contexts is a useful one, especially if you're dealing with a target that supports a variety of network protocols, such as a Web server that sends e-mail or transfers data to a Web services server using a weird XML format.

Problems Cluster in Areas of Asymmetry

In general, developers tend to apply defensive techniques across a whole area of behaviors, using such things as length limits, checking for format strings, or other kinds of input validation. One excellent way to find problems is to look for an area of asymmetry and explore it to find out what makes it different.

Perhaps a single HTTP header supported by a Web server appears to have a different length limit than all the others, or perhaps you notice a weird response when you include a particular symbol in your input data. Or possibly specifying a recently implemented Web method in Apache seems to change your error messages.

Taking note of areas that are different can tip you off to areas of a product that are less protected.
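
When you have collected measurements across many similar inputs, spotting the asymmetry can be partly automated. The sketch below assumes you have already recorded, by whatever probing method you prefer, the largest value each header accepted; the header names and numbers are invented for illustration and do not describe any real server.

```python
from collections import Counter


def find_outliers(observations: dict[str, int]) -> dict[str, int]:
    """Return the entries whose observed value differs from the most common one."""
    if not observations:
        return {}
    baseline, _ = Counter(observations.values()).most_common(1)[0]
    return {name: value for name, value in observations.items()
            if value != baseline}


if __name__ == "__main__":
    # Hypothetical measurements: maximum accepted length per HTTP header.
    observed_limits = {
        "Host": 8192,
        "User-Agent": 8192,
        "Referer": 8192,
        "X-Legacy-Auth": 256,   # made-up header that behaves differently
    }
    print(find_outliers(observed_limits))
```

An outlier is only a lead, not a finding: it suggests that the header is handled by a different code path, which is where the manual investigation starts.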

Problems Occur When Authentication and Authorization Are Confused

Authentication is the verification of identity, nothing more. Authorization is the process of determining whether a given identity should have access to a given resource.

Many systems take great care over the former and assume that the latter follows. Worse, in some cases, there is seemingly no connection between the two: if you can find an alternative route to the data, you can access it. This leads to some interesting privilege elevation situations, such as the Oracle extproc example. You can also see it in Lotus Domino with the view ACL bypass bug (www.nextgenss.com/advisories/viewbypass.txt) and in Oracle mod_plsql with the authentication bypass (www.nextgenss.com/papers/hpoas.pdf; search for "authentication by-pass"). The Apache case-insensitive htaccess vulnerability (www.omnigroup.com/mailman/archive/macosx-admin/2001-June/012143.html) was another good example of what happens when another route is provided to sensitive data.

You can also see this type of problem in many Web applications. Since HTTP is inherently stateless, the mechanism used to maintain the state (a session ID) normally carries with it the authentication state. If you can somehow guess or reproduce the session ID, you can skip the authentication stage.
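
The session ID point can be illustrated with a deliberately weak generator. This is a minimal sketch under stated assumptions: the "SESS" prefix, the counter-based scheme, and the function names are hypothetical, chosen only to show why a guessable identifier is equivalent to skipping authentication. The contrasting generator uses Python's standard secrets module.

```python
import itertools
import secrets

# Hypothetical weak scheme: session IDs are handed out sequentially.
_counter = itertools.count(1000)


def weak_session_id() -> str:
    """Predictable: each ID is the previous one plus one."""
    return f"SESS{next(_counter)}"


def guess_next(observed: str) -> str:
    """What an attacker does after seeing a single valid ID of their own."""
    return f"SESS{int(observed[4:]) + 1}"


def strong_session_id() -> str:
    """Unpredictable: 32 bytes from the operating system's CSPRNG."""
    return secrets.token_urlsafe(32)


if __name__ == "__main__":
    attacker = weak_session_id()   # the attacker logs in normally
    victim = weak_session_id()     # the next user to log in
    print(guess_next(attacker) == victim)
    print(strong_session_id())
```

With the weak scheme, the attacker never needs the victim's password; holding the right session ID is all the application checks.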

Problems Occur in the Dumbest Places

If a particular bug hunt is becoming too technical and it's been a long day, don't be afraid to try the really obvious. Overly long usernames were the cause of these bugs, among many others:

www.nextgenss.com/advisories/sambar.txt
http://otn.oracle.com/deploy/security/pdf/2003Alert58.pdf
www.nextgenss.com/advisories/ora-unauthrm.txt
www.nextgenss.com/advisories/webadmin_altn.txt
www.nextgenss.com/advisories/ora-isqlplus.txt
www.nextgenss.com/advisories/steel-arrow-bo.txt
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2002-0891
www.kb.cert.org/vuls/id/322540

Generally, the authentication phase of a protocol is a good target for overflow and format string research for the obvious reason that if you can gain control prior to authentication, you need no username and password to compromise the server. Another couple of classic, unauthenticated remote root bugs are the hello bug found by Dave Aitel (http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2002-1123) and the SQL-UDP bugs found by David Litchfield (www.nextgenss.com/advisories/mssql-udp.txt).