2.6 Decompiling Explained

When you compile source code into an assembly, the compiler interprets your C# or Visual Basic .NET statements, and creates a series of MSIL statements that will be executed by the .NET Framework. A decompiler is an application that analyses the MSIL statements in order to recreate the original Visual Basic .NET or C# statements written by the programmer.

Unfortunately, the .NET compilers include human-readable information from our source code in the assemblies they produce, including the names of types, methods, and fields. A decompiler can use this information to create source code that is very similar to the original. Some information, such as comments and blocks of code excluded by conditional compiler statements, is not included in assemblies by the compiler and cannot be restored by a decompiler.
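To see how much naming information survives compilation, you can read it back with reflection. The following is a minimal sketch, not a decompiler: the SumNumbers class here is a stand-in that mirrors the one used in this chapter, and MetadataDump is a hypothetical helper name we introduce for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Stand-in for a compiled type; mirrors the SumNumbers class used in this chapter.
public class SumNumbers
{
    // A comment like this one is discarded by the compiler.
    private int o_total;

    public void AddNumber(int p_number) { o_total += p_number; }

    public int GetTotal() { return o_total; }
}

// Hypothetical helper, introduced only for this example.
public static class MetadataDump
{
    // Lists the members a type declares, including private ones,
    // using only the metadata stored in the compiled assembly.
    public static List<string> DeclaredMemberNames(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.DeclaredOnly;
        var names = new List<string>();
        foreach (MemberInfo member in type.GetMembers(flags))
        {
            // MemberType renders as "Field", "Method", "Constructor", and so on.
            names.Add(member.MemberType + " " + member.Name);
        }
        return names;
    }
}
```

Calling MetadataDump.DeclaredMemberNames(typeof(SumNumbers)) returns entries for the private field o_total as well as the public methods, and the parameter name p_number is available through MethodInfo.GetParameters. The comment inside the class does not appear anywhere in the result.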

The nature of MSIL makes it easier to decompile .NET assemblies than native Windows applications, which are compiled into instructions that are targeted at a specific CPU, such as an Intel Pentium. Lower-level instructions are more difficult to reconstruct into code statements than the relatively abstract MSIL statements. The proliferation and use of decompilers is more widespread than you might think. There are three main reasons why an assembly is decompiled:

Interest

The most benign reason for decompiling an assembly is simply to gain an understanding of how an application or library is written; the person who decompiles the assembly has no malicious intent, and simply seeks to improve her knowledge of .NET programming.

Intellectual property theft

Your assemblies can be decompiled to reveal your business secrets, which can be used commercially by your competitors. In this context, business secrets may be proprietary algorithms, or the inner workings of your complete application.

Application subversion

Decompiling an assembly can provide details of how to subvert an application, exposing licensing details so that license codes can be generated to activate illegal copies of your application.

The scope for intellectual property theft through decompilation has been lessened by the increased use of thinner clients to connect to network services (this includes the move towards XML web services); there is less complexity to the client application, and the sophisticated logic is deployed within a remote network. In contrast, the prevalence of network services increases the scope for application subversion. Any network service that grants trust to clients based on data that is included in an assembly is subject to subversion through decompilation. Analysis of a client application can provide a wealth of information on network protocols and security configuration, which can be used to manipulate the network components of an application against the wishes and expectations of the developer.
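As an illustration of trust data embedded in an assembly, consider a client that carries a shared secret as a constant. This is a hedged sketch: ServiceClient, SharedSecret, the secret value, and SecretExtractor are all hypothetical names invented for the example, not part of any real service.

```csharp
using System;
using System.Reflection;

// Hypothetical client class; the name and the secret value are illustrative only.
public class ServiceClient
{
    // The literal below is stored in the assembly and ships to every user.
    private const string SharedSecret = "example-shared-secret";

    public string BuildAuthHeader(string user)
    {
        return user + ":" + SharedSecret;
    }
}

// Hypothetical helper, introduced only for this example.
public static class SecretExtractor
{
    // Reads a private constant from metadata without running any client code.
    public static string ReadPrivateConstant(Type type, string fieldName)
    {
        FieldInfo field = type.GetField(fieldName,
            BindingFlags.NonPublic | BindingFlags.Static);
        return field == null ? null : (string)field.GetRawConstantValue();
    }
}
```

SecretExtractor.ReadPrivateConstant(typeof(ServiceClient), "SharedSecret") recovers the value even though the field is private. A service that accepts this value as proof of a legitimate client is trusting data that anyone holding the assembly can read.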

Decompiling into Another Language

MSIL is common to all .NET languages, meaning that an assembly written in Visual Basic .NET can be decompiled to produce C# statements, or statements in any other .NET-compliant language. There is no additional risk posed by this feature, but it does lower the barrier to understanding your code; not only can the source code statements be reconstructed, but they can be reconstructed in a language with which a potential attacker is familiar.

2.6.1 Decompiling Assemblies

In this section, we demonstrate how much detail a decompiler exposes from an assembly. You will use the open source Anakrino/Exemplar decompiler to decompile the single-file assembly you created in the previous section; at the time of writing, the decompiler is available at http://www.saurik.com/net/exemplar/.

The decompiled versions of the SumNumbers and SumArray classes are below; the decompiler we have selected generates only C# source code. We do not explain how to install or use the decompiler in this book; we present the decompiled output so that you can understand what kind of information can be obtained from an assembly:

# C#
using System;

public class SumNumbers {
    private int o_total;

    public SumNumbers() {
        o_total = 0;
    }

    public void AddNumber(int p_number) {
        o_total += p_number;
    }

    public int GetTotal() {
        return o_total;
    }
}

Be careful when you decompile an assembly. You may be in violation of your country's law if you decompile an assembly to which you do not own the copyright or intellectual property rights. We do not condone the decompilation of such assemblies. We accept no responsibility if you decompile assemblies which contain the intellectual or commercial property of others.

The efficacy of a decompiler is measured by the accuracy of the source code that it generates; the better a decompiler is, the more the decompiled source code resembles the original statements. Our decompilation has produced a rendition of the SumNumbers class that is very close to the original; the names of the fields are preserved, and the structure and function of the class are clear. The decompiled SumArray class follows:

# C#
public class SumArray {
    public static int SumArrayOfIntegers(int[] p_arr) {
        SumNumbers sumNumbers = new SumNumbers();
        int[] nums = p_arr;
        for (int k = 0; k < (int)nums.Length; k++) {
            int j = nums[k];
            sumNumbers.AddNumber(j);
        }
        return sumNumbers.GetTotal();
    }
}

The decompiled version of the SumArray class is less like the original but still clearly demonstrates the implementation. Our simple assembly is easily decompiled, and the workings of our data types are clearly exposed; logic that is more complex can cause difficulties for decompilers, but in general, an unprotected assembly will yield its secrets easily.
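One likely reason for the difference is that the C# compiler translates a foreach loop over an array into an indexed loop over a copy of the array reference, and the names of local variables are not stored in the assembly itself. The original source is not reproduced in this section, so the sketch below is an assumption about its shape rather than a copy of it; LoopShapes and both method names are hypothetical.

```csharp
// Hypothetical class, introduced only to compare the two loop shapes.
public static class LoopShapes
{
    // One plausible way to write the loop in source code.
    public static int SumWithForeach(int[] values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // The shape a decompiler may reconstruct from the MSIL: a copy of the
    // array reference, an index variable, and decompiler-invented names.
    public static int SumWithIndexedLoop(int[] values)
    {
        int total = 0;
        int[] nums = values;
        for (int k = 0; k < nums.Length; k++)
        {
            int j = nums[k];
            total += j;
        }
        return total;
    }
}
```

Both methods compute the same result for any array. The decompiler recovers behavior faithfully even when it cannot recover the exact statements or local names that the programmer wrote.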

2.6.2 Protecting Against Decompilation

If your assemblies contain no proprietary data, and no information that can be used to subvert your application, then you are in a position to distribute the assemblies freely; otherwise, you should consider protecting against decompilation with one of the techniques discussed below.

You should not protect your assemblies against decompilation unless you have a good reason. All of the techniques that make decompilation more difficult rely on changing the contents of your assemblies; this makes it very hard to debug problems and may even introduce new defects into your applications.

2.6.2.1 Obfuscation

Obfuscation is the technique of altering the MSIL statements so that the application executes in the same way, but the output of a decompiler is unreadable. Obfuscation is such an important technique that Microsoft has included a copy of a limited functionality obfuscator in Visual Studio .NET 2003. Different obfuscators use different approaches to obscure decompiler output, but we summarize the more common types of obfuscation below:

Renaming methods and fields

Obfuscators rename the nonpublic methods and fields defined by your data types in a way that makes it difficult to read the decompiled output. A common technique is to use very long strings that differ by a single character or to use non-printing characters accepted by the .NET runtime, but which text editors do not display correctly.

Adding control flow obfuscation

Obfuscators make application logic more difficult to follow by creating complex sequences of method calls that do not do anything. This is more effective than it sounds, because it is hard to establish if these methods are related to the logic of the application. This technique is especially effective when combined with method and field renaming, creating especially complex decompiled output.

Encrypting literal strings

Encryption is applied to the literal strings defined within your assemblies in order to slow down searches that may reveal the purpose of sections of code. For example, searching for the word "license" may reveal which parts of your code deal with application license control.
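The string technique from the list above can be sketched as follows. This is a toy illustration under stated assumptions: StringHiding, the single-byte XOR key, and both method names are hypothetical, and commercial obfuscators use their own, stronger schemes.

```csharp
using System;
using System.Text;

// Hypothetical helper, introduced only for this example.
public static class StringHiding
{
    // Hypothetical single-byte key; chosen only to keep the sketch short.
    private const byte Key = 0x5A;

    // What an obfuscator might do at build time:
    // replace the literal with encoded bytes.
    public static byte[] Encode(string plain)
    {
        byte[] data = Encoding.UTF8.GetBytes(plain);
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= Key;
        }
        return data;
    }

    // What the obfuscated assembly does at run time:
    // decode the bytes just before the string is used.
    public static string Decode(byte[] encoded)
    {
        byte[] data = (byte[])encoded.Clone();
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= Key;
        }
        return Encoding.UTF8.GetString(data);
    }
}
```

A text search of the assembly for "license" no longer finds the literal, but the key and the decoding routine both ship inside the assembly, so this only slows an attacker down.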

Effective obfuscators combine these approaches and often apply proprietary techniques. There is a kind of "arms race" between the developers of obfuscators and the developers of decompilers, where each new feature added by an obfuscator is eventually compromised by a decompiler.
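To give a feel for the combined effect, here is a hand-written approximation of what a class like SumNumbers might look like after renaming and control flow obfuscation. It is not the output of any real obfuscator. The type is called ObfuscatedSumNumbers only to keep the example distinct; an obfuscator would normally leave public names unchanged, and the single-letter member names are assumptions.

```csharp
// Hand-written approximation of obfuscated output; not produced by a real tool.
public class ObfuscatedSumNumbers
{
    private int a;   // originally o_total
    private int b;   // decoy field; never affects the result

    public void AddNumber(int p_number)
    {
        c(p_number);          // junk call that only touches the decoy field
        if (d(p_number))      // opaque predicate: always true
        {
            a += p_number;
        }
    }

    public int GetTotal()
    {
        e();                  // another junk call
        return a;
    }

    // Junk method: mixes the argument into the decoy field.
    private void c(int x) { b = unchecked(b * 31 + x); }

    // x * (x + 1) is always even, even after integer wraparound,
    // so this method returns true for every input.
    private static bool d(int x) { return unchecked(x * (x + 1)) % 2 == 0; }

    // Junk method: feeds the decoy field back into itself.
    private void e() { c(b); }
}
```

The class still adds numbers and returns the total, but a reader of the decompiled output has to work out that b, c, d, and e contribute nothing before the real logic becomes visible.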

The biggest problem with obfuscation is that it alters the MSIL within your assembly; when problems arise, you will find that the obfuscation process can seriously hamper the debugging process. As a general guideline, do not obfuscate your assemblies unless you have to, and always select an obfuscator from a reputable company that will be able to support you if you encounter problems.

You must remember that obfuscation does not protect your assemblies against determined decompilation. Although an obfuscator will create MSIL that produces difficult-to-read decompiler output, your application logic is still contained in the MSIL and can eventually be reverse-engineered; at best, obfuscation slows down the process.

2.6.2.2 Native compilation

As you will see in Chapter 4, the .NET Framework runtime compiles your MSIL statements into native commands for the CPU before the code is executed. An alternative to obfuscation is to perform this compilation yourself and to create native instructions that cannot be processed by an MSIL decompiler.

Native compilation is a relatively new technique as applied to .NET assemblies, and the tools available at the time of writing are immature; the principal risk with native compilation is that the output can differ from that produced by the normal .NET compilation process, which can hamper the debugging process.



Programming .NET Security
ISBN: 0596004427
Year: 2005
Pages: 346
