I am an interesting type of coder. While I cannot write simple, legitimate programs that any first-semester college coder can, I can decode complex malware. This comes from looking at worms, viruses, and trojans for 15 years. The best way to learn malicious programming techniques isn’t necessarily to write malware (as some authors suggest), but to read about malicious programming techniques and practice on real code as much as you can.
Reading and disassembling rogue code is hard enough, but malicious programmers use many techniques to make disassembly harder, including stealth, encryption, packing, and debugger tricks. Malware writers understand that the easier it is for the good guys to read their code, the faster an antivirus solution will be developed.
If the malware is active in memory, it can use stealth tricks to hide from disk editors, scanners, and disassemblers. Even the first IBM-compatible virus, Pakistani Brain, used stealth. When a disk editor read the virus-infected boot sector, the virus returned the original boot sector, which it had relocated to the end of the disk, to the investigator.
Other stealth mechanisms include removing the infection just before examination and reinfecting the executable afterward, and reporting a false free-memory amount, file size, or checksum that matches the values from before the infection. These mechanisms are similar to what rootkits accomplish.
Computer viruses were the first programs to use encryption to hide. Initially, the encryption routines were simple XOR routines that didn’t prove overly difficult to reverse-engineer. But as overall legitimate encryption improved, so did malware encryption.
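To see why simple XOR routines were not hard to reverse-engineer, consider a minimal Python sketch. It assumes a one-byte key and a known plaintext prefix (`b"MZ"`, the two bytes that begin DOS and Windows executables). The sample payload and the function names are hypothetical and are not taken from any particular virus.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key. Applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def brute_force_key(ciphertext: bytes, known_prefix: bytes):
    """Try all 256 keys; return the first whose output starts with known_prefix."""
    for key in range(256):
        if xor_bytes(ciphertext, key).startswith(known_prefix):
            return key
    return None


# Hypothetical stand-in for an encrypted program body.
plaintext = b"MZ harmless demo payload"
encrypted = xor_bytes(plaintext, 0x5A)

# The examiner does not know the key, only what the result should start with.
recovered_key = brute_force_key(encrypted, b"MZ")
decrypted = xor_bytes(encrypted, recovered_key)
print(hex(recovered_key), decrypted)
```

With only 256 possible keys and a predictable plaintext, the search finishes instantly, which is why malware authors moved on to stronger schemes.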
Today, polymorphic malware uses nearly unbreakable encryption, encryption routines that change on the fly, and decryption routines that move to a different location in the file on every execution. Some malware writers have made themselves infamous solely by writing successful encryption routines (polymorphic engines) that any malware program can use. Antivirus software often must execute rogue code in a simulated OS environment and wait for the malware to decrypt itself before the host program can be scanned.
Fortunately, disassemblers, like IDA Pro, can often recognize and automate the decryption process. At the very worst, they can help by letting the examiner step through the decryption code.
Packers are programs that compress executables into a smaller footprint. In the old days of MS-DOS, packers were needed because executable program segments were limited to 64KB and hard drives were relatively small. Packers allowed a program to be compressed (packed) on disk and uncompressed on the fly when it ran. Like encrypted programs, packed programs must be unpacked before they can be examined.
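The effect of changing the encryption on every infection can be sketched in a few lines of Python. This is a simplified illustration, not a real polymorphic engine: the `SIGNATURE` pattern, the one-byte XOR scheme, and the layout (key byte stored in front of the body) are all assumptions made for the example.

```python
SIGNATURE = b"DEMO-SIGNATURE"                   # hypothetical pattern a scanner looks for
PAYLOAD = b"header " + SIGNATURE + b" trailer"  # hypothetical stand-in for a program body


def encrypt_copy(payload: bytes, key: int) -> bytes:
    """Return the key byte followed by the payload XORed with that key."""
    return bytes([key]) + bytes(b ^ key for b in payload)


def decrypt_copy(blob: bytes) -> bytes:
    """Reverse encrypt_copy: read the key from the first byte, then XOR the rest."""
    key = blob[0]
    return bytes(b ^ key for b in blob[1:])


def scan(data: bytes) -> bool:
    """Naive signature scan: is the pattern present verbatim?"""
    return SIGNATURE in data


# Two "generations" of the same body, each encrypted with a different key.
gen1 = encrypt_copy(PAYLOAD, 0x21)
gen2 = encrypt_copy(PAYLOAD, 0xC7)

print("copies identical on disk:", gen1 == gen2)
print("signature found in encrypted copies:", scan(gen1), scan(gen2))
print("signature found after decryption:", scan(decrypt_copy(gen1)))
```

The stored bytes differ from copy to copy, so a fixed byte pattern matches neither. Only after the body has been decrypted, which is what the simulated environment waits for, does the scan succeed.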
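The pack-then-unpack cycle can be illustrated with Python’s standard `zlib` module. This is only an analogy: a real executable packer also adds an unpacking stub to the file, and the sample bytes and marker string below are made up for the demonstration.

```python
import zlib

MARKER = b"readable marker string"

# Hypothetical stand-in for an executable image: repetitive bytes plus a readable string.
original = b"\x90" * 4096 + MARKER + b"\x00" * 4096

packed = zlib.compress(original, 9)   # "pack" the image
unpacked = zlib.decompress(packed)    # "unpack" it again

print(len(original), "bytes before packing,", len(packed), "bytes after")
print("marker visible in packed form:", MARKER in packed)
print("marker visible after unpacking:", MARKER in unpacked)
```

A string search or signature scan over the packed bytes has nothing recognizable to match, which is why the examiner has to unpack first.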
Dozens of packers are available (http://datacompression.info/SFX.shtml), but UPX (http://upx.sourceforge.net) is probably the most popular.
Malicious hackers will complicate disassembly by modifying a packed file’s header (which can be used to identify a packed file) so that it cannot be readily identified but will still execute. In these cases, the modified packed file header must be suspected, and it can be fixed or bypassed using a feature-enabled disassembler (such as PE Explorer or IDA Pro).
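A naive packer check, and the reason header tampering defeats it, can be sketched as follows. The sketch assumes that unmodified UPX output carries the section names `UPX0` and `UPX1` and the magic string `UPX!`, which is how UPX-packed Windows executables commonly appear. The function name and the sample byte strings are hypothetical, and a real tool would parse the executable’s section table rather than search raw bytes.

```python
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")


def looks_upx_packed(data: bytes) -> bool:
    """Return True if any well-known UPX marker appears in the first 4 KB."""
    header = data[:4096]
    return any(marker in header for marker in UPX_MARKERS)


# Hypothetical header fragments, not real executables.
untouched = b"MZ" + b"\x00" * 64 + b"UPX0\x00\x00\x00\x00UPX1\x00\x00\x00\x00" + b"UPX!"

# Simulate an attacker renaming the markers while leaving everything else alone.
tampered = (
    untouched.replace(b"UPX0", b"ABC0")
    .replace(b"UPX1", b"ABC1")
    .replace(b"UPX!", b"ABC!")
)

print("untouched identified as UPX:", looks_upx_packed(untouched))
print("tampered identified as UPX:", looks_upx_packed(tampered))
```

Once the markers are renamed, the simple check fails even though nothing else in the file has changed, so the examiner has to suspect packing on other grounds and repair or bypass the header.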
Many malware programs are specifically coded to defeat easy disassembly. This can be accomplished by the previously discussed methods—stealth, encryption, and packing—or by adding instructions that confound debuggers and disassembly programs.
Most debuggers and disassemblers allow the examiner to execute the code step by step. Single-stepping is done by having the CPU raise an Int 1h (debug exception) after every machine-language instruction. The examiner can also stop execution at a chosen instruction, called a breakpoint. Breakpoints can be set in software (by inserting an interrupt instruction into the code) or by using special debugging features available in the CPU. Malware programs became creative by inserting their own breakpoints, which execute before the debugger can execute its own.
Another malware trick is to place code and data in unexpected places. For example, instead of keeping executable code in the code segment, a malware program could place it on the stack. Or instead of placing its variables in the .data segment, it places them in the .code segment. All of these techniques make the disassembler’s job harder.
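Native anti-debugging tricks depend on CPU interrupts, but the underlying idea, code that checks whether something is watching it execute, has a high-level analog. The Python sketch below is an illustration by analogy rather than a native technique: Python debuggers such as `pdb` install a trace function with `sys.settrace`, and `sys.gettrace` reveals whether one is present. The function names `is_being_traced` and `guarded_routine` are hypothetical.

```python
import sys


def is_being_traced() -> bool:
    """Return True if a trace function (as installed by pdb and similar tools) is active."""
    return sys.gettrace() is not None


def guarded_routine() -> str:
    """Behave differently when traced, as evasive code might."""
    if is_being_traced():
        return "decoy behavior"
    return "real behavior"


print(guarded_routine())
```

An examiner who steps through such code sees only the decoy path, so checks like this have to be found and neutralized before the real behavior can be studied.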
The release of the 386 CPU, and its successors, made it more difficult for antidisassembly tricks to be successful.
There are malicious programming tutorials all over the Internet and dozens of books to choose from on the subject. Using a search engine, search for the term “disassemble malware,” and you will find dozens of useful links. If you prefer books, as I do, consider the following suggestions:
Hacker Disassembling Uncovered, by Kris Kaspersky et al. (http://www.amazon.com/exec/obidos/ASIN/1931769222), is one of the best books on disassembling malicious code. Although it’s a bit dated, it uses IDA Pro and other common tools to look at and teach hands-on disassembly. It teaches techniques from the hacker and cracker point of view.
Malware: Fighting Malicious Code, by Ed Skoudis and Lenny Zeltser (both of SANS) (http://www.amazon.com/exec/obidos/ASIN/0131014056), is a highly ranked book, discussing, in technical detail, malware vectors.
Exploiting Software: How to Break Code, by Greg Hoglund and Gary McGraw (http://www.amazon.com/exec/obidos/ASIN/0201786958), teaches about disassembly while trying to teach the basics of good coding.
The Shellcoder’s Handbook: Discovering and Exploiting Security Holes, by a who’s who of computer security experts (http://www.amazon.com/exec/obidos/ASIN/0764544683), goes beyond coding details and explores the different ways to keep your system secure.