BLACK BOX ANALYSIS | Anti-Hacker Tool Kit, Third Edition

Reverse engineering using only the assembly or byte code is an arduous, time-intensive process. If you can't access the source because it has been stripped or obfuscated, the best course of action is to start with the things you can easily see. What text strings does the file contain? Does the program try to access the network? What other files does the program rely on, and is there any information you can glean from those support files? The amount of nonprogrammatic analysis you can perform ranges from running a few command-line utilities against the file all the way to creating a true "sandbox" that tracks every movement of the binary as it executes on the system. If you expect to do this more than once or twice every few months, it is well worth your time to set up a lab sandbox in which to execute suspicious binaries so you can quickly and easily diagnose them. We will discuss the creation of such a sandbox later in the chapter.

Viewing the Text Strings in a Binary

Seeing what text strings exist in a binary can be extraordinarily useful. These strings can give you clues to what the binary does, as well as information that the programmer thought was secret. For instance, you can often determine whether a program accesses the Internet, because addresses in canonical form (http://www.google.com, for example) are stored in the executable as text strings. Likewise, if the programmer has set a password for a backdoor in the executable (common when the program runs in the background and waits for someone to connect with the correct credentials before activating), that password may be stored as a text string in the file. Let's take a look at a sample:

 $ strings backdoor
 ...
 l33t0wn3d
 ...

As you can see from the output, the executable contains the string l33t0wn3d, which is more than likely some kind of password for the program. This can go a long way toward helping us if we can figure out where to input it.
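To make the idea concrete, here is a minimal Python sketch of what a strings-style scan does: it looks for runs of printable ASCII bytes of some minimum length. The function name extract_strings, the four-character default, and the sample bytes are assumptions made for illustration; this is not the implementation of the strings utility itself.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # Printable ASCII runs from 0x20 (space) to 0x7e (~); tab is accepted too.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical bytes standing in for part of a binary such as "backdoor".
sample = b"\x7fELF\x01\x00\x00l33t0wn3d\x00\x00ab\x00/bin/sh\x00"
print(extract_strings(sample))  # prints ['l33t0wn3d', '/bin/sh']
```

Short runs such as "ELF" and "ab" are dropped because they fall under the minimum length, which is what keeps the output readable on a real binary.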

Using LSOF to Determine What Files and Ports a Binary Uses

LSOF is an open-source utility that can be extremely useful in determining what a program does. LSOF is short for LiSt Open Files, and it shows what files each running program has open. This is useful not only for determining what supporting files a program uses; because network sockets are treated like files, you can also see what network connections a program has open, both for transmission and for listening. Let's take a look at the program output:

 lsof -p 600
 COMMAND  PID USER FD TYPE        DEVICE SIZE/OFF NODE NAME
 ...
 backdoor 600 root 8u inet 0x30002432228      0t0  TCP *:2950 (LISTEN)
 backdoor 600 root 7u inet 0x300031f1410      0t0  TCP out:*->199.1.90.2:* (IDLE)
 ...

As you can see, there are two network sockets associated with backdoor. One seems to be an outgoing connection that is idle. The other is of more interest to us: it is a TCP listener on port 2950. This could be the way that outside hackers communicate with the backdoor. Now that we know the ports and communication channels that the binary uses, we can look at how it communicates with the outside world.
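When a process has many open files, picking the listeners out by eye gets tedious. The following Python sketch filters lsof-style text for listening TCP sockets. It assumes the column layout shown in the sample above, and the helper name listening_sockets is hypothetical; other lsof versions or options may lay the columns out differently.

```python
def listening_sockets(lsof_output: str) -> list[tuple[str, int, str]]:
    """Return (command, pid, port) for each listening TCP socket in lsof text."""
    results = []
    for line in lsof_output.splitlines():
        fields = line.split()
        # A listening TCP socket line ends with: TCP <addr>:<port> (LISTEN)
        if len(fields) < 3 or fields[-1] != "(LISTEN)" or fields[-3] != "TCP":
            continue
        if ":" not in fields[-2]:
            continue
        # The port is kept as text because lsof may print a service name.
        port = fields[-2].rsplit(":", 1)[1]
        results.append((fields[0], int(fields[1]), port))
    return results


# Lines modeled on the sample output above.
sample = """COMMAND  PID USER FD TYPE        DEVICE SIZE/OFF NODE NAME
backdoor 600 root 8u inet 0x30002432228      0t0  TCP *:2950 (LISTEN)
backdoor 600 root 7u inet 0x300031f1410      0t0  TCP out:*->199.1.90.2:* (IDLE)"""
print(listening_sockets(sample))  # prints [('backdoor', 600, '2950')]
```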

Determining Ports Using NMAP

Sometimes the easy way can be just as effective as more complicated measures. Nmap is a popular port scanner used to determine which TCP/IP ports are open on a machine. While the binary is running, execute a port scan against your external network interface to see if it is listening on a port. There is a caveat here, however. Subverting port scanning, while not trivial, is not a difficult task. We have seen some rootkits that use obsolete protocols that are overlooked in a port scan (covert channels that are neither TCP nor UDP). Another method is to use some kind of knock-knock protocol, such as requiring an ICMP packet of a certain size before the backdoor will reveal itself. If you find a port using nmap, great, but don't assume that just because you don't see it at first glance it isn't there. If you are graphically inclined, you can use the GUI version of nmap, nmapFE, to help with the process.
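The simplest kind of scan just attempts a full TCP connection to each port and records which attempts succeed. The Python sketch below illustrates that idea only; it is not how nmap is implemented, the function name open_tcp_ports is made up for this example, and, as the caveat above warns, it will miss anything that is not a plain TCP listener. Run it only against machines you are authorized to test.

```python
import socket


def open_tcp_ports(host: str, ports, timeout: float = 0.5) -> list[int]:
    """Return the ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


if __name__ == "__main__":
    # Check the loopback interface for the listener seen in the lsof output.
    print(open_tcp_ports("127.0.0.1", [2950]))
```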

Using a Sniffer to Determine Network Traffic

Since we now have the ports used by the backdoor, we can easily set up a sniffer to monitor the traffic flowing in and out of the program. This can be as simple as setting up tcpdump or Ethereal to watch the inflow and outflow, looking for patterns and data. You should also think about setting up an IDS such as snort to see whether anything the program does matches a signature for a known Trojan or backdoor. This can help identify the family of the malware, and it can also reveal whether the program is using some covert channel for communication that LSOF may have missed. (A few years back the Honeynet Project ran a contest featuring a binary that used an obsolete protocol to transmit information, and most entries missed it completely because it was so nonstandard.)

Looking at the System Calls

No man is an island. The same holds true for software. Whenever a program writes to the screen, accesses the network, accesses a file, or does any number of other things, it asks the operating system to do the work through system calls, usually by way of the system libraries. This interface is the layer of abstraction between the hardware and application software. If you have a binary that you think has been modified (for example, we have seen versions of ls that were modified to transmit sensitive data when run), and you want to figure out with some certainty what it is doing, looking at the way it calls system routines can help. In the Unix world, the tool for this is known as truss on Solaris and strace on Linux. It allows you to do exactly that. Let's look at some sample output so we can see how to interpret the results:

 strace -p 123
 Process 123 attached - interrupt to quit
 rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
 read(5, "l33t0wn3d", 9)                 = 9
 write(6, "welcome Dr. Faulken", 19)     = 19

As you can see, we attached to the process and monitored its system calls. Two calls are of specific interest. First, the program issued the read system call and received the input "l33t0wn3d". As you discovered earlier using strings, this string is built into the binary and is possibly an activation password. The suspicion is confirmed by the next line, where the program writes the string "welcome Dr. Faulken" back across the network, indicating that the hacker has activated some kind of backdoor routine.
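A real trace can run to thousands of lines, so it helps to pull out only the calls that carry readable data. The following Python sketch picks read and write calls with string arguments out of strace-style text. The regular expression and the helper name string_io_calls are assumptions for illustration; they cover only the simple line shape shown in the sample and do not decode escape sequences inside the quoted data.

```python
import re

# Matches lines shaped like: read(5, "l33t0wn3d", 9) = 9
# The optional "..." covers strings that strace has truncated.
IO_CALL = re.compile(
    r'^(read|write)\((\d+), "((?:[^"\\]|\\.)*)"(?:\.\.\.)?, (\d+)\)\s*=\s*(-?\d+)'
)


def string_io_calls(trace: str) -> list[tuple[str, int, str]]:
    """Return (syscall, file descriptor, data) for each read/write line."""
    calls = []
    for line in trace.splitlines():
        match = IO_CALL.match(line.strip())
        if match:
            calls.append((match.group(1), int(match.group(2)), match.group(3)))
    return calls


# Lines modeled on the sample trace above.
sample = '''rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
read(5, "l33t0wn3d", 9)                 = 9
write(6, "welcome Dr. Faulken", 19)     = 19'''
for call in string_io_calls(sample):
    print(call)
```

Pairing each string with its file descriptor lets you cross-reference the lsof output to see which socket or file the data traveled over.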

Identifying Kernel-hiding Techniques

The methods described so far for identifying and analyzing binaries have been around for decades. Trojan writers understand that employing these methods is the first thing a responder will do when looking for backdoors. As such, they often write code into their programs that modifies the kernel, or otherwise modifies the system the code runs on, to hide the backdoors. Depending on the sophistication of the hiding technique, it can make locating and identifying these Trojans a very difficult prospect. In the best case, they simply modify the process listing program so that it doesn't display their Trojan as running. In the worst case, they actually modify the kernel so that it actively protects the secrecy of the running program. If you are using Linux and have a package manager such as RPM, you can check whether any of the vital binaries have been modified. Here's the check in action:

 # rpm -V ps
 S.5....T /sbin/ps

So what does this mean? Let's look at the flags that RPM gives us in front of the filename:

 S = size change
 M = permissions change
 5 = MD5 changed
 L = Symlink changed
 D = Device change
 U = User change
 G = Group change
 T = Date/Time change
 missing = file is gone

Since ps has been flagged with S, 5, and T, its size, MD5 checksum, and date/time stamp have all changed since the default install was made. This is a very good indication that the binary has been modified, and if you didn't make the change, it's an even better indication that you have been thoroughly rooted and it may be time to scrap the machine and start over.
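If you verify many packages at once, decoding each flag string by hand is error prone. This small Python sketch translates a line of verify output into the descriptions from the legend above. The helper name decode_rpm_verify is hypothetical, flag characters outside that legend are simply ignored, and the sketch assumes a non-empty line whose file path is the last whitespace-separated field.

```python
# Flag meanings taken from the legend above.
RPM_VERIFY_FLAGS = {
    "S": "size change",
    "M": "permissions change",
    "5": "MD5 changed",
    "L": "symlink changed",
    "D": "device change",
    "U": "user change",
    "G": "group change",
    "T": "date/time change",
}


def decode_rpm_verify(line: str) -> tuple[str, list[str]]:
    """Return (path, list of change descriptions) for one line of output."""
    fields = line.split()
    flags, path = fields[0], fields[-1]
    if flags == "missing":
        return path, ["file is gone"]
    # Dots mean "unchanged"; characters outside the legend are skipped.
    return path, [RPM_VERIFY_FLAGS[c] for c in flags if c in RPM_VERIFY_FLAGS]


print(decode_rpm_verify("S.5....T /sbin/ps"))
# prints ('/sbin/ps', ['size change', 'MD5 changed', 'date/time change'])
```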

If the kernel has been modified, however, you have a much more difficult task ahead. The easiest way to proceed is to create a quarantined sandbox machine on which you can monitor the actions of the program without worrying about a compromised machine.

Creating a Sandbox Machine

The obvious way to do this is to build a machine, load it up with monitoring tools such as LIDS, snort, and whatever else you can throw at it, fire up the program, and go to town. However, in my experience it is sometimes better in the long run to use a tool such as VMware to run a virtual machine and then monitor everything through it. This approach has several benefits. First, it is very easy to start over with a clean image if the program obliterates everything on your test machine. Second, since everything is a virtual software replication of hardware, you are afforded a level of access that would be hard or cumbersome to gain on a physical machine alone. Because VMware acts as a virtual bridge between the networking in the virtual machine and the true networking on the host computer, you can monitor, modify, and manipulate the data in ways that would be difficult or impossible over a real network bridge/router configuration. Also, if you want to see how multiple computers running the binary interact with each other (as is the case with DDoS software), you can easily use VMware to create multiple machines and place them all on the same virtual subnet. With real machines you would have to spend time and money getting everything up and running.