Although computer honeypots have probably been around in one form or another since the 1960s (when computer viruses and trojans were first taking shape), they were not widely discussed until 1986, when Clifford Stoll used a physical honeypot to capture a West German hacker. Stoll gained worldwide fame recounting the story in his book, The Cuckoo’s Egg. It’s a good read even for people who are not interested in computer security.
Noted firewall authority Bill Cheswick published his early honeypot experiences in his well-known “An Evening with Berferd” paper (http://www.deter.com/unix/papers/berferd_cheswick.pdf) in 1991. Cheswick developed a few fake services, created fake password files, and added them to his production system at AT&T. He also wrote a script that pulled fake service activity from the logs. He recorded hacks involving SMTP, FTP, finger, and multiple attempts to escalate privileges. Cheswick eventually set up a sacrificial environment called the Jail, which looked a lot like today’s honeypot systems, to monitor and record the hacker’s activity.
Dr. Fred Cohen (http://www.all.net), considered the father of computer virus theory, developed a honeypot called the Deception Tool Kit (DTK) in 1997. It is a free collection of Perl scripts and C executables designed to respond to hacker probes as if they were vulnerable systems. It enjoyed much success and it is still in use today.
Until recently, honeypots existed without any standards for data monitoring and control. Few resources for learning about honeypots existed, and people were pretty much doing their own thing. Lance Spitzner changed all of this in 1999, when he legitimized honeypots as a separate and distinct computer security discipline by forming The Honeynet Project. Spitzner, the former head of computer security for Sun Microsystems, researches and promotes honeypots professionally and in his spare time. The Honeynet Project’s research culminated in the Know Your Enemy book, published in 2002, and Spitzner’s Honeypots book, published in 2003. Both are essential reading for anyone new to honeypots, and you’ll find Spitzner and The Honeynet Project team members still eager to interact with anyone interested in honeypots.
Competitive open-source and commercial honeypot solutions really didn’t get started until 2001. The Honeynet Project developed the first standard model for deploying honeypots that is now known as the first-generation (GenI) model. It focuses on data control and data capture for honeypots.
There is actually a third component of GenI honeypots called data collection, which refers to the collection of data from multiple honeypots off a honeynet. Data collection will be covered in Chapters 9 and 10.
Data control means controlling what data goes into and out of the honeypot. Never let the hacker compromise the honeypot in such a way that you no longer have control over the flow of data. It means making sure that malicious packets are always directed toward the honeypot and away from production systems, and vice versa. The ultimate goal of data control is to prevent hackers from using the compromised honeypot to attack other computers.
Low-interaction honeypots emulate services only at a basic level, so they are self-limiting in what the hacker can do. High-interaction honeypots are a different matter and additional mechanisms must be used to maintain control. Many new honeypot administrators rely on being able to respond to alerts quickly and manually shutting down the honeypot if the hacker starts attacking other computers. This works fairly well if you’re there monitoring the honeypot, but not so well if you are miles away from the honeypot when the alert is sent. Most honeypot administrators attempt to automate data control. GenI honeypots do this by using scripts or filters on external routers or firewalls. Outgoing connections can be blocked or limited. If all outgoing connections are blocked, the hacker might become frustrated or suspicious and leave. The Honeynet Project suggests limiting outgoing connection attempts to a certain number of outgoing requests in a given time period. The thought here is that if connections are limited, hackers will be curbed in what malicious mayhem they can cause elsewhere. Honeypot administrators understand that it takes only one malformed packet to cause a DoS attack, but if malicious connections are at least limited, it decreases the risk that the hacker will be successful.
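The idea of limiting outgoing connection attempts to a certain number in a given time period can be sketched in a few lines of Python. This is only an illustration of the counting logic, not The Honeynet Project's scripts; the class name OutboundLimiter, its parameters, and the limit of five attempts per hour are all assumptions chosen for the example. In a real GenI setup, the equivalent decision would be enforced on the external router or firewall.

```python
from collections import deque


class OutboundLimiter:
    """Allow at most `max_connections` outbound attempts per `window_seconds`.

    Hypothetical helper for illustration; the name and limits are assumptions.
    """

    def __init__(self, max_connections, window_seconds):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of recently allowed attempts

    def allow(self, now):
        """Return True if an attempt at time `now` (in seconds) may leave the honeypot."""
        # Forget attempts that have aged out of the sliding window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_connections:
            self._timestamps.append(now)
            return True
        return False  # over the limit: block, and ideally alert the administrator


# Seven outbound attempts, 10 seconds apart, against a limit of 5 per hour.
limiter = OutboundLimiter(max_connections=5, window_seconds=3600)
decisions = [limiter.allow(now=t) for t in range(0, 70, 10)]
```

Here the first five attempts are allowed and the last two are blocked. A sliding window is used rather than a counter that resets on the hour, so a hacker cannot double the allowance by straddling a reset boundary.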
Data control tools and scripts were developed by The Honeynet Project. Some of the scripts, like those for Check Point’s FireWall-1 (FW-1) firewall, can be used in Windows or Unix, but there are significantly more tools and scripts for the Unix world. Windows users are left to develop their own mechanisms and solutions, with varying levels of success. No matter how you approach it, data control is one of the toughest issues for the honeypot administrator to tackle.
After making sure hackers go only where you want them to go, it’s now important to capture what they do.
Data capture refers to monitoring and logging everything the hacker does. The information should be recorded on remote management computers using a secure method and without alerting the hacker. Data capturing should be done in complementing layers (think of the defense-in-depth principle), with different mechanisms capturing different types of data. Data capturing can be done by many methods, including the following:
Honeypot log files
Network device logs
You always want to capture all network packets, headed to or from the honeypot, using either a network sniffer or an IDS. Full packet decodes are often the best way to identify what’s happening between the honeypot and outside world. They will capture file transfers, instant messaging communications, and remotely typed in keystrokes. Unfortunately, hackers are increasingly using encrypted communications to prevent us from prying. In these cases, it is essential that something monitor the hacker’s commands and communications on the honeypot before the traffic is encrypted. The solution is to install a keystroke-monitoring program to capture every keystroke the hacker types in on the honeypot before it is encrypted or after it is decrypted.
Of course, you can’t simply install a keystroke-logging program on the honeypot and hope the hacker doesn’t see it. Most keystroke-logging programs write data to a local file or send it to a remote computer. Either way, the hacker is liable to notice it if you don’t take steps to hide it. If the hacker does notice the keystroke-logging program, or any logging for that matter, it’s game over. The hacker will leave, or format the drive and then leave. The keystroke-monitoring mechanism must be hidden. This is often done by renaming the keystroke-monitoring program to something the hacker wouldn’t notice as unusual. For example, it could be renamed acroRd32.exe or atigirt.exe, posing as the ubiquitous Adobe Acrobat Reader or ATI video driver utility programs. There are even a few programs, like Sebek (http://www.honeynet.org/tools/sebek) and ComLog (http://iquebec.ifrance.com/securit/indexen.html), made specifically to hide as they capture keystrokes. These programs will be covered in Chapter 10.
You also want to capture everything the hacker modified on the honeypot, preferably in one exception report. Did she upload software, modify the Registry, change file permissions, add user accounts, elevate permissions, or modify executables? One of the easiest ways to answer these questions is to use snapshot software. Snapshot software (also known as integrity checkers) takes a digital snapshot of the system before and after the compromise. Tripwire (http://www.tripwire.com) is considered the commercial leader in the field of snapshot software. It works by creating a baseline database of files and their digital hashes. It logs file size, creation date, security-access controls, alternate streams, and documents 24 critical Registry areas. It can track modifications, deletions, and last-access dates. There are several free snapshot utilities for Windows systems, including Sysdiff (ftp://ftp.microsoft.com/bussys/winnt/winnt-public/fixes/usa/NT40/utilities/Sysdiff-fix) and Winalysis (http://www.winalysis.com). None are as good as Tripwire, but a few are close. I will cover these utilities in Chapter 10.
An open-source Unix version of Tripwire can be found at http://sourceforge.net/projects/tripwire.
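To make the snapshot idea concrete, here is a minimal sketch of the baseline-and-compare approach, assuming that only file contents are tracked. It is not how Tripwire, Sysdiff, or Winalysis are implemented; the function names snapshot and exception_report are hypothetical, and a real integrity checker would also record permissions, timestamps, alternate streams, and Registry data.

```python
import hashlib
import os


def snapshot(root):
    """Map each file under `root` (by relative path) to its SHA-256 digest."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def exception_report(before, after):
    """Compare two snapshots and list what was added, removed, or modified."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            path for path in before if path in after and before[path] != after[path]
        ),
    }
```

You would call snapshot() on the honeypot's file system before exposing it, store the result somewhere the hacker cannot reach, and call it again after the compromise. Passing both results to exception_report() gives the single list of changes described above.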
Data capturing should also be done on any honeynet device that has logging. Somewhere between the outside world and your honeypot will sit one or more network devices. GenI honeypots are usually separated from the main network by a firewall, router, and switch (see Figure 1-2 earlier in this chapter). All of these devices have logging features that should be enabled. When you are analyzing the attack on your honeypot, you’ll be glad that you had all these layers of data captured. One layer will pick up what the others did not. Together, they will paint a picture that leaves little to the imagination. You will capture hacker mistakes, typos, file uploads, chats, and unexpected new exploits.
The firewall is usually a production firewall and serves as the first layer of data control and data capturing (see Figure 1-2). A GenI honeypot sits off a firewall port, preferably the DMZ segment. This gives the honeypot its own segment and separates it from the internal production network. Another router is placed between the firewall’s DMZ port and the honeypot to give an additional layer of data control and monitoring. Testing from The Honeynet Project showed that the router’s extra layer protected the production network from detection by the hacker. The switch is implemented so that port mirroring can be accomplished. Port mirroring (also known as port spanning) is a switch feature that allows one port to get copies of all traffic headed to another port. In this case, you would want your IDS/packet-capturing computer to sit on the management port receiving a copy of all traffic headed to and from the port the honeypot sits on. This setup makes it very difficult for the hacker to discover the monitoring of the honeypot. You should make sure the switch you use with your honeynet allows port mirroring.
All network devices used in the honeynet should be secured. This means physically securing the devices, using updated firmware, using complex passwords, changing the default administrator account name if possible, disabling unneeded features, and encrypting communications traffic between the management workstation and the network device.
GenI honeypots have always bothered security researchers for two reasons. First, the idea that hackers could use a compromised honeypot to attack even one innocent remote host with one malicious packet is a technical problem and an ethical dilemma. No researcher wants to assist a hacker in attacking somebody else. Putting aside the ethical dilemma for the moment, there are potential legal risks for allowing it to happen (see the “Risks of Using Honeypots” section later in this chapter). How do you prevent the hacker from attacking another computer using your honeypot without the hacker knowing?
Second, GenI honeypots have a higher than desired chance of being detected by the hacker. The extra router, because it decrements the time-to-live (TTL) counter in every packet header, could alert hackers to the fact that they are on a honeypot. Conventional hidden keystroke loggers can always be found out. With encrypted communications increasing, how can you capture the hacker’s keystrokes, record them to a remote computer for safekeeping, and make sure the hacker does not notice?
The second-generation (GenII) model, illustrated in Figure 1-4, responds to the GenI model deficiencies with a significant architecture change and three new mechanisms. The biggest change is using one network device, known as a honeywall gateway, to implement layer 2 bridging (versus routing), an inline IDS, and packet capturing—all on one computer. The previous data control problems are minimized by replacing firewall filters with an inline IDS to manipulate outgoing traffic (for example, using Snort in Replace mode). When malicious outgoing traffic is detected, the IDS changes it just enough so the attack becomes harmless.
Figure 1-4: GenII honeypot setup
For example, the Code Red worm buffer overflow exploit begins with the following command:
An IDS can detect the malicious packet, respond to its presence, and then change it to this:
The one-byte change, from the a to an o in the word default in the exploit command, makes the attack harmless. Unless the hacker tests her attack against a known vulnerable host, she will just think the attack was unsuccessful. Some researchers have suggested that the outgoing packets be redirected to another honeypot. That way, the hacker thinks she is being successful, but she is actually hacking yet another honeypot. GenII data control, if it can be pulled off successfully and reliably, is head and shoulders above the previous technology.
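The substitution itself is simple enough to sketch in Python. This is an illustration of the byte replacement only, not of how an inline IDS is configured. The function name neutralize is hypothetical, and the request string is a shortened stand-in for a Code Red style probe against default.ida, not the exact command.

```python
def neutralize(payload, signature=b"default.ida", replacement=b"defoult.ida"):
    """Return the payload with the signature overwritten by a same-length replacement."""
    if len(signature) != len(replacement):
        raise ValueError("replacement must keep the payload length unchanged")
    return payload.replace(signature, replacement)


# Hypothetical outbound request resembling a Code Red probe (padding shortened).
outbound = b"GET /default.ida?NNNNNNNN HTTP/1.0\r\n\r\n"
rewritten = neutralize(outbound)
```

The replacement is kept the same length as the signature so the packet size does not change. A real inline device would still need to recompute checksums after altering the payload.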
The second change from GenI technology is the use of a layer 2 bridge device instead of a router to move malicious packets. This prevents the IP packet’s TTL number from being decremented the way it would be with a router. The layer 2 bridge can be combined with the inline IDS to forward all malicious packets to a honeypot. In Figure 1-4, notice that both the honeypot and the production network are on the same side of the firewall. This allows the honeypot to share the same network IP address scheme and the same broadcast domain, and makes it appear as if it were on your production network. But because the honeywall is directing traffic, there is little risk to the production network.
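The TTL difference between the two designs comes down to simple arithmetic, sketched below. The function ttl_after_path and the device labels are hypothetical, and the starting value of 128 is an assumption based on the usual Windows default.

```python
def ttl_after_path(initial_ttl, devices):
    """Return the TTL remaining after a packet crosses the listed devices in order."""
    ttl = initial_ttl
    for device in devices:
        if device == "router":
            ttl -= 1  # layer 3 forwarding decrements the IP TTL by one
        elif device == "bridge":
            pass  # layer 2 forwarding leaves the IP header untouched
        else:
            raise ValueError(f"unknown device type: {device}")
    return ttl


# A packet leaving a honeypot with an assumed initial TTL of 128.
gen1_ttl = ttl_after_path(128, ["router"])  # GenI: extra router in the path
gen2_ttl = ttl_after_path(128, ["bridge"])  # GenII: honeywall bridge in the path
```

With the GenI router in the path, the packet arrives with a TTL of 127, one hop lower than a host sitting directly on the segment would show. With the GenII bridge, it arrives with the TTL still at 128, so the honeywall adds no visible hop.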
Unfortunately, there are no preconfigured Windows-based honeywall solutions. The Honeynet Project has released a GenII honeywall that runs from a bootable CD-ROM (http://www.honeynet.org/tools/cdrom). The goal is that you will be able to take a spare computer, attach it to your network, boot from the CD-ROM, do some configuration, and have a honeywall up and running in the shortest period of time possible. The honeywall uses Linux-based operating system utilities and tools. It requires a bit of configuration and an understanding of basic Linux commands. The CD-ROM doesn’t come with a honeypot or the other monitoring tools you’ll need, but it eliminates days of work of trying to build your own honeywall.
The third new component of the GenII model is a better way to hide the keystroke-logging mechanism. GenII honeypots modify the underlying OS’s kernel in such a way that a keystroke-logging program is running all the time, but it’s virtually undetectable. A program called Sebek (mentioned earlier), which means “watching over you” or “crocodile god” (depending on which source you read), has been developed for Unix and Windows platforms. After Sebek is bound to the OS kernel, it collects keystrokes (and other local GUI information) and sends them to a remote predefined computer. The information it sends is bound to the OS kernel in such a way that it is nearly impossible for hackers to find. Interestingly, its hiding technique was learned from a Unix malware program.
The Honeynet Project has already defined future honeypot generation technology. The goal is to collate data from multiple distributed honeypots (or honeynets) with other security-related devices and to provide relevant data that can be used in a proactive defense. Most security experts see this as the Holy Grail for the computer security industry, and honeypots and IDSs are leading the way.