Log Analysis Basics | Inside Network Perimeter Security (2nd Edition)

Now that we've discussed the basics of network log files, let's dig into the really interesting material: how to analyze the log files after you have them. Log file analysis is an incredibly important area of network security that is often overlooked. If network log analysis is not being performed regularly and thoroughly in your environment, your organization's perimeter defenses are significantly weakened because you and other administrators are missing a large part of the overall security picture.

Getting Started with Log Analysis

When you're just starting to perform log analysis, you might find it difficult. Let's be honest: looking at page after page of cryptic log entries probably isn't how you want to spend your time. You have many other tasks to do, and most of those tasks are attached to people who are calling you, wanting to know when you're going to be taking care of them. Log files don't yell at you or beg for your attention, so it's easy to ignore them or forget about them altogether. And when you first start analyzing your logs, it's bound to take a significant amount of your time.

But after you have been reviewing your logs on a regular basis for a while and you have automated the log analysis (as described in the "Automating Log Analysis" section later in this chapter), you will find that it doesn't take as much of your time as you might think, and it can actually save you time and headaches down the road. The hardest part is getting started. Here are some tips that will help you start analyzing your logs:

Establish a set time each day to review your logs, and stick to that schedule. Preferably, do your review at a time of day when you will have minimal interruptions. If you start making excuses for skipping your log file review, you will most likely stop reviewing them altogether. Just hang in there; it will become easier and faster over time.
Choose a reasonable length of time for your daily log review and analysis session. This is highly dependent on how many log entries you will be reviewing and on the security needs of your environment. An hour a day is probably a good starting point; if that amount of time doesn't work for you, adjust it as needed.
Decide how deeply you want to analyze your logs. There is value in doing an overall review of all the log files, but also in doing an in-depth analysis of log excerpts. When you're starting, it's probably good to try both techniques to get a better "feel" for your log files.

Note

One method that might help you to "break the ice" and start analyzing your logs is to do searches on them using keywords such as "blocked," "denied," and "refused." With many log files, this is a quick-and-easy way to identify log entries that might need further review.
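As a rough illustration of that keyword search, here is a minimal Python sketch. The sample log lines are invented for illustration and do not follow any particular product's format; only the three keywords come from the note above.

```python
import re

# Keywords from the note above; extend the tuple to suit your own devices.
KEYWORDS = ("blocked", "denied", "refused")
PATTERN = re.compile("|".join(KEYWORDS), re.IGNORECASE)

def find_keyword_entries(lines):
    """Return (line_number, text) pairs for entries containing any keyword."""
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if PATTERN.search(line)]

# Hypothetical sample entries, invented for illustration only.
sample = [
    "Jan 10 03:12:01 fw1 connection denied src=192.0.2.10 dst=198.51.100.5 dport=23",
    "Jan 10 03:12:05 fw1 connection accepted src=192.0.2.44 dst=198.51.100.5 dport=80",
    "Jan 10 03:13:40 fw1 sshd: Connection REFUSED from 203.0.113.7",
]

for number, text in find_keyword_entries(sample):
    print(number, text)
```

To search a real file, pass an open file object instead of the list, for example `find_keyword_entries(open(path))`, where `path` is whatever log file you want to examine.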

This brings up a critical point: the concept of "feel." Much of the motivation for reviewing your log files regularly is that you will get a feel for what your log files normally look like. The only way to achieve this is through regular reviews. After you have a sense of the baseline, it should be easy for you to spot most deviations from that baseline in the future. If you know what entries are normally recorded in your logs, and one day you see an entry that you have never seen before, you will naturally target that as something to be investigated further. If you didn't have that feel for your logs, you probably wouldn't even notice such an entry. That's the hidden value in performing regular analysis: the ability to sense when something unusual has occurred. After you have that feel, you can increasingly rely on automation to perform the basic log analysis for you, selecting only those items that are of particular interest to you.

Fun with Log Files?

You might think that reviewing log files is not the most exciting way to spend your time. Personally, I find log analysis to be a fascinating area of network security, and it's actually become a hobby of mine. Honest! When I am reviewing a potential incident, I think of log file analysis as a jigsaw puzzle, with some pieces that don't look like they fit at all, and other pieces that seem to be missing. Each piece has something to contribute to the picture, and figuring out how they all fit together is really quite addictive.

When I first started doing log analysis, I found it to be rather boring and difficult. I didn't understand most of the log formats, and I usually didn't realize the significance of the data I was reviewing. Over time, my knowledge grew, and trying to decipher the activity behind the log entries became a welcome challenge. Try log file analysis; you might be surprised at how much you like it.

Automating Log Analysis

After you have been manually reviewing and analyzing your log files for a while, you will be able to glance at many log entries and immediately understand their significance. Most log entries will probably be of little or no interest to you. Over time, you will notice that fewer of the entries merit your attention. This is where automated log analysis becomes so helpful to you. By now, you know what log entries you really want to see, or, more correctly, you know what log entries you really don't want to see. By automating parts of the log analysis process, you can generate a report of only the unusual activities that you would like to investigate further, which will save you a great deal of time.
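One simple way to act on "entries you don't want to see" is an exclusion filter: list patterns for the entries you already consider routine, and report everything else. The sketch below assumes text logs; the ignore patterns and sample lines are hypothetical and would need to reflect your own baseline.

```python
import re

# Hypothetical patterns for entries already judged routine in this environment.
IGNORE_PATTERNS = [re.compile(p) for p in (
    r"\baccepted\b",
    r"\bntpd\b.*\bsynchronized\b",
)]

def unusual_entries(lines, ignore=IGNORE_PATTERNS):
    """Yield only the entries that match none of the ignore patterns."""
    for line in lines:
        if not any(pattern.search(line) for pattern in ignore):
            yield line.rstrip("\n")

# Hypothetical sample entries, invented for illustration only.
sample = [
    "Jan 10 03:12:05 fw1 accepted src=192.0.2.44 dport=80",
    "Jan 10 03:12:09 fw1 accepted src=192.0.2.45 dport=443",
    "Jan 10 03:14:31 fw1 interface eth1 entered promiscuous mode",
]

for entry in unusual_entries(sample):
    print(entry)
```

Filtering by exclusion rather than inclusion means that an entry type you have never seen before still reaches the report, which is usually what you want.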

Log analysis automation can be a bit tricky, depending on the format of the log files. Many log files are in a text format, which means they are typically easy to review using automated techniques. However, other log files are in binary or proprietary formats that cannot be automatically reviewed in their native form. In some cases, you can export the log file to a text file, either through the application that recorded the log file or through a separate log conversion utility. In other cases, you might not be able to access the log file information unless you use a viewer that the application provides; in that situation, you probably can't automate the review of that log file. Whenever possible, it is best to handle log files that are in some sort of text file format, such as tab-delimited or comma-separated values (CSV). This will make the rest of the automation process far easier.

Another potentially difficult aspect of log analysis automation is handling the volume of the logs. Depending on the amount of traffic being monitored and what events are being logged, a single log could contain millions of entries a day. Remember, you might be reviewing dozens, hundreds, or even thousands of logs, depending on the size and composition of your environment. In such a case, you will need to choose an automation method that not only can process that number of entries in a timely manner, but also has adequate storage space. Some of the possible automation methods are discussed in the next section, "Getting the Right Data from Log Files."

As you do network log analysis, you will discover that reviewing the "bad" log entries is often insufficient. Sometimes you will also want to look through the original log file to see other entries from the same source IP address, for example, or to look for other activity that occurred immediately before or after the event in question. If you save only "bad" log entries and do not preserve the raw logs, you lose the ability to analyze such events. Raw logs are sometimes required for evidentiary purposes, so saving them is often important.

Getting the Right Data from Log Files

As mentioned earlier, it's best to convert your binary log files to a text format such as tab-delimited or comma-separated values. After you have your log files in a suitable format, you want to find the pertinent log entries and generate a report. You can do this in two ways:

Use a searching utility such as grep or sed to look through a log file for records that match or do not match particular strings or patterns.
Import some or all of the data from the log files into a database and then search and analyze the data.

Determining which method to use depends on several factors, including your own preferences. Performance is a major consideration; if you have to process enormous numbers of records, you need to choose a method that can handle it. This is dependent on the number of log entries, the complexity of the searches and analysis you want to perform, and the tools and databases available to you. Databases have a distinct advantage because you can import logs into a database, store the data there, and run reports over days, weeks, or even months of data and significant events. This can identify suspicious activity that occurs over long periods of time, which might never be found by processing a day's worth of data at a time. It is also invaluable when performing incident handling because you can review previously logged events involving particular hosts or protocols.
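To make the database approach concrete, the following sketch loads a CSV export into SQLite (via Python's standard `sqlite3` module) and asks which sources were blocked on several different days. The column layout, table name, and sample rows are assumptions made for this example, not the format of any specific firewall.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export: timestamp, action, source IP, destination IP, port.
SAMPLE_CSV = """\
2005-03-01T02:10:00,drop,203.0.113.7,198.51.100.5,22
2005-03-02T02:11:00,drop,203.0.113.7,198.51.100.5,23
2005-03-02T09:00:00,accept,192.0.2.44,198.51.100.5,80
2005-03-03T02:09:00,drop,203.0.113.7,198.51.100.6,25
"""

def load_events(conn, csv_file):
    """Create the (hypothetical) events table if needed and import CSV rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(ts TEXT, action TEXT, src TEXT, dst TEXT, dport INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)",
                     csv.reader(csv_file))
    conn.commit()

def multi_day_sources(conn, min_days=2):
    """Return (src, distinct_days, hits) for sources dropped on >= min_days days."""
    query = """
        SELECT src,
               COUNT(DISTINCT substr(ts, 1, 10)) AS days,
               COUNT(*) AS hits
        FROM events
        WHERE action = 'drop'
        GROUP BY src
        HAVING COUNT(DISTINCT substr(ts, 1, 10)) >= ?
        ORDER BY days DESC, hits DESC
    """
    return conn.execute(query, (min_days,)).fetchall()

conn = sqlite3.connect(":memory:")  # use a file path to keep data between runs
load_events(conn, io.StringIO(SAMPLE_CSV))
for row in multi_day_sources(conn):
    print(row)
```

A query like this is awkward to express with one-day-at-a-time text searches, which is the advantage the paragraph above describes. An in-memory database is used here only so that the example leaves nothing behind.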

Many different tools can be used to assist in log file processing. grep and sed are two useful text-searching tools from the UNIX world that have Windows equivalents. Programming languages such as Perl can be extremely powerful in parsing log files, selecting entries, and generating reports. If you have a small volume of logs, you might be able to import them into a spreadsheet and analyze them through macros, sorts, and searches. Microsoft Excel has a feature called AutoFilter that allows you to quickly sift through numerous rows of data. For larger volumes of logs or more complex logs, a full-fledged database might provide a robust and powerful solution that can analyze log entries and generate reports of suspicious activities.

Note

If you are a UNIX administrator who needs to perform log analysis on Windows systems, you might find that the task is much easier if you use a collection of cross-platform tools such as Cygwin (http://www.cygwin.com/), which provides a simulated UNIX environment, complete with many UNIX utilities, for Windows machines.

Be aware that it might take considerable time and resources to create a log analysis automation solution. You might need to write programs or scripts to perform the analysis and to generate reports based on the results of that analysis. The investment pays off, however: for every hour you spend creating and testing a strong automation solution, you will save yourself many more hours in analysis time, and you will be able to react much more quickly when an incident occurs.

Automating Check Point FireWall-1 Log Analysis

After spending hours each day looking through FireWall-1 logs, I realized the need for a system to automate some of the mundane tasks of correlating and flagging suspicious records. Unfortunately, we did not have the budget to purchase a commercial log analysis solution, so I decided to do what I could with a custom Perl script. The script I wrote operated as follows:

It extracted records from FireWall-1 log files.
It parsed each entry to pull out relevant fields for all blocked packets.
It counted the number of log entries for each source IP address, destination IP address, and destination port.
It added record counts to numbers saved from previous runs.
It generated a report that specified the top 20 addresses and ports the firewall blocked.

This relatively simple script made it easier for me to get a quick sense for the activity present in each day's logs, and it helped me detect low and slow scans that would not have come to my attention without maintaining a historical record of events.
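The original script was written in Perl and is not reproduced here. The sketch below shows the same sequence of steps in Python. It assumes the firewall logs have already been exported to CSV with columns named `action`, `src`, `dst`, and `dport`; those names, the action values, and the JSON history file are all hypothetical choices for this example, not FireWall-1's actual export format.

```python
import csv
import io
import json
import tempfile
from collections import Counter
from pathlib import Path

FIELDS = ("src", "dst", "dport")   # hypothetical column names
BLOCKED = {"drop", "reject"}       # hypothetical action values

def count_blocked(rows):
    """Count blocked entries per source, destination, and destination port."""
    counts = {field: Counter() for field in FIELDS}
    for row in rows:
        if row["action"] in BLOCKED:
            for field in FIELDS:
                counts[field][row[field]] += 1
    return counts

def add_history(counts, history_file):
    """Add totals saved by previous runs, then save the new totals."""
    path = Path(history_file)
    saved = json.loads(path.read_text()) if path.exists() else {}
    for field in FIELDS:
        counts[field].update(saved.get(field, {}))
    path.write_text(json.dumps(counts))
    return counts

def top_entries(counts, n=20):
    """Return the n most frequently blocked values for each field."""
    return {field: counts[field].most_common(n) for field in FIELDS}

# Hypothetical exported log, invented for illustration only.
SAMPLE = """\
action,src,dst,dport
drop,203.0.113.7,198.51.100.5,23
accept,192.0.2.44,198.51.100.5,80
drop,203.0.113.7,198.51.100.6,23
"""

with tempfile.TemporaryDirectory() as tmp:
    history = Path(tmp) / "history.json"
    for day in range(2):  # simulate two daily runs over the same sample
        rows = csv.DictReader(io.StringIO(SAMPLE))
        totals = add_history(count_blocked(rows), history)
    print(top_entries(totals))
```

Keeping running totals across runs is what makes slow scans visible: a source that appears only a few times a day climbs the cumulative list over weeks even though it never stands out in a single day's log.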

Designing Reports

This might seem to be a silly question, but what do you want to report? Of course, you want to know what suspicious activity is occurring. But in many cases, it's ineffective to generate a single report of all suspicious activity. Some events are going to be much more significant than others in your environment by default, so you might want to emphasize them in your report and summarize the less interesting events. For example, if your daily logs typically include entries for a thousand port scans and two or three DNS exploit attempts, you probably want to see some details on the DNS attacks but only a summary of the port scan activity.
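A report generator can encode that distinction directly: show full details for the categories you care most about and only a count for the rest. In this sketch the category labels and the events are hypothetical; deciding which categories deserve detail is a judgment about your own environment.

```python
from collections import defaultdict

# Hypothetical labels for the categories that should be reported in detail.
DETAIL_CATEGORIES = {"dns_exploit"}

def build_report(events, detail_categories=DETAIL_CATEGORIES):
    """Build report lines from (category, description) pairs."""
    grouped = defaultdict(list)
    for category, description in events:
        grouped[category].append(description)

    report = []
    for category in sorted(grouped):
        entries = grouped[category]
        if category in detail_categories:
            report.append(f"{category}: {len(entries)} event(s), details follow")
            report.extend(f"  {entry}" for entry in entries)
        else:
            report.append(f"{category}: {len(entries)} event(s), summarized")
    return report

# Hypothetical events mirroring the example in the text above.
events = [("port_scan", f"scan from 203.0.113.{i % 250}") for i in range(1000)]
events += [
    ("dns_exploit", "suspicious query from 192.0.2.10"),
    ("dns_exploit", "suspicious query from 192.0.2.99"),
]

print("\n".join(build_report(events)))
```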

A key issue to consider is who will be receiving the report. If the report is just for you, then by all means, design it however you would like to. Perhaps, however, you are designing reports that will list suspicious activity for all of your organization's firewalls. Some system administrators might like a report of all events that involve their hosts that were logged by the firewall. The person who is responsible for web server security might like a report of suspicious web-related activity. It would also be nice to be able to do custom reports on demand, such as quickly generating a report of all events that were logged in the past two weeks involving a particular IP address. Such a capability would be extremely helpful when investigating an incident.

Using a Third-Party Analysis Product

Writing programs or scripts to analyze log files and generate reports might sound like a lot of work. Many times, it is, although it does give you a highly tailored solution. If creating your own automation system is not feasible, you might want to consider using a third-party product that will perform some log analysis and reporting for you. Or you might want to combine your own custom scripts with third-party products to create a solution.

Vendors such as ArcSight, e-Security, GuardedNet, Intellitactics, and NetForensics offer products called security information management (SIM) software that can be of great assistance to you in automating network log analysis. These products are designed to accept logs from various sources, including many brands of firewalls, intrusion detection systems, antivirus software, and operating systems. They can also accept generic text-based logs, such as ones you might create for other products with your own custom scripts. The SIM processes the log entries from all the sources to normalize the data into a consistent format and then performs event correlation and identifies likely intrusion attempts. SIM products offer other useful features, such as log archiving and extensive reporting functionality. You can save yourself many hours by using a SIM product as the heart of your network log analysis solution, but be warned that SIM products involve major software and hardware expenses.

Timestamps

The importance of timestamps in log files cannot be emphasized strongly enough. If an incident ever requires legal action and the timestamps in your log files are out of sync with the actual time, you might have difficulty proving that the incident occurred at a particular time. In many courts, you must be able to prove that your time source is reliable. A much more frequent problem is that you will have a difficult time correlating activities among different log files if all the log files are not synchronized to the same time. It's easy to compensate for this if two boxes are in adjacent time zones and are synched to be exactly an hour apart, but if you are dealing with 20 devices that are each seconds or minutes apart from each other, it becomes nearly impossible to correlate events effectively between any two logs, much less several of them.

The Network Time Protocol (NTP) can be used to synchronize time among many different devices on a network, as well as to synchronize the clocks on a network with a highly accurate public NTP time server. A detailed discussion of NTP is outside the scope of this book, but more information on it is available at many sites on the Internet, with the primary page at http://www.ntp.org/.

Hopefully, the logs you will be working with will have synchronized timestamps. However, if you are forced to work with logs that have unsynchronized timestamps, you can make the necessary adjustments to manually synchronize the logs. For example, you might be able to write a script that requests the current time from each logging device at the same time. It could then determine with a reasonable degree of accuracy (a few seconds) how far out of sync each box is. When the log files for that day are analyzed, the timestamps could be adjusted as necessary to bring all log entries to within a few seconds of being correctly synchronized. Of course, if you have a device that is drifting badly out of sync on a daily basis, you should probably be more concerned about taking care of its clock issues than synchronizing its log timestamps!
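The adjustment step might look like the following sketch. How you obtain each device's current time depends entirely on the device, so that part is left out; the device names and offsets below are hypothetical, and the offset is defined here as the device clock minus the reference clock.

```python
from datetime import datetime, timedelta

def measure_offset(device_time, reference_time):
    """Return how many seconds the device clock is ahead of the reference."""
    return (device_time - reference_time).total_seconds()

# Hypothetical measured offsets in seconds (positive: device clock runs ahead).
CLOCK_OFFSETS = {"router1": 42.0, "fw1": -95.0}

def correct_timestamp(device, logged, offsets=CLOCK_OFFSETS):
    """Shift a logged timestamp back onto the reference clock."""
    return logged - timedelta(seconds=offsets.get(device, 0.0))

# Two entries for the same instant as recorded by two unsynchronized devices.
print(correct_timestamp("router1", datetime(2005, 3, 1, 12, 0, 42)))
print(correct_timestamp("fw1", datetime(2005, 3, 1, 11, 58, 25)))
```

Devices that are missing from the offset table are left unchanged, so it is worth logging which devices were actually measured on a given day.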

Note

It is generally recommended that if your organization's systems cover multiple time zones, you configure your systems to log everything using Greenwich Mean Time (GMT), in practice Coordinated Universal Time (UTC), to avoid confusion.
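If some devices cannot be reconfigured and still log local time, the timestamps can be normalized during analysis instead. This sketch assumes each timestamp carries its numeric UTC offset; the timestamp format is a hypothetical one chosen for the example.

```python
from datetime import datetime, timezone

def to_utc(stamp, fmt="%Y-%m-%d %H:%M:%S %z"):
    """Parse a timestamp that includes a UTC offset and convert it to UTC."""
    return datetime.strptime(stamp, fmt).astimezone(timezone.utc)

# The same instant as logged by devices in two different time zones.
print(to_utc("2005-03-01 07:00:00 -0500"))
print(to_utc("2005-03-01 13:00:00 +0100"))
```

Timestamps that record no offset at all cannot be converted reliably this way; for those you would need to know each device's configured time zone.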

So far in this chapter, you have learned about the basics of log files and some of the fundamental concepts of log analysis. Now it's time to look at some real-world analysis examples. For the rest of this chapter, you will examine several classes of devices: routers, network firewalls and packet filters, and host-based firewalls and intrusion detection systems. We will go through several log examples by studying a particular log type's format, reviewing a real example of that log format, and explaining how it can be analyzed. By the time you reach the end of the chapter, you will have had plenty of practice analyzing logs and identifying suspicious and malicious activity. Let's start our journey by looking at router logs.