Log File Formats

Each log file is stored in a different format, depending on its use. Let's explore the text log files and their formats.

W3C Extended Log File Format

The most common type of file format is the W3C Extended log file format. It allows for the greatest flexibility for logging options. The World Wide Web Consortium (W3C), an organization that develops specifications for web technologies, developed this format. The W3C web site is located at http://www.w3.org.

The W3C Extended log file format is a customizable ASCII format that allows you to choose which fields you want to be logged, thereby limiting the log file size by including only necessary entries. Microsoft's implementation of the W3C Extended format includes several extra fields. Before we start configuring extended logging, here's some background on the format.

The Extended Log Format Specifications

The extended log format was created by the W3C to address the limitations identified with the common log file format and to provide a standard for web logging, regardless of operating system or web server. The Extended log file format uses regular ASCII text. One line makes up a directive or an entry.

Log File Directives The extended log file directives contain the information about the log file and which properties are contained in the log file. The directives appear at the beginning of the log and are preceded by a pound (#) sign.

Seven directives are available, but only Version and Fields are mandatory:

  • Version Defines the version of W3C logging that was used to create this log file.

  • Software Specifies which software package generated this log.

  • Start-Date Specifies the date and time this log was started.

  • End-Date Specifies the date and time this log ended.

  • Date Specifies the date and time the directives were added to the head of this log.

  • Remark Comments added by whomever; this entry is ignored by log analysis software.

  • Fields Specifies which fields are used in this log. This is the most important directive, since it details how to read the entry information. The Fields directive also contains a prefix that identifies how the data is associated with the client and/or the server.
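The directive layout described above can be read with a few lines of code. The following is a minimal Python sketch, not part of IIS: the function name parse_directives and the sample header are hypothetical, and the sketch assumes each directive has the form "#Name: value".

```python
def parse_directives(lines):
    """Collect '#Name: value' directive lines into a dictionary."""
    directives = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("#"):
            continue  # not a directive; entries are handled elsewhere
        # Split on the first colon only, so time values keep their colons.
        name, _, value = line[1:].partition(":")
        directives[name.strip()] = value.strip()
    return directives


# Hypothetical header, written in the style this section describes.
sample_header = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Version: 1.0",
    "#Date: 2004-02-11 17:05:32",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
]

directives = parse_directives(sample_header)
fields = directives["Fields"].split()
print(directives["Version"])
print(fields)
```

The Fields value is split on white space because it is the list a reader needs in order to interpret every entry that follows.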

Log File Entries The log file entries are the records of the actual user events or process events. Each entry is made up of fields, and a field name may begin with a prefix that tells you whether the data is associated with the client, the server, or a transfer between the two. The prefixes are listed here:

Prefix  Meaning
c       Client
s       Server
r       Remote
cs      Client to server
sc      Server to client
sr      Server to remote server
rs      Remote server to server
x       Application

Note 

Microsoft's implementation of extended logging doesn't use the sr, rs, or x prefix. In the interest of completeness, they are included here.
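To illustrate how a prefix is separated from a field name, here is a small Python sketch. The function name split_identifier is hypothetical, and the sketch assumes identifiers take one of the forms that appear in this chapter: a bare name (date), prefix-name (cs-uri-stem), or prefix(Header) (cs(User-Agent)).

```python
KNOWN_PREFIXES = {"c", "s", "r", "cs", "sc", "sr", "rs", "x"}


def split_identifier(identifier):
    """Return (prefix, name); prefix is None when the field has none."""
    # Header-style identifiers look like cs(User-Agent).
    if identifier.endswith(")") and "(" in identifier:
        prefix, _, header = identifier[:-1].partition("(")
        if prefix in KNOWN_PREFIXES:
            return prefix, header
    # Otherwise the prefix, if any, is the text before the first hyphen.
    head, sep, tail = identifier.partition("-")
    if sep and head in KNOWN_PREFIXES:
        return head, tail
    return None, identifier


for name in ("date", "time-taken", "c-ip", "cs-uri-stem", "cs(User-Agent)"):
    print(name, split_identifier(name))
```

Checking the leading text against the known prefixes matters because some unprefixed names, such as time-taken, contain a hyphen of their own.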

Each log file entry is listed on a single line, with its fields separated by white space. This avoids the problem of delimiting fields with a specific character, such as a comma: that character might also appear inside a field value and throw off the whole log file. If no value exists for a certain field, a dash (-) is used to mark the space. Therefore, each log entry includes exactly the same number of fields.
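The entry rules above (one entry per line, white space between fields, a dash for an empty field) can be sketched in Python as follows. The function name parse_entries and the sample log lines are hypothetical; this is an illustration under those assumptions, not a complete log reader.

```python
def parse_entries(lines):
    """Yield one dict per entry, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only the Fields directive changes how entries are read.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # malformed line; simply skipped in this sketch
        # A dash marks an empty field, so it is mapped to None.
        yield {name: (None if value == "-" else value)
               for name, value in zip(fields, values)}


# Hypothetical log content for illustration.
sample_log = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status",
    "2004-02-11 17:05:32 192.168.0.10 - GET /default.htm 200",
]

entries = list(parse_entries(sample_log))
print(entries[0]["cs-uri-stem"])
```

Because every entry carries the same number of fields, a mismatch between the value count and the Fields list is a simple way to detect a damaged line.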

Table 11-2 shows the fields defined for logging by the W3C.

Table 11-2: Standard W3C Extended Logging Properties

Field       Description
date        Date on which transaction completed. The date is recorded in YYYY-MM-DD format and uses Greenwich Mean Time (GMT), rather than local time.
time        Time when transaction completed. Time is recorded in 24-hour format and uses GMT, rather than local time.
time-taken  Time taken for transaction to complete, in seconds.
bytes       Number of bytes transferred.
cached      Records whether or not a cache hit occurred.
ip          Records IP address and, optionally, the port number (IIS uses a separate field).
dns         Records the DNS Fully Qualified Domain Name (FQDN).
status      Status, in FTP and HTTP terms.
comment     Comment associated with the status code.
method      Records the method used.
uri         The full URI (Uniform Resource Identifier).
uri-stem    The stem portion of the URI.
uri-query   The query portion of the URI.
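Because the date and time fields are recorded in GMT, a report usually has to shift them to local time. The Python sketch below shows one way to do that; the function name entry_timestamp, the sample values, and the six-hour offset are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def entry_timestamp(date_field, time_field):
    """Combine the date and time fields into a timezone-aware GMT datetime."""
    naive = datetime.strptime(f"{date_field} {time_field}", "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)


stamp = entry_timestamp("2004-02-11", "17:05:32")

# Hypothetical server located six hours behind GMT.
local = stamp.astimezone(timezone(timedelta(hours=-6)))
print(local.isoformat())
```

Keeping the timestamp timezone-aware avoids mixing GMT values from W3C logs with the local-time values used by the other formats in this section.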

Because the W3C format is customizable, other fields can be added. Microsoft's implementation of W3C logging uses the additional fields shown in Table 11-3.

Table 11-3: Microsoft's Extensions to the W3C Extended Logging Properties

Field         Description
username      The username used for this transaction
sitename      The Internet service and instance number that was accessed by a client
computername  The name of the server on which the entry was generated
port          The port number used
win32-status  The status, in Windows terms
version       The version of the protocol used for this transaction
host          The content of any host header used
user-agent    The browser used by the client
cookie        The contents of any cookie sent or received
referer       The address of the previous site visited

Note 

You may be wondering why the term URI (Uniform Resource Identifier) is used, and what happened to URL (Uniform Resource Locator). The answer is complicated, due to the classic and modern interpretations of URIs and URLs. Even W3C and the IETF (Internet Engineering Task Force) admit there's a lot of confusion out there regarding the two acronyms. Because W3C uses URI, for the purposes of this chapter (and to simplify things) you can think of a URI as being the same as a URL.

The Advanced Tab's W3C Extended Logging Options

The Advanced tab's Extended Logging Options screen (shown in Figure 11-3) shows the properties from which you may choose when creating your log file. After each option, its particular prefix appears in parentheses. Microsoft's implementation of W3C logging has been optimized for the most commonly used fields for logging. The extended logging format does use the W3C standard for prefixes and fields.

Figure 11-3: Extended Logging Options screen

Use the checkboxes to select which options you want to use in your log files. The options in Microsoft W3C logging are shown in Table 11-4.

Table 11-4: The Extended Logging Options

Option             Prefix           Description
Date               (none)           Records the date of the transaction
Time               (none)           Records the time of the transaction
Client IP Address  c-ip             Records the client's IP address
User Name          cs-username      Records the username the client uses to connect to the server; anonymous users are represented by a dash (-)
Service Name       s-sitename       Records the server's site name
Server Name        s-computername   Records the server's computer name
Server IP Address  s-ip             Records the server's IP address
Server Port        s-port           Records the port number the site uses on the server
Method             cs-method        Records the method the client uses to access the server (such as HTTP GET)
URI Stem           cs-uri-stem      Records the URI stem the client sent to the server: the path to the document (everything after the server name)
URI Query          cs-uri-query     Records the URI query the client sent to the server, if any
Protocol Status    sc-status        Records the protocol status code (such as 404, for HTTP not found)
Win32 Status       sc-win32-status  Records the Windows status from the server to the client (0 if no error)
Bytes Sent         sc-bytes         Records the bytes sent from the server to the client
Bytes Received     cs-bytes         Records the bytes sent from the client to the server
Time Taken         time-taken       Records the time taken for the transaction to complete, in milliseconds
Protocol Version   cs-version       Records the version of the protocol used (such as HTTP/1.1)
Host               cs-host          Records any host header used by the client to access the server
User Agent         cs(User-Agent)   Records the browser type of the client
Cookie             cs(Cookie)       Records the contents of any cookie used
Referer            cs(Referer)      Records the address of the previous site visited

Substatus Error Codes

W3C Extended log files can be helpful for determining errors in pages. Since IIS 6 doesn't return HTTP substatus error codes to the client, a browser can't help you determine what an error actually is. For example, the HTTP status 404 means 'File or directory not found.' This error could be returned for several reasons, and the substatus codes help you determine which one applies. For example, the HTTP status 404.2 means that the lockdown policy prevents the file from being served; if the request was for an ASP page, you need to enable the web service extension for Active Server Pages. While the full status code appears in the log file, the client receives only the '404' part, not the '.2' part. These substatus error codes are logged only when W3C Extended logging is used.
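As an illustration of reading the full status from a parsed entry, here is a short Python sketch. It assumes the entry is a dictionary keyed by field name and that the substatus is logged in a separate field named sc-substatus; the function name full_status and the sample entry are hypothetical.

```python
def full_status(entry):
    """Join sc-status and sc-substatus (when logged) as 'status.substatus'."""
    status = entry["sc-status"]
    substatus = entry.get("sc-substatus")
    # A missing field or a dash means no substatus was recorded.
    if substatus in (None, "-"):
        return status
    return f"{status}.{substatus}"


# Hypothetical parsed entry for a blocked ASP request.
entry = {"cs-uri-stem": "/page.asp", "sc-status": "404", "sc-substatus": "2"}
print(full_status(entry))
```

Grouping log entries by this combined value is one way to separate genuinely missing files from requests blocked by the lockdown policy.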

Figure 11-4 shows a sample of a W3C Extended log file in Notepad.

Figure 11-4: A sample W3C Extended log file

Even though the W3C Extended log file is the most complicated format, it is the most flexible one and it gives you the greatest range of logging options. In addition, most log file readers work with the W3C Extended logging format. (There's nothing worse than finding out, after logging six months' worth of data into a SQL log, that the log file reader you just purchased doesn't work with SQL logging!)

Microsoft IIS Log Format

The Microsoft IIS log format is an ASCII format that cannot be modified. It includes basic information about each transaction. This format is comma separated, so it imports into Microsoft Excel very well. Because the fields are predefined and fixed, header information is not necessary, unlike with the W3C format. In addition, no extended properties window is available. In the log file, a blank field is represented by a dash (-). Time is represented in local time, in 24-hour format.

MS IIS Log Fields

The fields for the MS IIS log are as follows:

Field                 Description
Client IP Address     The client machine IP address
User Name             Records the username the client uses to connect to the server; anonymous users are represented by a dash (-)
Date                  Records the date of the transaction
Time                  Records the time of the transaction
Service and Instance  Records the service and instance number of a particular site (such as W3SVC1)
Computer Name         The NetBIOS name of the server (interestingly, not the DNS name)
Server IP Address     The IP address of the server
Time Taken            Time taken for transaction to complete, in milliseconds
Bytes Sent            Records the bytes sent from the server to the client
Bytes Received        Records the bytes sent from the client to the server
Service Status Code   Records the protocol status code (such as 404, for HTTP not found)
Windows Status Code   Records the Windows status from the server to the client (0 if no error)
Request Type          Records the method the client uses to access the server (such as HTTP GET)
Target URL            The stem portion of the URI
Parameters            Any parameters passed to a script

Figure 11-5 shows a sample of a Microsoft IIS log file.

Figure 11-5: A sample MS IIS log file
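Since the Microsoft IIS format is comma separated with a fixed set of fields, a line can be read with a standard CSV parser. The Python sketch below is an illustration only: it assumes the fields appear in the order listed in the table above, and the key names, the function name parse_iis_line, and the sample line are hypothetical.

```python
import csv

# Keys follow the order of the field table above (an assumption of this sketch).
IIS_FIELDS = [
    "client_ip", "user_name", "date", "time", "service_instance",
    "computer_name", "server_ip", "time_taken", "bytes_sent",
    "bytes_received", "service_status", "windows_status",
    "request_type", "target_url", "parameters",
]


def parse_iis_line(line):
    """Split one comma-separated line into a dict keyed by IIS_FIELDS."""
    row = next(csv.reader([line], skipinitialspace=True))
    # Ignore anything after the last expected field, such as a trailing comma.
    values = [value.strip() for value in row[:len(IIS_FIELDS)]]
    return dict(zip(IIS_FIELDS, values))


# Hypothetical log line for illustration.
sample = ("192.168.0.10, -, 02/11/04, 11:05:32, W3SVC1, WEB01, 192.168.0.1, "
          "15, 280, 1024, 200, 0, GET, /default.htm, -,")
record = parse_iis_line(sample)
print(record["target_url"], record["service_status"])
```

Before relying on a mapping like this, compare it against a few lines from your own log, because no header in the file states the field order.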

NCSA Common Log File Format

NCSA Common format is a fixed (non-customizable) ASCII format named for the National Center for Supercomputing Applications (NCSA) and used by the earliest web servers, including the CERN HTTP server, the first web server ever made. It was designed as a web server log and is not available for FTP sites. It is, however, available for IIS SMTP and NNTP sites. The NCSA log records basic information about a transaction. The fields are separated by spaces. In the log file, a blank field is represented by a dash (-). Time is represented in local time, in 24-hour format.

NCSA Common Log File Fields

The following fields are used by the log file:

Field                          Description
Remote hostname or IP address  The IP address of the remote user, or the hostname if DNS is available to resolve the name
User name                      The remote login name of the user
Authenticated name             The username used to authenticate on the server, as with password-protected pages
Date                           The date, time, and GMT offset of the request
Request                        The method, URI stem, and protocol used for the query
HTTP status code               Records the protocol status code (such as 404, for HTTP not found)
Bytes transferred              The bytes transferred between the client and the server

Figure 11-6 shows a sample of an NCSA log file.

Figure 11-6: A sample NCSA log file
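The seven NCSA fields can be picked apart with a regular expression. The following Python sketch assumes the widely used layout in which the date is enclosed in square brackets and the request in double quotes; the pattern, the function name parse_ncsa_line, and the sample line are hypothetical.

```python
import re

# One named group per field in the table above.
NCSA_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_ncsa_line(line):
    """Return a dict of the NCSA fields, or None if the line does not match."""
    match = NCSA_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log line for illustration.
sample = ('192.168.0.10 - - [11/Feb/2004:11:05:32 -0600] '
          '"GET /default.htm HTTP/1.1" 200 1024')
record = parse_ncsa_line(sample)

# The request field holds the method, URI stem, and protocol.
method, uri_stem, protocol = record["request"].split()
print(method, uri_stem, record["status"])
```

The brackets and quotes are what allow the date and request fields to contain spaces even though spaces separate the fields.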

Converting Log Files to NCSA Format

NCSA is the one format to which you can convert existing log files from the other formats. To convert an existing log file to the NCSA common log file format, use a utility called convlog.exe, which is located in the %systemroot%\System32 directory. When you convert a log file, any fields that aren't represented in the NCSA log format are discarded. The remaining fields are formatted to the NCSA log standard.

Convlog is useful if you have a log file reader that works only with the NCSA format, or if you want to convert the file for compatibility because other web servers log in NCSA format. Convlog.exe is a command-line utility. To use convlog, simply open up a command prompt and type convlog. You will be shown the proper syntax to use.

The syntax of convlog is

convlog [options] [LogFile]

The options used in convlog are as follows:

Option    Description
-ii       Specifies Microsoft IIS as the input log format
-in       Specifies NCSA common logging as the input log format
-ie       Specifies W3C Extended logging as the input log format
-t ncsa:  Specifies the GMT offset (such as -0600)
-o        Specifies the output directory; the default is the current directory
-x        Specifies to save non-WWW entries to a separate dump file with a .dmp extension
-d        Specifies to convert the IP addresses to DNS names
-l0       Specifies date as MM/DD/YY, the default (U.S. date format)
-l1       Specifies date as YY/MM/DD (Japanese date format)
-l2       Specifies date as DD.MM.YY (German date format)

Convlog Examples Let's convert a log file in MS IIS format, correct it for a six-hour GMT offset, and replace the IP addresses with DNS names. The filename is in021104.log.

convlog -ii in021104.log -d -t ncsa:-0600

Now let's convert the ex040211.log from W3C Extended format to NCSA format. We'll also put it in the logfiles directory on a remote server called server1, and we'll replace the IP addresses with DNS names:

convlog -ie ex040211.log -d -o \\server1\logfiles 

Convlog File Naming When a file is converted, the new log file has the same filename. The extension used is based on whether the DNS conversion option is used. Log files converted without DNS conversion will have an .ncsa extension. Log files with IP addresses converted to DNS names will have an .ncsa.dns extension. The original log file will not be deleted.
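If you convert logs on a schedule, it can help to assemble the convlog arguments in a script. The Python sketch below only builds the argument list from the options documented above; it does not run convlog. The function name build_convlog_args and its parameter names are hypothetical.

```python
# Maps a readable format name to the input-format option from the table above.
INPUT_FLAGS = {"iis": "-ii", "ncsa": "-in", "w3c": "-ie"}


def build_convlog_args(log_file, input_format, resolve_dns=False,
                       gmt_offset=None, output_dir=None):
    """Return the convlog command line as a list of arguments."""
    args = ["convlog", INPUT_FLAGS[input_format], log_file]
    if resolve_dns:
        args.append("-d")  # convert IP addresses to DNS names
    if gmt_offset is not None:
        args += ["-t", f"ncsa:{gmt_offset}"]
    if output_dir is not None:
        args += ["-o", output_dir]
    return args


# The two examples from this section, expressed as argument lists.
first = build_convlog_args("in021104.log", "iis",
                           resolve_dns=True, gmt_offset="-0600")
second = build_convlog_args("ex040211.log", "w3c",
                            resolve_dns=True, output_dir=r"\\server1\logfiles")
print(" ".join(first))
print(" ".join(second))
```

On a server where convlog.exe is available, a list like this could be handed to Python's subprocess.run; that step is left out here so the sketch runs anywhere.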

ODBC Logging

ODBC logging is a more complicated means of logging, and it doesn't offer all the options available with W3C logging. However, if you want to use custom reporting for IIS with reports you've written, rather than an off-the-shelf package, ODBC logging may be for you. The upside to ODBC logging is that all log files for every IIS site you have can be stored in a single location. You may use any ODBC-compliant database, such as MS Access, SQL Server, or even Oracle. IIS does not set up the database for you, so you must set that up beforehand.

Note 

ODBC allows you to access a database for data storage using a standardized program interface.

ODBC logging has fixed data fields, so it cannot be modified. You are also limited to a maximum of 255 characters in any field. Unless you have some pretty long URLs, this shouldn't be a problem. The time in ODBC logging is recorded in local time.

The ODBC Log File Format

ODBC uses the following format for fields, which are not customizable:

Field           Description
ClientHost      The IP address of the client
username        The login name of the user
LogTime         The time of the log entry
service         The IIS identifier of the service (such as W3SVC1)
machine         The computer name of the server
serverip        The IP address the client used to access the server
processingtime  The time it took to process the request, in milliseconds
bytesrecvd      Records the bytes sent from the client to the server
bytessent       Records the bytes sent from the server to the client
servicestatus   Records the protocol status code (such as 404, for HTTP not found)
win32status     Records the Windows status from the server to the client (0 if no error)
operation       Records the method the client uses to access the server (such as HTTP GET)
target          Records the URI stem the client sent to the server: the path to the document (everything after the server name)
parameters      Any parameters passed to a script

Creating a Database for Use with ODBC Logging

To set up the database, you must first create a new database instance. For the purposes of this example, we'll use MS Access.

  1. Start Access.

  2. You'll see a dialog in which you must choose whether to create a blank database, use a wizard, or open an existing file. Choose to create a blank Access database. You are then asked to save the database.

  3. You may choose any name and location you want, since IIS will use a DSN to connect to the database.

After the database is created, you need to create a table to hold the data, which you can do in two ways: you can use the GUI and create the table and all fields by hand, or you can use good old SQL to create the table for you. Let's choose the SQL option.

  1. In the main database window, choose Insert | Query, and then choose Design View.

  2. After you choose to create a new query, the Show Table screen will pop up.

  3. Click Close to take you to the Select Query screen.

  4. Choose View | SQL View, and the view will be changed.

    Tip 

    If you're testing, or using only one web site for logging, Access will work for your logging. If you're serious about logging, and multiple sites are logging to the same database, you're far better off using a more robust database. Microsoft SQL Server Desktop Engine (MSDE) is freely available from Microsoft, and it's robust enough to handle multiple site logging. It can handle logging databases of up to 2 gigabytes. For your insanely huge logging needs, use Microsoft SQL Server.

You can now enter the SQL code to create the table. Here is the code for creating the table:

create table inetlog (
    ClientHost varchar(255),
    username varchar(255),
    LogTime datetime,
    service varchar(255),
    machine varchar(255),
    serverip varchar(50),
    processingtime int,
    bytesrecvd int,
    bytessent int,
    servicestatus int,
    win32status int,
    operation varchar(255),
    target varchar(255),
    parameters varchar(255)
)

This code creates a table called inetlog, containing all the columns needed for storing the data from your IIS site. The syntax of each column definition is [columnname] [data type(max size)]. As shown here, no field is bigger than 255 characters, and the size has to be specified only for varchar (text) fields.

If you would like to name your table something else (say, so that the tables for all your sites are in one database), you can change the name of the table to be created in this SQL statement. Just remember what you've named the table, because you'll need that name when you're setting up logging in your IIS site properties.

Note 

The SQL code shown here is contained on your IIS system in the file %systemroot%\System32\inetsrv\logtemp.sql, and it can be used to create the table for any major database package, not just for Access.

After you've pasted the SQL code into the window, save and close the window, and name your query so you'll know what it is later. All you need to do is run the query, and the table will be created after the customary warning message. You may now close the database. Remember what you named the database and where you put it. We'll use it in the next section. Now you have created a table in your database (ours is called inetlog), ready for use.
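To get a feel for the kind of custom report this table makes possible, you can try the same schema in a throwaway database. The Python sketch below substitutes an in-memory SQLite database for Access, purely so that it runs without any setup; IIS itself does not log to SQLite, and the sample rows are hypothetical.

```python
import sqlite3

# Same columns as the inetlog table created above.
CREATE_INETLOG = """
create table inetlog (
    ClientHost varchar(255), username varchar(255), LogTime datetime,
    service varchar(255), machine varchar(255), serverip varchar(50),
    processingtime int, bytesrecvd int, bytessent int,
    servicestatus int, win32status int, operation varchar(255),
    target varchar(255), parameters varchar(255)
)
"""

conn = sqlite3.connect(":memory:")  # stand-in for the real logging database
conn.execute(CREATE_INETLOG)

# Hypothetical rows in the shape ODBC logging would write.
rows = [
    ("192.168.0.10", "-", "2004-02-11 11:05:32", "W3SVC1", "WEB01",
     "192.168.0.1", 15, 280, 1024, 200, 0, "GET", "/default.htm", "-"),
    ("192.168.0.11", "-", "2004-02-11 11:06:02", "W3SVC1", "WEB01",
     "192.168.0.1", 0, 310, 512, 404, 2, "GET", "/missing.htm", "-"),
]
placeholders = ",".join("?" * 14)
conn.executemany(f"insert into inetlog values ({placeholders})", rows)

# A simple report: number of requests per protocol status code.
status_counts = dict(conn.execute(
    "select servicestatus, count(*) from inetlog group by servicestatus"))
print(status_counts)
```

The same select statement should work against the Access or SQL Server table once IIS has logged some requests to it.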

Creating a DSN for Your Database

Now you'll create a Data Source Name (DSN) so that your system knows where the database is located, and so that you can refer to the database with a simple name, rather than the entire path. Here's how to create the DSN:

  1. Choose Start | Administrative Tools | Data Sources (ODBC).

  2. Choose the System DSN tab.

  3. Click the Add button to add a new data source.

  4. The Create New Data Source screen will appear. Here, you can choose which data source you want to set up. Since in our example we're using Access, we'll choose Microsoft Access Driver (*.mdb).

  5. Click Finish.

After you click Finish, the ODBC Microsoft Access Setup dialog box, shown in Figure 11-7, will open. In this dialog box, you can set up the Data Source Name. You will use this name in the ODBC Logging Properties window to connect to the database, so choose a name that you'll remember. For this example, we'll use inetlogdb. The description is for your use, so you may add any meaningful information here. For this example, we'll enter the path: D:\InetDB\LogFile.mdb.

Figure 11-7: ODBC Microsoft Access Setup

Caution 

When choosing a name for your DSN, using spaces is generally a bad idea, since some programs have issues with spaces in the DSN. Avoid using them.
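If you later write your own reports against the logging database, the DSN is what your code refers to. The Python sketch below builds an ODBC connection string from a DSN using the standard DSN, UID, and PWD keywords, and rejects names containing spaces in line with the caution above. The function name odbc_connection_string is hypothetical, and passing the result to an ODBC client library (for example, the third-party pyodbc package) is an assumption outside the scope of this chapter.

```python
def odbc_connection_string(dsn, user=None, password=None):
    """Build a 'DSN=...;UID=...;PWD=...' string for an ODBC client library."""
    if " " in dsn:
        # Mirrors the caution above: spaces in a DSN can cause trouble.
        raise ValueError("avoid spaces in the DSN name")
    parts = [f"DSN={dsn}"]
    if user:
        parts.append(f"UID={user}")
    if password:
        parts.append(f"PWD={password}")
    return ";".join(parts)


# The DSN name used in this section's example.
print(odbc_connection_string("inetlogdb"))
```

For the Access database in this example, no username or password is needed, so the DSN alone is enough.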

Now that you've named your data source, it's time to select the database.

  1. In the ODBC Microsoft Access Setup dialog box, click Select and browse to the Access database you created earlier; in the example, we indicated the path D:\InetDB\LogFile.mdb.

  2. Select the Access database, and then click OK until you're out of ODBC Administrator.

If you've followed all the steps so far, you're ready to set up ODBC logging in your site properties, and you can skip to the next section. If you don't have MS Access, the next part is for you.

Creating a Database Without MS Access

You can create the Access database from the ODBC Microsoft Access Setup window. You can also perform functions on existing databases.

To access the setup window:

  1. Choose Start | Control Panel | Administrative Tools | Data Sources (ODBC).

  2. In the ODBC Data Source Administrator window, open the System DSN tab.

  3. Click the data source you have created.

  4. Click the Configure button.

  5. Click the Create button.

  6. In the New Database dialog box, choose a name and location for the database.

You can choose from among several options:

  • Format This version number refers to the version of the Jet database engine used, not the version of Access. The default for the database in this option is Version 4.x. It's not a good idea to change it, because versions 3.x and 2.x don't offer Unicode support.

  • System Database Choosing this option creates the database as a system DSN. If you're in the system DSN tab when you create the database, the Data Source Name will be created in the System DSN tab anyway.

  • Encryption This checkbox allows you to encrypt the database so that text editors can't read the information in the database.

  • Network This button allows you to map a drive to another machine, if you want to put your database there.

  • Locale This drop-down box allows you to choose the language for this database.

When you create a database in this manner, you still need to create the table, so it's probably easier to create the database from within MS Access. If you don't have MS Access, however, you can use this procedure to create the database and then use SQL tools to create the table. MS Query, a program that comes with MS Excel, will allow you to do this, because it can run queries against an Access database. See the MS Query documentation for more information.

Setting Up Your Site for ODBC Logging

Now that the database side of ODBC logging is set up, let's set up a web site to log to the database:

  1. Go to the Properties window of the site for which you want to set up ODBC logging, and choose ODBC Logging as the active log format.

  2. Click Properties to open the ODBC Logging Properties window shown in Figure 11-8.

    Figure 11-8: ODBC Logging Properties window

  3. Type in the DSN you created earlier for your database (inetlogdb in our example).

  4. Type in the table name used in the database (inetlog in our example).

    The options of the ODBC Logging Properties window are shown here:

    • ODBC Data Source Name (DSN) Enter the name of the DSN you chose earlier. For our example, we chose inetlogdb.

    • Table Enter the name of the table created by the SQL statement here. In our SQL statement, we chose the name inetlog.

    • User Name and Password If your database requires a username and password for access, you would type them here. Our Access database does not require a username or password, so anything in this field is acceptable, or you can leave it blank.

After ODBC logging is set up here, your log is totally configured and ready for action. New data will not appear in the log database while the site is running, so if you want to open the database for viewing, stop the site that is logging to that database table first.




IIS 6: The Complete Reference
ISBN: 0072224959
EAN: 2147483647
Year: 2005
Pages: 193
