4.4 Why Are Agents Important?

One of the most compelling uses for agent technology is information retrieval, because the volume of information about individuals and companies on the Internet, and in the databases connected to it, has exploded. Based on studies from Forrester Research and the Yankee Group, there are over 1 billion documents on the visible Web, with 7 million documents added daily. More important, the Web is becoming increasingly database driven, and the records in these databases cannot be indexed or retrieved by typical search engines. This is due in part to the rise of technologies such as XML and Active Server Pages (ASP), whose content conventional search engines omit simply because they cannot retrieve the records from these dynamic databases.

These same studies indicate that this dynamic Web is 500 times larger than the visible Web of 1 billion pages. Agent technologies that support special scripting capabilities can adapt to different information types and thus retrieve much more information than normal search engines. In other words, an agent can sense the type of data source, then adjust and convert its search parameters into a query that the source understands. These agents can negotiate with and extract information not just from Web-connected databases, but also from local databases, intranets, extranets, and other proprietary networks.
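As a rough illustration of the idea of sensing a source type and converting parameters into a query the source understands, the following minimal Python sketch may help. It is not taken from any particular agent product: the function names, the two source kinds ("web_form" and "sql"), the scheme-based heuristic, and the example locators are all assumptions made for illustration.

```python
from urllib.parse import urlencode, urlparse


def sense_source_type(locator):
    """Guess the kind of source from its locator (hypothetical heuristic)."""
    scheme = urlparse(locator).scheme.lower()
    if scheme in ("http", "https"):
        return "web_form"
    if scheme in ("sqlite", "postgresql", "mysql"):
        return "sql"
    raise ValueError(f"unrecognized source: {locator!r}")


def build_query(locator, table, params):
    """Convert the same search parameters into a source-specific query.

    Assumption: `params` maps field names to values, and the field names
    are trusted (they are placed directly into the SQL text).
    """
    if not params:
        raise ValueError("at least one search parameter is required")
    kind = sense_source_type(locator)
    if kind == "web_form":
        # Web source: encode the parameters as a URL query string.
        return f"{locator}?{urlencode(params)}"
    # Database source: build a parameterized SELECT; values stay separate
    # from the SQL text so the driver can bind them safely.
    where = " AND ".join(f"{column} = ?" for column in params)
    return f"SELECT * FROM {table} WHERE {where}", tuple(params.values())
```

Under these assumptions, the same parameters, say a name and a city, yield a URL with a query string for a Web source and a parameterized SQL statement with a tuple of values for a database source. A real agent would need far richer detection and dialect handling than this sketch shows.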

Agents are needed to help analysts and investigators manage and exploit the tremendous amount of data they encounter in the course of their work. As we have discussed, agents are sophisticated programs with human-like attributes, such as the ability to work independently, communicate, coordinate, learn, and accumulate knowledge to carry out their assigned tasks. When used in conjunction with other data mining technologies, such as those covered in subsequent chapters, agents can assist investigators in accessing, organizing, and using current and relevant data for security deterrence, forensic analysis, and criminal detection.

As we have seen, agents are designed to perform in a particular environment, such as a closed network or the Internet. They can also be categorized by functionality, such as information retrieval, information filtering, or monitoring and alerting, and by core architecture; for example, learning agents employ internal neural networks to acquire knowledge as they work, or machine-learning algorithms to generate their own rules for behavior and action. For the most part, two major categories of agents lend themselves to investigative data mining applications: Internet (open sources) agents and intranet (secured sources) agents.