Personalized product-brokering agents require a profile of the user in order to function effectively. Such an agent must also be responsive to changes in the user's interests and be able to search for and extract relevant information from outside sources.
At the MIT Media Lab, Maes and Sheth (Maes, 1994; Sheth & Maes, 1993) developed a system that filters and retrieves a personalized set of USENET articles for a particular user. This is done by creating and evolving a population of information-filtering agents using genetic algorithms.
The user provides a set of keywords representing his or her interests, each with an assigned weight, and the agents use these to search for and retrieve articles from the relevant newsgroups. After reading the articles, the user can give positive or negative feedback to the agents through a simple graphical user interface (GUI). Positive feedback increases the fitness of the responsible agent and the weights of the relevant keywords; negative feedback does the opposite. In the background, the system periodically breeds new generations of agents from the fitter ones while eliminating the weaker ones. Initial results from their experiments showed that the agents are capable of tracking their user's interests and recommending mostly relevant articles.
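The feedback-and-evolution loop described above can be sketched as follows. This is an illustrative approximation, not the actual Maes/Sheth implementation: the class names, the fixed feedback step, and the mutation scheme (dropping a keyword from a surviving parent) are all assumptions made for the example.

```python
import random

class FilterAgent:
    """A minimal information-filtering agent with weighted keywords."""

    def __init__(self, keywords):
        # keyword -> weight; weights are adjusted by user feedback
        self.weights = {kw: 1.0 for kw in keywords}
        self.fitness = 1.0

    def score(self, article_words):
        # relevance of an article = sum of weights of matched keywords
        return sum(w for kw, w in self.weights.items() if kw in article_words)

    def feedback(self, article_words, positive, step=0.1):
        # positive feedback raises fitness and matched keyword weights;
        # negative feedback lowers them (floored at zero)
        delta = step if positive else -step
        self.fitness = max(0.0, self.fitness + delta)
        for kw in self.weights:
            if kw in article_words:
                self.weights[kw] = max(0.0, self.weights[kw] + delta)

def next_generation(agents, keep_ratio=0.5):
    # keep the fitter agents, refill the population by mutating survivors
    ranked = sorted(agents, key=lambda a: a.fitness, reverse=True)
    survivors = ranked[:max(1, int(len(ranked) * keep_ratio))]
    children = []
    while len(survivors) + len(children) < len(agents):
        parent = random.choice(survivors)
        kws = list(parent.weights)
        random.shuffle(kws)
        children.append(FilterAgent(kws[:max(1, len(kws) - 1)]))
    return survivors + children
```

Periodically calling `next_generation` in the background, while routing each user rating through `feedback`, reproduces the two learning channels of the original design: short-term weight adjustment and longer-term selection over agents.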
While the researchers at MIT require the user to enter preferences into the system before a profile can be created, Crabtree and Soltysiak (Crabtree & Soltysiak, 1998; Soltysiak & Crabtree, 1998b) believed that a user's profile can be generated automatically by monitoring the user's Web and e-mail habits, thereby reducing the need for user-supplied keywords.
Their approach is to extract high information-bearing words that occur frequently in the documents opened by the user. This is achieved using ProSum, a text summarizer that generates a set of keywords describing each document and determines the information value of each keyword. A clustering algorithm is then employed to help identify the user's interests, and some heuristics are used to ensure that the program performs as much of the classification of interest clusters as possible.
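ProSum itself is not publicly available, so the pipeline can only be approximated. The sketch below stands in for its two stages under stated assumptions: a TF-IDF-style weight as a proxy for ProSum's "information value" of a keyword, and a simple greedy overlap-based grouping as a proxy for the clustering step; the `top_n` and `min_overlap` thresholds are arbitrary.

```python
import math
from collections import Counter

def keyword_profile(doc, corpus, top_n=5):
    """Score a document's words by a TF-IDF-like information value
    and keep the top_n as that document's keyword profile."""
    tf = Counter(doc)
    n_docs = len(corpus)

    def idf(word):
        df = sum(1 for d in corpus if word in d)
        return math.log((1 + n_docs) / (1 + df)) + 1

    scored = {w: tf[w] * idf(w) for w in tf}
    return dict(sorted(scored.items(), key=lambda kv: -kv[1])[:top_n])

def cluster_interests(profiles, min_overlap=1):
    """Greedy clustering: merge a profile into the first cluster that
    shares at least min_overlap keywords, else start a new cluster."""
    clusters = []
    for prof in profiles:
        kws = set(prof)
        for cluster in clusters:
            if len(kws & cluster) >= min_overlap:
                cluster |= kws
                break
        else:
            clusters.append(kws)
    return clusters
```

Each resulting cluster is a candidate interest: documents about the same topic tend to share high-value keywords and therefore fall into the same group without any user-supplied labels.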
However, they were not completely successful in their own experiments. The researchers admitted that it would be difficult for the system to classify all of a user's interests without the user's help. Nevertheless, they believed that their program took a step in the right direction by learning a user's interests with minimal human intervention.
A new product-brokering agent usually does not have sufficient information to recommend any products to the user, so it must obtain product information from elsewhere, and a good source is the Internet. One method, suggested by Pant and Menczer (2002), is to deploy a population of Web crawlers called InfoSpiders that search the WWW on behalf of the user. Information is gathered from the Internet based on the user's query and is then indexed accordingly.
These agents initially rely on traditional search engines to obtain a starting set of URLs relevant to the user's query. The agents then visit these Web sites and analyze their contents before deciding where to go next. This analysis involves parsing each Web page and examining a small set of words around each hyperlink; each link is scored according to its relevance to the user's query, and the agent follows the link with the highest score.
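The link-selection step can be sketched as below. This is a simplified stand-in for the published InfoSpiders mechanism, not a reimplementation of it: the regex-based HTML parsing, the fixed context window, and the plain term-count score are all simplifying assumptions.

```python
import re

# matches <a href="...">anchor text</a>
LINK_RE = re.compile(r'<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>', re.I | re.S)

def best_link(html, query_terms, window=10):
    """Score every hyperlink by how many query terms appear in a small
    window of words around its anchor text, and return the best URL."""
    scores = {}
    # strip tags to get the page as a flat word sequence
    plain = re.sub(r"<[^>]+>", " ", html).lower().split()
    for match in LINK_RE.finditer(html):
        url, anchor = match.groups()
        anchor_words = re.sub(r"<[^>]+>", " ", anchor).lower().split()
        context = set(anchor_words)
        if anchor_words:
            # locate the anchor in the page and add `window` words around it
            for i, word in enumerate(plain):
                if word == anchor_words[0]:
                    context |= set(plain[max(0, i - window): i + window])
                    break
        # score = number of query terms found in the link's context
        scores[url] = sum(1 for t in query_terms if t.lower() in context)
    return max(scores, key=scores.get) if scores else None
```

An agent would call `best_link` on each fetched page and follow the returned URL, repeating the fetch-score-follow cycle until its relevance estimates stop improving.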
Profile-based text summarization.