KNOWLEDGE ACQUISITION FOR USER MODEL


Knowledge Acquisition (KA) has been an area of active and extensive research since the 1970s, but none of the existing KA systems seems directly applicable to capturing information about the user in the context of e-commerce. Several interactive KA tools allow an end-user with limited experience in knowledge acquisition, and with minimal training, to enter or update a knowledge base. An overview of several such tools can be found in Gil and Kim (2002). Some of these tools, namely PC PACK (Speel et al., 1999) [7], KSSn and WebGrid (Gaines & Shaw, 1998) [8], EXPECT (Blythe et al., 2001) [9], and Protégé-2000 (Grosso et al., 1999) [10], have a user-friendly interface and allow the creation of sophisticated knowledge bases, ontologies, or conceptual models. However, they all require at least some training, and they are relatively slow.

In the case of user modelling for e-commerce, we need to find another approach that would make the KA tool intuitive to use, even for a naive user with no previous knowledge. In order to gain this advantage, we can sacrifice granularity because the user model does not need to make very subtle distinctions in attribute values, as is done, for instance, by WebGrid and other KSSn tools. Knowledge capture for user profiling on the web can benefit from a combination of techniques that shifts some of the load from the user to the system, which extracts information from observation (activity logs, browsing behavior, shopping history, etc.).

There exist two major approaches to data collection for user profiling (Colkin, 2001):

  1. Explicit profiling occurs when users enter data themselves by filling in forms and answering questionnaires. This approach has two advantages: the user keeps control over the information he/she supplies to the system, which builds trust, and the load on the system is lower. On the other hand, the time and effort users spend on entering their data into the system should be minimized. An average customer wants convenience and speed at all stages of his/her interaction with the system, not just the promise of future enhancements. Therefore, another approach becomes necessary for fine-tuning and maintaining the user model.

  2. Implicit profiling and use of legacy data can fill the gap between the amount of information desirable for the system and the amount of user involvement. This method consists of tracking the user's behavior and using various machine learning techniques to draw conclusions based on that behavior. The major downside of this approach is the unreliability of the algorithmically obtained inferences (there will be a lot of noise that is difficult to interpret).

We use explicit profiling as the first step in KA for user modelling in e-commerce. Based on this information, an initial user profile is built during the customer's first visit to the store. If the customer returns, this initial profile is updated, based on the shopping history.

To make the creation of the user model easy and pleasant for a user, we propose to provide him/her with an intelligent assistant to help in the task. Such an agent will be proactive and relatively autonomous. In the proposed system, this role is played by a community of agents that gets the data from the user via the user interface, captures information from the observation of the user's actions, processes this data, and performs data verification, if needed, with minimal load on the user's time and effort.

The proposed architecture incorporates all four agent tasks (described on pages 170-172 as Assist-1 to Assist-4). Agent A1 takes care of the initialization of the profile (Assist-1) and dynamically updates it based on transaction data (Assist-2). Agent A2 takes care of interaction with the user when explanation or validation of results is necessary (Assist-2 and Assist-3). The user profile is stored locally, and a KB-U agent ensures the safe information exchange between the user and other agents and the system software by filtering incoming queries (Assist-4). The agents are organized into a KA subsystem that also contains other modules, as shown in Figure 2.

Figure 2: Architecture for Knowledge-Acquisition Sub-System

The KA subsystem's goal consists of transforming raw input data received from the dialogue with the user, through observation or event tracking, into a standardized format that can be used by the rest of the e-commerce system.

The KA subsystem tasks include:

  • receiving data from the user;

  • processing of data;

  • validating and fine-tuning of information about the user; and

  • storing the information about the user.

The KA subsystem gets the input from two sources: directly from the user, via User Interface (Type-1 interaction), and from the system (i.e., the Web Server Log Mining module), based on transaction history (Type-2 interaction). There are two kinds of data coming directly from the user:

  1. The results of filling forms with personal data (it can range, depending on the specifics of the e-store, from limited amount of personal data typically required for any registration and credit card transactions, to different sorts of additional information, such as color preferences, dietary restrictions, etc.);

  2. The knowledge elicited from the user through the dialogue between user and agents aimed at clarifying or validating the system's reasoning.

Type-2 interaction occurs inside the system when different modules and agents exchange information in order to dynamically update the profile.

In the first stage of the process of building the user model, Agent A1 gets raw user data from the user interface and transforms it into a format defined by node frame slots, using the agent's knowledge base, which includes such information as market definition, product domain ontology, consumer typology, etc. The agent compares the user information to the facts stored in the system and produces values for the user profile in a format required by the user model. This process results in setting initial values for frame slots (features). Thus, even with a limited amount of information available, the user model is not empty when the customer returns after the initial registration.
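A minimal sketch of this initialization step is given below. It assumes a hypothetical stereotype table keyed by age band; the `Slot` structure, the `STEREOTYPES` table, the slot names, and the certainty values are illustrative assumptions and are not taken from the system described here.

```python
from dataclasses import dataclass

@dataclass
class Slot:
    value: object
    certainty: float  # degree of belief in [0, 1]
    source: str       # "user" for stated facts, "stereotype" for defaults

# Hypothetical stereotype knowledge: age band -> default slot fillers.
STEREOTYPES = {
    "teen": {"presentation-style": ("youth", 0.7)},
    "adult": {"presentation-style": ("standard", 0.6)},
    "senior": {"presentation-style": ("large-print", 0.7)},
}

def age_band(age: int) -> str:
    """Map an age to an (assumed) stereotype band."""
    if age < 20:
        return "teen"
    if age < 65:
        return "adult"
    return "senior"

def initialize_profile(raw: dict) -> dict:
    """Turn raw registration data into slot fillers with certainty values."""
    profile = {}
    # Facts stated by the user are stored with full certainty.
    for key, value in raw.items():
        profile[key] = Slot(value, 1.0, "user")
    # Stereotype defaults fill only the slots the user left empty.
    if "age" in raw:
        defaults = STEREOTYPES[age_band(raw["age"])]
        for key, (value, cf) in defaults.items():
            profile.setdefault(key, Slot(value, cf, "stereotype"))
    return profile

profile = initialize_profile({"name": "Ms. X", "age": 17})
```

Recording the source of each filler alongside its certainty makes it straightforward to let individually observed values replace stereotype defaults later.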

For example, at the root of the tree, the system has stored information about name, age, and some other characteristics of the user. Immediate results of the availability of such information in the user profile would be customized presentation or advertisement (e.g., a teenager will not be exposed to content or visual presentations designed for seniors). Some values can be inferred for internal nodes (e.g., if the user Ms. X mentioned "weight control" as a main reason for shopping in the wellness store, the system can expect that the user will prefer fat-free varieties of all products).

Suppose that the system later collects information about the user by observation (browsing logs, purchase history) and updates previously set fillers of the frame slots. If the observed behavior contradicts a value stored in the profile, the monitoring agent will detect the contradiction and open a dialogue with the user to confirm whether the profile needs to be updated or whether the deviations from the profile are intentional and should be left alone.

Agent A1 also has a special set of rules that oversees the modifications made to the profile in order to detect problems. This includes detecting cases where the current user's actions contradict the information stored in the profile, as well as alerting the user if the modified value gets close to the threshold. When a problem has been detected, Agent A1 may call Agent A2, which initiates a clarification dialogue with the user. Alternatively, Agent A1 may start a trial period during which it monitors the trend to decide whether the observed deviation in user behavior was occasional and should be ignored or whether it was persistent and should be incorporated into the user model.

Agent A3 monitors the effects of changes made to the user profile on system behavior (for example, by analyzing system logs) and supplies the user with appropriate information and advice at a suitable moment.

The following four typical use cases are used to test a prototype that we have developed for KA purposes. These use cases are developed based on the overall operation of a typical system and its services.

Use Case 1: First-Time Registration and Profile Initialization

This use case involves Type-1 interaction and corresponds to Assist-1, described above. The typical scenario for this use case develops as follows:

At registration, the user keys in relevant information (personal data and appropriate additional facts related to the current domain). At this stage, the knowledge acquisition tool uses the information that is typically supplied by the user to a system for transaction authorization (thus, the user does not have to put extra effort into creating the profile and is not required to give the system additional personal data). Some of the stereotype-based values can be set even before any interaction with the user. For example, the system may know that there is a 60 percent chance that a shopper would give preference to organic products, and the user normally shops in specialized small shops rather than in supermarkets.

This user data goes to Agent A1. The agent uses a set of rules that compares the user's values to the information stored in the database of stereotypes, consults the domain hierarchical organization (ontology), and arrives at values to be assigned to the slots of frames stored at PIE nodes. A certainty factor is calculated for each value. Certainty factors of leaf values of the PIE DAG influence the degree of belief in the facts stored in internal nodes higher in the hierarchy. For example, if the user states his/her interest in a vitamin-enriched vegetable juice used as a dietary supplement, the certainty factor is increased for juices as well as for dietary supplements.
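The upward influence of leaf certainty factors can be illustrated with a short sketch. The text does not specify how certainty factors are combined, so the code below assumes a MYCIN-style rule for positive evidence and an arbitrary damping factor per level; the `PARENTS` fragment of the hierarchy, `DAMPING`, and the numeric values are hypothetical.

```python
# Hypothetical fragment of the hierarchy: child -> list of parent nodes.
# A node may have several parents, which is why the structure is a DAG.
PARENTS = {
    "vitamin-enriched vegetable juice": ["juices", "dietary supplements"],
    "juices": ["beverages"],
    "dietary supplements": [],
    "beverages": [],
}

DAMPING = 0.5  # assumed: evidence is halved at each level up the hierarchy

def combine(old: float, evidence: float) -> float:
    """MYCIN-style combination of two positive certainty factors."""
    return old + evidence * (1.0 - old)

def propagate(cf: dict, node: str, evidence: float) -> None:
    """Raise the certainty of `node`, then pass damped evidence to its parents."""
    cf[node] = combine(cf.get(node, 0.0), evidence)
    for parent in PARENTS.get(node, []):
        propagate(cf, parent, evidence * DAMPING)

certainty = {}
propagate(certainty, "vitamin-enriched vegetable juice", 0.8)
```

With these assumed numbers, interest in the juice raises belief in both of its parents, "juices" and "dietary supplements", and to a lesser degree in "beverages". A node reachable by more than one path would receive evidence once per path, which a fuller implementation might want to control.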

At this point, the user model available in the system is stereotype-based. It reflects typical characteristics of people of a certain age, sex, family situation, etc. (the values were assigned based on population statistics and marketing research data). Results of this stage allow the system to start the next session equipped with enough knowledge to be able to attempt the customization of the presentation, advertisement or search.

Use Case 2: Second-Time Visitor Profile Update

This case describes the process of dynamic adaptation of the user profile based on activity logs. It involves Type-2 interaction and corresponds to Assist-2, described on page 170. The typical scenario for this use case develops as follows:

When a user who is already registered returns, his/her actions (time spent viewing, suggestions based on initial profile, browsing sequences, shopping cart decisions, etc.) are recorded in a user activity log. These data are later analyzed by Agent A1 in order to extract relevant information about the user's preferences. Results of the analysis are compared to values originally stored in the user model. In case of minor changes, new, individual values have preference over those based on stereotypes. If the difference is very significant, Agent A1 cannot make a decision and calls Agent A2 to clarify the problem. For example, while filling in the questionnaire, the user informed the system that he is vegetarian, which resulted in giving the slot "food-preferences" the value "not meat" with the highest certainty, 1; but, while shopping, the user showed particular interest in deli products and even bought some ham. The system might then ask the user about his/her preference, in this particular case in the form of a simple yes/no question (see Use Case 3). The user's answer to the question overwrites previously set values. Now the user model built upon the values of slots in the frames is more individualized: wherever possible, stereotypes are replaced by individual user preferences, and the customer's profile is different from a typical profile for the same demographic segment of the population.
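The decision between a silent update and a clarification dialogue might be sketched as follows. The text does not define how the significance of a difference is measured, so this sketch assumes that contradicting a value held with high certainty counts as significant; `CONFLICT_THRESHOLD`, the `Action` names, and the example values are hypothetical.

```python
from enum import Enum

class Action(Enum):
    KEEP = "keep"            # observation agrees with the profile
    OVERWRITE = "overwrite"  # minor change: the individual value wins
    ASK_USER = "ask_user"    # significant conflict: hand over to Agent A2

CONFLICT_THRESHOLD = 0.8  # assumed cut-off for a "very significant" difference

def reconcile(stored_value, stored_cf: float, observed_value) -> Action:
    """Decide what to do when observed behavior is compared to a stored slot."""
    if observed_value == stored_value:
        return Action.KEEP
    # Contradicting a value the system strongly believes needs confirmation.
    if stored_cf >= CONFLICT_THRESHOLD:
        return Action.ASK_USER
    # Weakly held (e.g., stereotype-based) values are simply replaced.
    return Action.OVERWRITE

# The vegetarian example: the slot was set by the user with certainty 1.
vegetarian_case = reconcile("not meat", 1.0, "meat")
# A stereotype default contradicted by observed purchases.
stereotype_case = reconcile("organic", 0.6, "conventional")
```

Under these assumptions the vegetarian example yields a request for clarification, while a contradicted stereotype default is overwritten without involving the user.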

Use Case 3: Repeat Customer Profile Maintenance

This use case is a variation of the previous one which occurs when the user model has been set, based on stereotypes as well as on the user's activity logs, and the system has to prove or disprove its conclusions based on additional data coming from the user's activity logs. This use case corresponds to functions Assist-2 and Assist-3; it also makes use of both Type-1 and Type-2 interactions. One of the possible scenarios of this use case would be the following:

While a customer makes repeated purchases in the store, Agent A1 keeps analyzing his/her activity logs in order to change the values of certainty factors (e.g., if a user is buying some high-fiber products at every visit, then the certainty factor associated with high fiber will increase accordingly; on the contrary, if the user, at first, purchased herbal supplements but, some time later, started ignoring all such products suggested by the system and did not buy any more such supplements, the system will decrease the certainty factor for herbal supplements).
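One simple way to realize such gradual adjustments is an incremental update that moves the certainty factor a fixed fraction of the way toward 1 after a purchase and toward 0 after a visit without one. This rule and the `STEP` value are assumptions made for illustration; the text does not state the update formula actually used.

```python
STEP = 0.2  # assumed fraction of the remaining distance covered per visit

def update_certainty(cf: float, purchased: bool) -> float:
    """Move the certainty factor toward 1 on a purchase and toward 0 otherwise."""
    target = 1.0 if purchased else 0.0
    return cf + STEP * (target - cf)

# High-fiber products bought on three consecutive visits.
cf_fiber = 0.5
for _ in range(3):
    cf_fiber = update_certainty(cf_fiber, purchased=True)

# Suggested herbal supplements ignored on three consecutive visits.
cf_herbal = 0.5
for _ in range(3):
    cf_herbal = update_certainty(cf_herbal, purchased=False)
```

Because each step covers only part of the remaining distance, the value always stays between 0 and 1, and a single unusual visit changes it only slightly.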

Agent A1 also monitors the logs for typical patterns in the user's shopping behavior (e.g., typical contents of the shopping basket, brand loyalty, etc.). Significant changes in the user's behavior are registered and validated by Agent A2 (e.g., the system would ask the user about the change in his/her preferences if, after having avoided any product containing cocoa, the customer starts buying chocolate bars at every visit). Minor changes can be validated by the system itself. In this case, Agent A1 sets a trial period during which the validation agent monitors the particular trend in the user behavior in order to distinguish between occasional deviation from typical pattern and consistent change of behavior.
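The trial period can be pictured as a fixed-length window of visits over which one trend is watched. In the sketch below, the window length, the share of visits that counts as a consistent change, and the `TrialPeriod` class itself are hypothetical choices; the text does not give concrete values.

```python
from collections import deque

class TrialPeriod:
    """Watch one trend over a fixed number of visits before changing the model."""

    def __init__(self, length: int = 5, min_share: float = 0.6):
        self.length = length        # assumed number of visits to observe
        self.min_share = min_share  # assumed share of visits that counts as persistent
        self.observations = deque(maxlen=length)

    def record(self, deviated: bool) -> None:
        """Note whether the user deviated from the typical pattern on this visit."""
        self.observations.append(deviated)

    def verdict(self) -> str:
        """Return 'pending' until the window is full, then classify the trend."""
        if len(self.observations) < self.length:
            return "pending"
        share = sum(self.observations) / self.length
        return "persistent" if share >= self.min_share else "occasional"

trial = TrialPeriod(length=5, min_share=0.6)
for deviated in [True, False, True, True, True]:
    trial.record(deviated)
result = trial.verdict()
```

Only a "persistent" verdict would be incorporated into the user model; an "occasional" one would be ignored, which keeps the user from being questioned about every one-off purchase.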

Confirmed patterns are also reflected in the user model. After several visits to the store, the profile of the user becomes highly personalized and mostly reflects the user's individual preferences. Constant monitoring of the user's activity allows the system to perform dynamic updates of the user model in order to reflect changes in the user's preferences and to detect his/her typical shopping patterns.

Use Case 4: Information Exchange with the Outside World

This use case is served by Agent A4 and involves Type-2 interaction, but it can also be initiated by the user (Type-1 interaction). There are two main scenarios for this use case, depending on the direction of communication.

In the first scenario, the user or his/her agents request information that is not available in the system. In this case, Agent A4 initiates a search for other sources of information. These sources include different forms of documents as well as other systems and agents. For example, if the user is looking for information on the possible side-effects of a particular weight-loss product, Agent A4 can extract the necessary data from the product monograph, request this information from the vendor's or producer's agents, or find people who used this product and are willing to share their experience. In the last case, A4 can supply the user with available contact information.

In the second scenario, the query comes from outside, and Agent A4 plays the role of a gatekeeper by filtering the incoming requests and limiting the amount of information to be supplied in response to a query, based on constraints set by the user (e.g., a list of friendly agents) and on world knowledge (e.g., information can be given to reliable long-term partners but not to an unknown company).
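The filtering role in this second scenario might look like the sketch below. The sender names, the three trust levels, and the choice of which fields a partner may see are hypothetical; the text only states that the filter relies on user-set constraints and on world knowledge.

```python
from dataclasses import dataclass

@dataclass
class Query:
    sender: str
    fields: list  # names of the profile fields being requested

# Hypothetical user-set constraints and world knowledge.
FRIENDLY_AGENTS = {"pharmacy-agent"}       # listed by the user
TRUSTED_PARTNERS = {"long-term-supplier"}  # known reliable partners
PARTNER_FIELDS = {"dietary-preferences"}   # what trusted partners may see

def answer_query(query: Query, profile: dict) -> dict:
    """Return only the requested profile fields the sender is allowed to see."""
    if query.sender in FRIENDLY_AGENTS:
        allowed = set(profile)    # friendly agents: any requested field
    elif query.sender in TRUSTED_PARTNERS:
        allowed = PARTNER_FIELDS  # partners: a limited subset
    else:
        allowed = set()           # unknown senders: nothing
    return {f: profile[f] for f in query.fields if f in allowed and f in profile}

profile = {"name": "Ms. X", "dietary-preferences": "fat-free"}
reply = answer_query(Query("long-term-supplier", ["name", "dietary-preferences"]), profile)
```

Defaulting to an empty answer for unknown senders means a gap in the constraint lists results in too little information being shared rather than too much.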

[7] http://www.epistemics.co.uk/products/pcpack/

[8] http://tiger.cpsc.ucalgary.ca/WebGrid/WebGrid.html

[9] http://www.isi.edu/expect/

[10] http://protege.stanford.edu/index.html




(ed.) Intelligent Agents for Data Mining and Information Retrieval
Year: 2004
Pages: 171
