(ed.) Intelligent Agents for Data Mining and Information Retrieval
Authors: Mohammadian M.
Published year: 2004
Comparing these three distributed paradigms (server-client, code on demand, and mobile agent), it can be seen that the mobile agent paradigm exhibits the greatest flexibility. Furthermore, mobile agents possess the following advantages:
A mobile agent moves computation code to the data, so fewer intermediate results need to be passed over the network, which reduces bandwidth consumption.
Agents do not require a continuous connection between machines (Huhns & Singh, 1997). The client can dispatch an agent into the network while the connection is healthy and then go offline; the connection needs to be reestablished only later, when the agent returns results from the remote host. Hence, mobile agents perform more reliably when the network connection is intermittent or unreliable (Pals et al., 2000).
Agents operate asynchronously and autonomously; the user does not need to monitor an agent as it roams the Internet. This saves the user time, reduces communication costs, and decentralizes the network structure.
With adaptive learning and automation added, agents can be equipped with AI for information retrieval and filtering.
The main problem with mobile agents is security (see Ghanea & Gifford, 2001), which remains a research area in its own right. In an agent system with a low level of security, a mobile agent may harm its host, or the host may harm the agent.
Given the above properties, the language used to construct a mobile agent system should be object-oriented and platform independent, with communication capability and code security (James, 1996). Languages currently used for mobile agents include Java, Telescript, and Tcl. IBM's Aglet workbench, built on Java, is a popular tool, and we selected it as the platform for this project.
Several existing e-learning systems (some university-based, others not) have been studied and compared. Most were built on the server-client paradigm, and few provide search or user tracking functions. A few systems provide only course catalogs for the user to search. Currently, none of the systems uses mobile agent technology.
It is recommended to add a mobile agent-based search tool between the user and the e-learning system. This e-learning tool will help the user search for his preferred courses with some forms of AI and, at the same time, track the user's progress status continuously and report this information to the e-learning server. This information helps the online instructors know the progress of each learner and his status in the course, and then guide the learners differently. At the same time, the e-learning system administrator can also collect user status information for statistical purposes.
Through the e-learning tool, the learner and instructor are given an environment to interact with each other more easily and conveniently. A learner can decide whether to allow other learners to know his progress status information. When the learners' progress status is available among peers, competition and cooperation among learners are promoted. The more interactive the e-learning system, the more effective e-learning will be (Cupp & Danchak, 2001). In the following section, the method for implementing the e-learning tool will be discussed.
We propose a new system architecture, based on mobile agents, to improve the performance of current systems. In our proposed system, mobile agent-based software is pre-installed at the university centers, which are connected through the Agent Transfer Protocol (ATP) over the Internet to form a large and powerful e-learning system. The client is a networked PC connected to the Internet, enabling it to connect to the Agent Server Center (ASC) to query the universities on behalf of the user. The ASC then creates a mobile agent, which roams to the university servers to hunt for the required data. Each university center may offer courses to users, and users may access the e-learning system through any web browser via the Internet.
All the university centers can be considered as different nodes in the network, and the courses are information distributed across these nodes. We can construct a Faded Information Field (FIF) to implement the information push and pull technology discussed earlier, improving users' access time to information with higher reliability.
The FIF system consists of logically connected nodes. Information providers and users correspond to these nodes, and the amount of information allocated to each node decreases with distance from the information center. That is, nodes adjacent to the information center hold more information; farther nodes hold less.
The users' demand for the various course materials changes dynamically over time. FIF provides an autonomous navigation mechanism, driven by the university centers through mobile agents, to satisfy users' heterogeneous requirements. The university centers generate push mobile agents, and these agents carry out information fading by negotiating with neighboring nodes in the network. When the push agents perform the information fading, they need to take into account the popularity, size, and lifetime of the information, and then assign a priority level to each piece of information. Higher-priority information is stored on more nodes; lower-priority information on fewer.
All the nodes around a university center are ranked according to distance. Nodes with lower traffic costs are ranked nearer the university center, and those with higher traffic costs are ranked farther away. The information push agents perform their task during network off-peak times to avoid congestion.
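The fading step described above can be sketched in a few lines of Java. This is an illustrative assumption, not the chapter's actual algorithm: the class and method names, and the rule that priority grows with popularity and lifetime and shrinks with size, are stand-ins chosen for demonstration.

```java
import java.util.*;
import java.util.stream.Collectors;

public class FadedInformationField {

    /** A piece of course information held at a university center. */
    public static class Item {
        final String id;
        final double popularity;    // e.g., accesses per day (assumed metric)
        final double sizeKb;        // storage cost of the item
        final double lifetimeDays;  // how long the item stays valid
        public Item(String id, double popularity, double sizeKb, double lifetimeDays) {
            this.id = id; this.popularity = popularity;
            this.sizeKb = sizeKb; this.lifetimeDays = lifetimeDays;
        }
    }

    /** Assumed priority rule: popular, small, long-lived items score higher. */
    public static double priority(Item it) {
        return it.popularity * it.lifetimeDays / it.sizeKb;
    }

    /**
     * Fade an item onto the cheapest nodes first: nodes are ranked by
     * traffic cost to the center, and the replica count grows with priority,
     * so high-priority information ends up stored on more nodes.
     */
    public static List<String> fade(Item it, Map<String, Double> trafficCost,
                                    double replicasPerPriorityUnit) {
        int replicas = Math.max(1, (int) (priority(it) * replicasPerPriorityUnit));
        return trafficCost.entrySet().stream()
                .sorted(Map.Entry.comparingByValue())     // cheapest nodes first
                .limit(Math.min(replicas, trafficCost.size()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

A popular, long-lived course item thus spreads to several low-cost nodes, while an unpopular or bulky one stays near the center, matching the fading behavior described above.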
Through the mobile agent-based architecture and Faded Information Field, the network structure is decentralized, and it can be easily extended to a larger scale by adding more university centers. The course materials are usually stored on more than one server so that the users can get the course materials from the nearest node, which saves information access time. When some nodes in the networks are down, users can still get course materials from other nodes; therefore, the reliability is increased. The system is robust and fault-tolerant.
Keyword searching/text mining, in which the server collects an index of all stored information, is commonly used by current search engines. When a user wants to retrieve information from the server, he is required to enter a keyword to query the server database. The server then searches all indices that match the keyword entered by the user and retrieves the information accordingly (Martin & Eklund, 2000) [e.g., Electronic Campus (Southern Regional Education Board, 2002)].
In this information retrieval mechanism, two problems may occur. Firstly, the mechanism assumes that the user has a good understanding of the server-stored information and can express what he wants to retrieve in correct and accurate keywords. If the query keyword is poorly structured, or contains a typing error, the search will not work as the user expected; it may even return nothing. To overcome this problem, a thesaurus function is used to expand the search keys.
Secondly, if the query returns a large amount of information to the client, most likely not all of it is what the user wants. The user must browse through the retrieved information and discard what is not relevant, wasting network bandwidth and user time. To assist the user, the system provides a weighted-keywords function to gauge the importance of each piece of retrieved information.
Adding artificial intelligence to the keyword search will improve search quality. One approach is to parse the user-entered keywords (Katz & Yuret, 1999). This process generates several synonyms equivalent to the original keywords. When a query with the original keywords is not satisfactory, a query based on their synonyms is performed to return more information. Figure 3 illustrates the process of user query expansion.
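The query expansion of Figure 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the synonym table is supplied by hand here (a real thesaurus module would look synonyms up in a lexical database), and the class and method names are hypothetical.

```java
import java.util.*;

public class ThesaurusModule {
    // Hand-filled synonym table; a real module would consult a thesaurus.
    private final Map<String, List<String>> synonyms = new HashMap<>();

    public void addSynonyms(String word, String... alternatives) {
        synonyms.put(word.toLowerCase(), Arrays.asList(alternatives));
    }

    /**
     * Expand the user's keywords. The original terms come first; synonyms
     * are appended only when the first query returned too few hits,
     * mirroring the "not satisfactory, then expand" rule in the text.
     */
    public List<String> expand(List<String> keywords, int hits, int minHits) {
        List<String> query = new ArrayList<>(keywords);
        if (hits < minHits) {   // original query was not satisfactory
            for (String kw : keywords) {
                for (String alt : synonyms.getOrDefault(kw.toLowerCase(),
                                                        Collections.emptyList())) {
                    if (!query.contains(alt)) query.add(alt);
                }
            }
        }
        return query;
    }
}
```

Because the original keywords keep the leading positions, results for them can still be ranked ahead of synonym-only matches.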
Figure 3: Thesaurus Module
One AI approach for e-learning course searches is to build web agents (Karjoth & Lange, 1997). The web agents search for information on behalf of the user, according to his preferences, which are stored in a user profile database (Baek & Park, 1999). The agent has a learning function and can learn the user's likes and dislikes as the user searches the web with keywords.
Initially, the user enters the keywords and searches the Web. The monitor agent saves the keywords entered by the user, and the search agent then starts to roam the Web, searching for information (Cabri et al., 2000). As the search agent finds information at remote sites that matches the user's requirements, it carries the information back to the user.
At the user site, the extraction agents extract keywords from the retrieved web documents. The keyword extraction uses two methods: one based on the frequency of a word or phrase within a web document, the other based on its frequency across different web documents. The occurrence frequency is computed and weighted; if it exceeds a threshold value, the word is treated as a keyword and saved into the user profile database together with the user-entered keywords. When the user reads through the retrieved web documents, some may not be what he wants, so he simply discards them. The monitor agent observes this and adds a negative preference weighting to the extracted keywords; for correctly retrieved documents, a positive weighting is added.
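The first extraction method, frequency within a single document, can be sketched as below. The tokenization rule and the threshold interpretation (relative frequency above a cutoff) are assumptions for illustration; the chapter does not fix either.

```java
import java.util.*;

public class KeywordExtractor {

    /** Count how often each word occurs in the document text. */
    public static Map<String, Integer> termFrequency(String document) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : document.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) freq.merge(token, 1, Integer::sum);
        }
        return freq;
    }

    /**
     * Words whose relative frequency in the document exceeds the
     * threshold are treated as keywords, per the rule in the text.
     */
    public static Set<String> extract(String document, double threshold) {
        Map<String, Integer> freq = termFrequency(document);
        int total = freq.values().stream().mapToInt(Integer::intValue).sum();
        Set<String> keywords = new TreeSet<>();
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            if ((double) e.getValue() / total > threshold) keywords.add(e.getKey());
        }
        return keywords;
    }
}
```

The second method, frequency across documents, would apply the same counting over a document collection, discounting words that appear in nearly every document.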
The next time the user performs a keyword search with keywords similar to a previous session, the user preference database first finds all stored keywords with their positive and negative weightings, then lists those words and lets the user pick the ones he wants. Search priority follows the weighting: positive weighting indicates higher priority, negative weighting lower. Once a keyword's negative weighting exceeds a certain threshold, web documents containing that keyword will no longer be retrieved in future sessions.
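This feedback loop can be sketched as a small profile store. The +1/-1 increments, the class names, and the exclusion threshold are illustrative assumptions; the chapter specifies only that positive and negative weightings accumulate and that a strongly negative keyword is dropped.

```java
import java.util.*;
import java.util.stream.Collectors;

public class UserProfile {
    private final Map<String, Double> weight = new HashMap<>();
    private final double negativeThreshold;

    public UserProfile(double negativeThreshold) {
        this.negativeThreshold = negativeThreshold;
    }

    /**
     * Monitor agent feedback: keywords of a kept document gain weight,
     * keywords of a discarded document lose weight.
     */
    public void feedback(String keyword, boolean documentKept) {
        weight.merge(keyword.toLowerCase(), documentKept ? 1.0 : -1.0, Double::sum);
    }

    /**
     * Keywords ordered by weighting (highest priority first), dropping
     * any whose negative weighting has passed the threshold.
     */
    public List<String> suggest() {
        return weight.entrySet().stream()
                .filter(e -> e.getValue() > -negativeThreshold)
                .sorted(Map.Entry.<String, Double>comparingByValue(
                        Comparator.reverseOrder()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

The monitor agent would call `feedback` as the user keeps or discards documents, and the search agent would consult `suggest` at the start of the next session.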
The monitor agent, search agent, and user preference database operate in a closed feedback loop with learning ability. The user preference database grows as the user performs more keyword searches of the Web, and the search process becomes more intelligent. Figure 4 shows the architecture of the search tool (Quah & Chen, 2002).
Figure 4: AI Search Engine Architecture
Among all the existing e-learning systems studied in this project, only "Corpedia" (Corpedia, 1998) and "Ninth House Network" (Ninth House, 1996) provide user progress tracking. Comprehensive and flexible user progress tracking provides useful information for both the e-learning instructor and the administration to improve the e-learning quality and make it more effective.
In our e-learning system, two kinds of agents are used for user tracking. One is a reporting agent, a mobile agent sitting at the user's machine; the other is the monitor agent, which resides at the e-learning center. After each e-learning session, the reporting agent reports the user's current status to the monitor agent on a remote server via message passing. The monitor agent captures the information into the database, which the instructor and administrator can check to analyze teaching effectiveness and collect statistics.
By employing mobile agents for user tracking, the user is freed from the burden of manually reporting his learning progress to the server. All of this is done automatically by the mobile agent, and the whole mechanism is transparent to the user.
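The chapter does not specify the schema of the status message that the reporting agent passes to the monitor agent, so the sketch below is purely hypothetical: the field names and the flat record format are assumptions chosen to show what such a message might carry.

```java
import java.io.Serializable;

/** Hypothetical payload a reporting agent could pass to the monitor agent. */
public class ProgressReport implements Serializable {
    public final String userId;
    public final String courseId;
    public final int modulesCompleted;  // assumed progress measure
    public final long sessionSeconds;   // length of the e-learning session

    public ProgressReport(String userId, String courseId,
                          int modulesCompleted, long sessionSeconds) {
        this.userId = userId;
        this.courseId = courseId;
        this.modulesCompleted = modulesCompleted;
        this.sessionSeconds = sessionSeconds;
    }

    /** Flat record the monitor agent could insert into the tracking database. */
    public String toRecord() {
        return userId + "|" + courseId + "|" + modulesCompleted + "|" + sessionSeconds;
    }
}
```

Making the payload `Serializable` matches the Java message-passing setting; the monitor agent would deserialize the report and store the record for instructors and administrators.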
In the proposed system, IBM Aglet workbench is used as the agent platform. Figure 5 shows the technical architecture.
Figure 5: Overall Architecture
The whole system consists of three layers: the front-end user machine, the back-end server, and the e-learning servers on the Web. The front end can be any PC connected to a back-end server. The back-end server has a SQL Server database, which stores user account information; it is used for user verification at login, and each new user must register an account with the back-end server. These data are handled through CGI scripts.
The addresses of all the e-learning servers to be visited by the searching mobile agent are also captured in the database, forming a search directory for the searching Aglet. Each e-learning course center on the Web must pre-install the Aglet Tahiti server and must have a database server to store its course materials. These e-learning centers register their addresses with the back-end server, thus providing a context in which the searching Aglet can operate.
Each time the user performs a search, a Java Servlet runs at the back-end server, which, in turn, generates an Aglet carrying the search criteria and sends it into the Web. The mobile agent roams the Web, searching for the required information on the user's behalf. When information is found, the mobile agent sends it back and continues on to the next Aglet host in the predefined trajectory. The retrieved information is filtered by the thesaurus module and then presented to the user.
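The agent's trajectory can be sketched as a simple itinerary loop. This is a local simulation under stated assumptions: the `Host` interface stands in for a remote e-learning server reachable over ATP, and a real Aglet would dispatch itself to each host rather than invoke a method locally.

```java
import java.util.*;

public class SearchItinerary {

    /** Stand-in for a remote e-learning server (assumed interface). */
    public interface Host {
        String address();                      // e.g., an atp:// address
        List<String> search(String criteria);  // matching course titles
    }

    /**
     * Roam the predefined trajectory: query each registered host in turn
     * and collect hits per host, so results can be forwarded to the user
     * as they arrive rather than only at the end of the trip.
     */
    public static Map<String, List<String>> roam(List<Host> trajectory,
                                                 String criteria) {
        Map<String, List<String>> results = new LinkedHashMap<>();
        for (Host host : trajectory) {         // "move" to the next Aglet host
            List<String> hits = host.search(criteria);
            if (!hits.isEmpty()) results.put(host.address(), hits);
        }
        return results;
    }
}
```

In the deployed system, the servlet would build the trajectory from the back-end server's search directory of registered e-learning centers.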
The filtering process is the reverse of the query expansion process. Text mining techniques are used to narrow down the search criteria; they also take into consideration the context in which the keyword is used.