Symbolic Machine Learning for Student Modeling | Designing Distributed Learning Environments with Intelligent Software Agents


Machine learning has been widely used in intelligent software agents as an effective and feasible way to improve an agent’s intelligence in MAS. Most research focuses on personalized and adaptive mechanisms that let agents manage the information available on a Web site. In particular, much effort has gone into adaptive mechanisms for Web-based education systems (Guven et al., 1998a, 1998b, 2000; Mayo & Mitrovic, 2001). Guven and Blandford (1998a) developed MLTUTOR, which uses a symbolic multialgorithmic learning method to analyze users’ navigational patterns within Web-based educational hypertext systems and provides adaptation in the form of recommendations considered most relevant to the learner’s current area of learning. Mayo and Mitrovic (2001) proposed optimizing the behavior of intelligent tutoring systems (ITSs) using Bayesian networks and decision theory (Briscoe & Caelli, 1996).

Machine learning can also be used to implement personalization and adaptation in DLEs, particularly for student modeling. Student modeling is an important aspect of DLEs because it is directly related to adaptation and personalization in the learning system and to coordination and collaboration in MAS. A good student model can help the agents in a MAS coordinate and collaborate. In this section, we first give an overview of student modeling and of symbolic machine-learning techniques, and then discuss how to apply symbolic machine learning to student modeling.

Student Modeling

In general, student modeling entails developing models based on student behaviors and background knowledge to attain personalization and adaptation of learning environments. The goal of student modeling is to build a model that helps an intelligent learning environment adapt to specific aspects of student behavior.

Student modeling involves three essential elements: student behavior, background knowledge, and the student model. Student behavior is defined as the student’s observable response to a particular stimulus in a given domain. Student behaviors are the primary input to a student modeling system and can be classified into simple and complex behaviors. Background knowledge mainly comprises the theory of the domain (concepts, principles, procedures, strategies, and so on) and the bug library, which records misconceptions and other errors made by students in the same domain. Background knowledge also includes historical knowledge about a particular student and stereotypical knowledge in that domain. The student model is an approximate, possibly partial, primarily qualitative representation of student knowledge about a particular domain that fully accounts for specific aspects of student behavior (Sison & Shimura, 1998). Given both student behavior and background knowledge, the student modeling system can construct the student model by using modeling approaches. The following sections introduce symbolic machine learning and then discuss its application to student modeling.

Overview of Symbolic Machine Learning

Learning is an essential component of intelligent agent systems. It is also a fundamental component of MAS-based DLEs. Without learning abilities, DLEs cannot benefit from their past experiences or adapt to dynamically changing learning environments.

The goal of symbolic machine learning (SML) is either to induce new knowledge from existing or past data for future use, or to compile existing knowledge so that it becomes more accurate and performs better. From the viewpoint of the learning approach, SML can be classified into two main categories: supervised learning and unsupervised learning.

The two types of learning differ both in the data input they require and in the tasks they can address. Both supervised and unsupervised SML are useful for student modeling in DLEs.

Supervised SML

Supervised learning requires every instance (data input) to be labeled with a defined classification. A labeled instance is viewed as a pair (xi, ci), where xi is the instance given for learning and ci is its classification. The classifications ci are supplied by an external “teacher” from the application domain, hence the name supervised SML. The task of supervised SML is to learn a function f (which may take the form of a description or a set of rules) such that f(xi) = ci for all instances. Many machine-learning algorithms can be used to learn f. These algorithms (Briscoe & Caelli, 1996) include, but are not limited to, decision trees, instance-based reasoning, case-based reasoning, naïve Bayes, neural networks, rough sets, regression, version spaces (also called theory revision), inductive logic programming (ILP), rule-based production systems, and covering algorithms. A detailed discussion of these algorithms is beyond the scope of this chapter. Interested readers can refer to books such as “Symbolic Machine Learning” by Briscoe and Caelli (1996) and “Machine Learning” by Mitchell (1997) for more information.
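As one illustration of learning a function f from labeled pairs (xi, ci), the following is a minimal sketch of an ID3-style decision tree learner. The attribute names (hints_used, time_on_task, prior_quiz), the class labels, and the six training instances are hypothetical and chosen only for illustration; they are not taken from any of the systems discussed in this chapter.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def id3(instances, attributes):
    """Build a decision tree from labeled pairs (x_i, c_i).

    instances:  list of (dict of attribute -> value, class label)
    attributes: attribute names still available for splitting
    Returns a class label (leaf) or a tuple (attribute, branches, default label).
    """
    labels = [c for _, c in instances]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority

    def remainder(attr):
        # Weighted entropy of the partitions produced by splitting on attr.
        groups = {}
        for x, c in instances:
            groups.setdefault(x[attr], []).append(c)
        return sum(len(g) / len(instances) * entropy(g) for g in groups.values())

    best = min(attributes, key=remainder)  # lowest remainder = highest information gain
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in {x[best] for x, _ in instances}:
        subset = [(x, c) for x, c in instances if x[best] == value]
        branches[value] = id3(subset, rest)
    return (best, branches, majority)

def classify(tree, x):
    """Apply the learned function f to one instance x."""
    while isinstance(tree, tuple):
        attr, branches, default = tree
        tree = branches.get(x[attr], default)  # unseen value: fall back to majority
    return tree

# Hypothetical labeled instances (x_i, c_i) supplied by a "teacher".
training = [
    ({"hints_used": "many", "time_on_task": "long",  "prior_quiz": "fail"}, "needs_review"),
    ({"hints_used": "many", "time_on_task": "short", "prior_quiz": "pass"}, "needs_review"),
    ({"hints_used": "few",  "time_on_task": "long",  "prior_quiz": "pass"}, "mastered"),
    ({"hints_used": "few",  "time_on_task": "short", "prior_quiz": "pass"}, "mastered"),
    ({"hints_used": "few",  "time_on_task": "short", "prior_quiz": "fail"}, "needs_review"),
    ({"hints_used": "many", "time_on_task": "long",  "prior_quiz": "pass"}, "needs_review"),
]
tree = id3(training, ["hints_used", "time_on_task", "prior_quiz"])

# On this consistent toy data the tree reproduces every training label.
consistent = all(classify(tree, x) == c for x, c in training)
```

The sketch omits pruning, numeric attributes, and missing values, all of which a practical decision tree learner would need to handle.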

The learned function f or model must be evaluated on new, previously unseen instances. There are several approaches (Efron, 1987; Breiman et al., 1984) for evaluating the learned function f or model. These include holdout evaluation, cross-validation, bootstrapping, and LOBO (leave-one-batch-out) (Kubat, Holte, & Matwin, 1998). After evaluation, the best-performing function f or model is selected for domain applications such as student modeling in DLEs.
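To make the idea of evaluating on unseen instances concrete, here is a minimal sketch of k-fold cross-validation that compares two simple learners. The two learners (a majority-class baseline and a 1-nearest-neighbor learner), the attribute names, and the ten toy instances are assumptions made for this example only.

```python
import random
from collections import Counter

def majority_learner(train):
    """Baseline learner: always predict the most common class in the training set."""
    label = Counter(c for _, c in train).most_common(1)[0][0]
    return lambda x: label

def nearest_neighbor_learner(train):
    """Instance-based learner: copy the class of the most similar stored instance."""
    def distance(a, b):
        # Number of attributes on which the two instances disagree.
        return sum(a[key] != b[key] for key in a)
    return lambda x: min(train, key=lambda pair: distance(x, pair[0]))[1]

def cross_validation_accuracy(learner, data, k=5, seed=0):
    """k-fold cross-validation: every instance is used for testing exactly once."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]  # requires len(data) >= k
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        f = learner(train)  # learn only from the other k - 1 folds
        scores.append(sum(f(x) == c for x, c in test) / len(test))
    return sum(scores) / k

# Hypothetical, deliberately easy data: the class follows the attributes exactly.
data = (
    [({"hints_used": "few", "prior_quiz": "pass"}, "mastered")] * 7
    + [({"hints_used": "many", "prior_quiz": "fail"}, "needs_review")] * 3
)

baseline_score = cross_validation_accuracy(majority_learner, data)
neighbor_score = cross_validation_accuracy(nearest_neighbor_learner, data)
best_learner = max(
    [majority_learner, nearest_neighbor_learner],
    key=lambda learner: cross_validation_accuracy(learner, data),
)
```

Holdout evaluation corresponds to using a single train/test split instead of k of them, and leave-one-out is the special case where k equals the number of instances.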

Unsupervised SML

Unsupervised SML differs from supervised SML in that it does not require any classification information from external systems. Unsupervised SML is used to find the “commonality” or “regularity” in the given instances or examples. In this scenario, the system is not required to find a function or model from the given instances. To determine whether unsupervised learning has been successful, a test set of instances is checked to see whether it exhibits the same regularity that was discovered in the training set.

The main technique for unsupervised SML is conceptual clustering. Conceptual clustering groups unlabeled instances into categories such that a conceptual description can be formed for each category and the instances within a category share the same regularity.
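The sketch below illustrates two ingredients of conceptual clustering under stated assumptions: scoring a candidate grouping and forming a conceptual description for each group. It uses category utility, the measure employed by the COBWEB algorithm, which this chapter does not name; the choice of measure, the attribute names, and the four unlabeled error records are all assumptions for illustration. A full conceptual clustering algorithm would also search over groupings rather than compare two hand-made ones.

```python
from collections import Counter

def value_probabilities(instances):
    """P(attribute = value) for every attribute-value pair seen in the instances."""
    counts = Counter((a, v) for x in instances for a, v in x.items())
    return {pair: n / len(instances) for pair, n in counts.items()}

def category_utility(clusters):
    """Category utility of a partition, given as a list of lists of instances."""
    everything = [x for cluster in clusters for x in cluster]
    baseline = sum(p * p for p in value_probabilities(everything).values())
    total = 0.0
    for cluster in clusters:
        within = sum(p * p for p in value_probabilities(cluster).values())
        # Gain in predictability of attribute values from knowing the cluster.
        total += len(cluster) / len(everything) * (within - baseline)
    return total / len(clusters)

def describe(cluster):
    """Conceptual description: the attribute values shared by every member."""
    return {a: v for (a, v), p in value_probabilities(cluster).items() if p == 1.0}

# Hypothetical unlabeled records of student errors.
errors = [
    {"operation": "subtraction", "step": "borrow", "digit": "zero"},
    {"operation": "subtraction", "step": "borrow", "digit": "nonzero"},
    {"operation": "addition", "step": "carry", "digit": "zero"},
    {"operation": "addition", "step": "carry", "digit": "nonzero"},
]

# Two candidate groupings of the same instances.
by_operation = [errors[:2], errors[2:]]
by_digit = [[errors[0], errors[2]], [errors[1], errors[3]]]

best = max([by_operation, by_digit], key=category_utility)
descriptions = [describe(cluster) for cluster in best]
```

On this toy data the grouping by operation scores higher, because knowing the group predicts two attributes (operation and step) instead of one.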

Applying Symbolic Machine Learning to Student Modeling

The discussion above shows that student modeling can benefit from symbolic machine learning. By applying SML to student modeling in DLEs, we can induce student models and extend the background knowledge in the student’s learning domain. Both supervised and unsupervised SML are useful for this purpose. Generally speaking, applying SML to student modeling consists of the following steps:

1. Collect data on student behavior or background knowledge from systems.
2. Represent the data with specific attributes.
3. Use SML algorithms to generate the model or to extend the background knowledge.
4. Evaluate the generated models and select an ideal one.
5. Apply the selected model to DLEs for intelligent support.
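The five steps above can be sketched end to end as follows. Everything specific in this example is hypothetical: the log fields, the thresholds used to discretize them, the single-attribute rule learner, the fixed train/test split, and the recommendation strings. It is meant only to show how the steps connect, not how any particular system implements them.

```python
from collections import Counter, defaultdict

# Step 1: collect raw behavior records (hypothetical log entries).
raw_log = [
    {"student": "s1", "hints": 5, "minutes": 30, "outcome": "needs_review"},
    {"student": "s2", "hints": 1, "minutes": 25, "outcome": "mastered"},
    {"student": "s3", "hints": 4, "minutes": 10, "outcome": "needs_review"},
    {"student": "s4", "hints": 0, "minutes": 8,  "outcome": "mastered"},
    {"student": "s5", "hints": 6, "minutes": 22, "outcome": "needs_review"},
    {"student": "s6", "hints": 2, "minutes": 12, "outcome": "mastered"},
    {"student": "s7", "hints": 3, "minutes": 9,  "outcome": "needs_review"},
    {"student": "s8", "hints": 1, "minutes": 28, "outcome": "mastered"},
]

# Step 2: represent each record with discrete attributes (thresholds are assumed).
def represent(record):
    return {
        "hints_used": "many" if record["hints"] >= 3 else "few",
        "time_on_task": "long" if record["minutes"] >= 20 else "short",
    }

# Step 3: learn one candidate model per attribute (a single-attribute rule set).
def learn_rule(train, attribute):
    votes = defaultdict(Counter)
    for x, c in train:
        votes[x[attribute]][c] += 1
    default = Counter(c for _, c in train).most_common(1)[0][0]
    rules = {value: counts.most_common(1)[0][0] for value, counts in votes.items()}
    return lambda x: rules.get(x[attribute], default)

# Step 4: evaluate each candidate on held-out instances and select the best.
def accuracy(model, test):
    return sum(model(x) == c for x, c in test) / len(test)

data = [(represent(r), r["outcome"]) for r in raw_log]
train, test = data[:6], data[6:]  # fixed split, for reproducibility only
candidates = {a: learn_rule(train, a) for a in ("hints_used", "time_on_task")}
scores = {a: accuracy(model, test) for a, model in candidates.items()}
best_attribute = max(scores, key=scores.get)
student_model = candidates[best_attribute]

# Step 5: apply the selected model to a new observation in the learning environment.
def recommend(record):
    state = student_model(represent(record))
    if state == "needs_review":
        return "offer remedial material"
    return "advance to next topic"
```

For example, `recommend({"hints": 4, "minutes": 5})` maps the new observation to the same attributes used in training before the selected model is applied. With only two held-out instances the scores are far too coarse for a real decision; in practice the evaluation step would use one of the methods named earlier, such as cross-validation.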

To date, the AI and e-education research fields have produced a large number of results on student modeling. Many student modeling systems have been developed, and their results have been published in various journals and conference proceedings. To help the reader locate these results, we summarize some student modeling systems in Table 1. We do not describe each system in detail; interested readers can find the details in the relevant references.

Table 1: A comparison of some published student modeling systems
Student Modeling System	Developers	Year Published	Student Modeling Task	SML Algorithm
PIXIE	Sleeman et al.	1990	Extend domain background knowledge	Rule-based production system
ASSERT	Baffes & Mooney	1996	Induce student model	Theory revision
MEDD	Sison, Numao, & Shimura	1998	Extend domain background knowledge	Conceptual clustering
THEMIS	Kono et al.	1994	Induce student model	Decision tree (ID3)
ACM	Langley & Ohlsson	1984	Induce student model	Decision tree (ID3)
DEBUGGY	Burton	1982	Induce student model	Covering algorithm
MLTUTOR	Guven & Blandford	1998	Induce student model	Decision tree (ID3)

The summary in Table 1 shows that a wide range of SML algorithms, both supervised and unsupervised, have been applied to develop student models or to discover background knowledge in student modeling. Because SML is an experimental technique, no algorithm or approach can be declared best for student modeling in advance. The choice depends on the requirements of the domain application and on the collected data. Evaluation of the learned model or knowledge is therefore indispensable in every application domain.
