Constraint-Based Student Modeling for an Intelligent Tutoring System on Data Structures
Intelligent Systems Lab, School of Applied Science
Nanyang Technogical University, Singapore 639798


The goal of an Intelligent Tutoring Systems (ITS) is for itself to be aware of the knowledge it contains, as well as what the student knows. Such systems should be "... capable of automatically inducing and using a structural model of the student's reasoning strategies" [1]. An ITS should also be "self-adaptive" - able to adapt tutoring sessions based on self-contained knowledge. Several systems are evaluated below. These represent the current range of approaches for the ITS playing field in Student Modeling. Most systems to date either employ one of the following techniques, a variant of one, or a hybrid of two or more.

A sophisticated Intelligent Multimedia Tutoring System (IMTS) for an university course on Data Structures in C gives students a possibility to get answers to their questions outside of labs [2]. The students could also prepare better for their labs using the IMTS or review part of the lecture material during lab sessions when they need help. The IMTS is completely developed in Java and resides on a parent canvas, either of proprietary origin (Web Browser or Standalone Application) or on one designed specifically for the IMTS. In the first phase it supports the students on algorithms on linked lists, queues, stacks and binary search trees. The user interface is completely interactive, meaning that the student will interactively manipulate the data structures on the screen. In tutorial mode, the student will be given exercises, which s/he has to solve interactively by manipulating the nodes of the data structures. Feedback and tutoring will be given to the student upon errors. In this paper we describe the usage and implementation of constrained-based student modeling as introduced by Ohlsson [3].

Constraint-Based Student Modeling

No perfect solution has been found and perhaps none should be expected. The extent of false knowledge is so overwhelming that the problem of inferring what the student "knows that is not so" is computationally intractable in its general form and perhaps even conceptually incoherent. Instead of a single general solution, there has been a slowly growing library of imperfect but useful techniques. Each technique has a unique combination of advantages and disadvantages. What is required is to match a suitable technique for each pedagogical purpose. The approaches to student modeling can be divided into three broad but distinct groups of thought:

  1. Bug Library technique: Map the student's solution onto a predefined library of error types [4].
  2. Machine Learning approach: Search the procedure space for a fitting model on-line [5].
  3. Model Tracing technique: Follow the student step by step [6].
Two considerations have helped guide the present approach to student modeling :
  1. The main purpose of student modeling in an ITS is to guide pedagogical decision making. A description of the student's knowledge is not necessary, unless there is some way for the tutoring system to make use of or react to it.
  2. The abundance of false knowledge requires that abstractions be used to reduce the scope. The numerous combinations hinder student modeling efforts because most techniques deal with the specifics of student's solution paths. They also try to infer the particular mental process/strategy students are employing. Current AI algorithms fall short of being able to perform this complicated diagnostic task during ongoing interactions with students. What is required is a type of description that can extract detail from a student's mental process while preserving pedagogically important properties of the student's current knowledge.
These two considerations imply that the student should be modeled in terms of equivalence classes of solutions rather than specific solutions or strategies. The members of a particular equivalence class are those learner states that require the same instructional response. In short, student modeling can be conceptualized as the problem of defining instructionally relevant equivalence classes in a computationally tractable way.

The Constraint-Based approach to student modeling [Ohlsson, 1992] suggests representing subject matter knowledge in sets of constraints. A student violating one of these constraints indicates incomplete or incorrect knowledge. These violations can be used to guide an ITS in its response. This approach suggests that runnable models of either the expert or the student may no longer be needed. It also proposes that the computations required for student modeling can be reduced to just simple pattern matching.

Constraint-based student modeling is based on the notion that students can be described in terms of entities more abstract than particular solution paths or strategies. Abstraction can be achieved in two ways:

  1. Some problem-solving steps are highly diagnostic because they follow from the fundamental ideas of the subject matter, while other steps are only a mechanical function of the task and would thus provide less information.
  2. The diagnostic information associated with a highly informative step may not necessarily reside in the step, but can also be in the properties of the problem state. Classes of problem states can therefore be defined by expressing the fundamental ideas of a domain in terms of constraints on correct problem states.
The constraint-based technique displays potential to significantly reduce the effort required to implement an ITS [3]. Its advantages are:
  1. There is no need to create a runnable expert (or ideal student) model. Hence, large scale empirical studies of domain experts can be omitted (interviews with experts may help identify key ideas in a domain).
  2. There is no need to create a bug library. Although BUGGY showed that a bug library had its merit, the constraint-based technique provides a much cheaper alternative especially in computational overheads. There is no need for sophisticated and computationally expensive AI inference mechanisms. Constraints can be compared to problem states with pattern matching algorithms which can be easily implemented. Hence, large scale empirical studies of students to create and validate such bug libraries can be omitted (although studies of incorrect solution paths could expose useful constraints).
  3. A constraint-based student model ignores the student's problem-solving strategy. This allows the flexibility to monitor free exploration, recognize creative solutions as correct, and succeed in the face of various types of inconsistency on the part of the student, including radical strategy variability. These two features are not offered by any of the systems previously studied.
  4. The constraint-based technique is neutral with respect to pedagogy. A student can be tutored after each constraint violation or after a sequence of them, as seems most appropriate for the domain. A particularly useful feature is the ability to adjust the pedagogical level of coaching. With this added flexibility, the expert model can immediately step in and guide the student, or the system can elect to offer only advice or hints, or to monitor for a sequence of erroneous actions before taking corrective actions.
The potential disadvantages of the constraint-based technique are:
  1. For some domains it might be impossible to identify properties of problem states which are highly informative with respect to the student's understanding.
  2. The set of constraints might provide too loose a net, such that many incorrect solutions slip through without violating any constraints. This might result in under-instruction or failure to respond to errors made by students, which may be misinterpreted by students as active endorsement by the system for their solution.
  3. The set of constraints might not provide the correct type of abstraction. Thus there is a danger that student behaviors would not be divided into pedagogically relevant equivalence classes.
Ohlsson demonstrates the feasibility of constrained-based tutoring with examples of fraction teaching in mathematics and teaching of the Lewis structure in organic chemistry. In both cases the student gets an exercise and tries to solve it. Ohlsson claims that in these two particular cases, if the student makes an error there is no need for an extensive bug library or diagnosis system, as it is immediately clear what the student is doing.

Applying this to teaching of data structures, it was discovered that this modeling technique is the best suited. Imagine that the students' task is to learn how to delete an element out of a linked list, e.g. element 2 as shown in Figure 1 (a) - (e). The steps the student has to perform are the following:

    1. Original list at start of exercise
    2. Create a help pointer
    3. Assign the help pointer to the tail of the list
    4. Delete (free) the element
    5. Assign the tail to the element to the one before the deleted element, e.g. element 1.
Figure 1: Deleting an element from a linked list correctly (a) - (e), incorrectly (f)

 If the student leaves out step 1 and 2, then upon deleting element 2 the tail of the linked list will be lost (Figure X (f)), as indicated by the dashed element 3. Detecting this situation it is immediately clear what kind of error the student made and the tutoring module can act upon it. It can reset the exercise and give the student useful hints, so that s/he discovers her/himself that a help pointer is needed in order to keep the tail.

With many of the documented strengths of the technique in line with the data structure ITS domain, the project will concentrate on applying this technique while implementing the ITS component. However, the potential disadvantages of the constraint-based model cannot be ignored. The expert model designed should attempt to provide a sufficiently exhaustive semantic net such that disadvantage 2 can be avoided. This will also affect the kind of pattern to be used in defining constraints.

To our knowledge the constraint-based student modeling technique was never before implemented in an ITS. Therefore there is no evidence that the technique can produce psychologically realistic or pedagogically useful student models capable of guiding instruction. However, the off-line diagnosis done by Ohlsson (1992) has demonstrated its feasibility. Based on the diagnosis, this technique is well suited with respect to the goals of the data structure tutoring system.


Constrained-based student modeling was proven to be successful for an Intelligent Tutoring System on data structures. The computational overhead that was previously needed by other ITSs was greatly reduced this way. Instead of providing an extensive bug list or diagnosis model, a more simple constrained-based error list is needed. In combi

nation with a sophisticated tutoring method such as socratic dialogue the system will serve as an intelligent interactive tutoring system during laboratory sessions.


[1] Brown, J. S., R. Burton, M. Miller, J. DeKleer, S. Purcell, C. Hauseman and R. Bobrow (1975): Steps towards a Theoretical Foundation for Complex, Knowledge-Based CAI. Technical Report 3135, Bolt, Beranek and Newman.

[2] Warendorf, K. (1996): Combining interactive multimedia and inquiry teaching to build an intelligent multimedia tutoring system, Proceedings of ED-MEDIA 96, Boston, Pg. 814.

[3] Ohlsson, S. (1992): "Constraint-Based Student Modeling". Journal of Artificial Intelligence in Education 3(4), (pp. 429-447).

[4] Burton, R. (1982): "Diagnosing bugs in a simple procedural skill" in D. H. Sleeman and J. S. Brown (eds.): Intelligent Tutoring Systems (pp.17-183). New York: Academic Press.

[5] Ohlsson, S., Langley, P. (1988): "Psychological evaluation of path hypotheses in cognitive diagnosis" in Mandl H., Lesgold A. (eds.): Learning issues for intelligent tutoring systems. (pp. 42-62). Springer-Vlg.

[6] Anderson, J. R., Boyle, C. F., Corbett, A. T., and Lewis, M. W. (1990): "Cognitive modeling and intelligent tutoring". Artificial Intelligence 42, (pp. 7-49).