Class 11: Learning: Version Spaces

November 22, 2004

Introduction

 

Example: A navigating robot must classify obstacles as movable or immovable to know whether it can push through them or must stop and find a way around them. The robot may have a classification system based on dimensions and shapes. When it encounters a new object, it may not have enough data to make a decision, so it tries to go through and sees whether the object is an immovable obstacle. It then adjusts its classification system so that it recognizes this object in the future.

 

These adjustments constitute a particular sort of learning. Learning involves changes to the content and organization of a system's knowledge that enable it to improve its performance on a particular task or set of tasks. Learning occurs when the system acquires new knowledge from its environment or when it reorganizes its current knowledge to make better use of it.

 

We will deal with learning methods that use inductive inference. Inductive inference arrives at general conclusions by examining particular examples. For instance, when a robot finds that several tall and wide objects it has encountered are immovable, it may conclude that all tall and wide objects are immovable.

 

Learning based on inductive inference can be divided into supervised and unsupervised learning.

 

Supervised Learning

 

In supervised learning, the learning program is given a sequence of input/output pairs <xi, yi>, where xi is a possible input and yi is the output associated with xi. Such pairs are called examples and are assumed to come from some unknown function. The learning program is expected to learn a function f that accounts for all examples seen so far, f(xi) = yi, and that makes good guesses for the outputs of inputs it has not seen. Examples: showing a robot obstacles together with their classification; showing a driving robot different views together with the appropriate reactions.

 

If the function is discrete valued, then the outputs are called classes, and the learning task is called classification. If there are only two possible outputs, then the learned function is called a concept, and the task is called concept learning.

 

Unsupervised Learning

 

In this type of learning, the system is not presented with conveniently packaged examples. In reinforcement learning, for example, a feedback signal is used to give the learning program an indication of whether or not it has learned correctly. For example, a navigating robot tries to push an object. If it does not succeed for a long time, this may indicate the obstacle is immovable. Playing chess is another example. In this case, the objective of learning is to maximize the expectation of reward (or to minimize the expectation of punishment).

 

Theory of Inductive Inference

 

We focus on concept learning. Assume X is the set of all examples. Formally, a concept C ⊂ X is the subset of all examples belonging to the concept. In concept learning, the learning program is presented with a set of training examples drawn from X, each labeled positive or negative.

 

The learning program is given examples of the form <x, y>, where x ∈ X, and y = 1 if x ∈ C, otherwise y = 0. The objective is to learn a function f such that f(x) = 1 if x ∈ C and f(x) = 0 if x ∉ C.
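
As an illustration of this setup, the following is a minimal sketch, assuming instances are represented as lists of (attribute value) pairs. The target concept "tall and wide" and the names in-concept-p, label, and agrees-p are hypothetical and chosen only for this example.

```lisp
;; Hypothetical target concept C: the set of "tall and wide" objects.
;; An instance x is assumed to be a list of (attribute value) pairs.
(defun in-concept-p (x)
  (and (member '(height tall) x :test #'equal)
       (member '(width wide) x :test #'equal)
       t))

;; Build the labeled example <x, y>: y = 1 if x is in C, otherwise y = 0.
(defun label (x)
  (list x (if (in-concept-p x) 1 0)))

;; A hypothesis f accounts for the examples if f(x) = y for every pair.
(defun agrees-p (f examples)
  (every #'(lambda (pair) (= (funcall f (first pair)) (second pair)))
         examples))
```

A hypothesis that always answers 1 would fail agrees-p as soon as the training set contains one negative example.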

 

As described so far, the problem is not defined well enough to formulate a solution precisely. We need to specify the set of functions from which we select f. We therefore assume that f is drawn from a space of possible hypotheses, and denote this hypothesis space by H. To make concept learning well defined, we need to introduce constraints on H. Such a constraint is called an inductive bias. The bias gives the learning program a basis for choosing among possible representations of f.

 

Version Spaces

 

Learning theory postulates that, in some cases, finding a hypothesis consistent with the training examples is sufficient for effective learning. However, finding such a hypothesis can be time-consuming. We now study methods that cope with this complexity by implementing an inductive bias. The first method, called the version-space method, exploits the structure of a restricted hypothesis space to maintain bounds on the set of all concepts consistent with a set of training examples. The method processes one training example at a time and is asymptotically optimal for the restricted hypothesis space consisting of conjunctions of positive literals.

 

A concept C1 is a specialization of a concept C2 if C1 ⊂ C2; C2 is then a generalization of C1. For example, the concept "tall object" is more general than "tall and wide object". A version space is a graph whose nodes are concepts and whose arcs specify immediate specializations. For example, to classify obstacles by height, width, and shape we have:

 

∧

tall    short    wide    narrow    star    circle    rectangle

tall ∧ wide    tall ∧ narrow    short ∧ circle    ...

tall ∧ wide ∧ circle    ...    short ∧ circle ∧ narrow    ...

 

 

The version space is represented by using three dimensions: (height, {tall, short}), (width, {wide, narrow}), and (shape, {star, circle, rectangle}). We present an algorithm that modifies a general and specific boundary after each training example.
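
For conjunctive concepts like these, the specialization relation can be tested by comparing conjuncts: adding a conjunct shrinks the set of objects the concept covers. The following is a minimal sketch, assuming a concept is a list of (attribute value) conjuncts without duplicates; both function names are hypothetical.

```lisp
;; A concept is assumed to be a list of conjuncts,
;; e.g. ((height tall) (width wide)) for "tall and wide".
;; GENERAL is equal to or more general than SPECIFIC when every
;; conjunct of GENERAL also appears in SPECIFIC.
(defun more-general-or-equal-p (general specific)
  (subsetp general specific :test #'equal))

;; SPECIFIC is an immediate specialization of GENERAL (one arc in the
;; graph) when it adds exactly one conjunct to GENERAL.
(defun immediate-specialization-p (specific general)
  (and (more-general-or-equal-p general specific)
       (= (length specific) (1+ (length general)))))
```

Under this representation the empty list is the most general concept, since it places no constraint on an object.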

 

The idea is to take one training example at a time and adjust the current general and specific boundaries. If an example is positive, say "tall and wide", and the specific boundary contains "tall and narrow", we may generalize that concept to "tall". If an example is negative, say "short and narrow", we specialize each concept in the general boundary that is consistent with it (e.g. "short", "narrow") until it is no longer consistent.

 

A concept C is guaranteed to include all positive examples and exclude all negative examples if there exist C' and C'' as follows:

 

1. C' is in the general boundary

2. C is equal to or more specific than C'

3. C'' is in the specific boundary

4. C is equal to or more general than C''

5. C' is equal to or more general than C''

 

C' --- C --- C''
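
These conditions can be checked mechanically. The following is a minimal sketch, assuming concepts are lists of (attribute value) conjuncts and each boundary is a list of such concepts; the function names are hypothetical. Condition 5 is not tested separately because it follows from conditions 2 and 4.

```lisp
;; GENERAL is equal to or more general than SPECIFIC when every
;; conjunct of GENERAL also appears in SPECIFIC.
(defun more-general-or-equal-p (general specific)
  (subsetp general specific :test #'equal))

;; True when some C' in the general boundary is at least as general as
;; CONCEPT and some C'' in the specific boundary is at least as specific.
(defun between-boundaries-p (concept general-boundary specific-boundary)
  (and (some #'(lambda (g) (more-general-or-equal-p g concept))
             general-boundary)
       (some #'(lambda (s) (more-general-or-equal-p concept s))
             specific-boundary)
       t))
```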

 

Algorithm

 

The following algorithm guarantees convergence to a single concept if there is just one concept consistent with all training examples:

 

- If the example is positive:
    - Eliminate all concepts from the general boundary C' that are not consistent with the example.
    - Minimally generalize all concepts in the specific boundary C'' until they are consistent with the example, and eliminate those that fail either of the following requirements:
        - Each concept is a specialization of some concept in the general boundary C'.
        - Each concept is not a generalization of some other concept in the specific boundary C''.
- If the example is negative:
    - Eliminate all concepts in the specific boundary C'' that are consistent with the example.
    - Minimally specialize all concepts in the general boundary C' until they are not consistent with the example, and eliminate those that fail either of the following requirements:
        - Each concept is a generalization of some concept in the specific boundary C''.
        - Each concept is not a specialization of some other concept in the general boundary C'.
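
The two pruning requirements for a positive example can be sketched as a filter over the candidate concepts. This is an illustration under stated assumptions, not part of the listing in these notes: concepts are lists of (attribute value) conjuncts, and the names more-general-or-equal-p and prune-specific are hypothetical.

```lisp
(defun more-general-or-equal-p (general specific)
  (subsetp general specific :test #'equal))

;; Keep only the candidates for the specific boundary that
;; (1) specialize some concept in the general boundary, and
;; (2) are not strictly more general than another candidate.
(defun prune-specific (candidates general-boundary)
  (remove-if-not
   #'(lambda (c)
       (and (some #'(lambda (g) (more-general-or-equal-p g c))
                  general-boundary)
            (notany #'(lambda (other)
                        (and (more-general-or-equal-p c other)
                             (not (more-general-or-equal-p other c))))
                    candidates)))
   candidates))

(prune-specific '(((shape star)) () ((height tall) (shape star)) ((height tall)))
                '(()))
;; => (((HEIGHT TALL) (SHAPE STAR)))
```

The pruning for a negative example is symmetric, with the roles of the two boundaries and the direction of the generality test exchanged.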

 

Explain an example with navigating robot.

 

Lisp Implementation

 

Define the following data structures.

 

A feature is an attribute and value:

 

(defun make-FEATURE (attribute value) (list attribute value))

(defun FEATURE-attribute (feature) (first feature))

(defun FEATURE-value (feature) (second feature))

 

A dimension is an attribute with a set of possible values:

 

(defun make-DIMENSION (attribute values) (list attribute values))

(defun DIMENSION-attribute (dimension) (first dimension))

(defun DIMENSION-values (dimension) (second dimension))

An example consists of an identifier, a set of features (conjuncts), and an expression representing a class. In our case the class is yes or no.

 

(defun make-EXAMPLE (id features class) (list id features class))

(defun EXAMPLE-id (example) (first example))

(defun EXAMPLE-features (example) (second example))

(defun EXAMPLE-class (example) (third example))

 

The functions dimensions and classes return the lists of dimensions and classes, respectively:

 

(defun dimensions()

      (list

            (make-DIMENSION 'height '(tall short))

            (make-DIMENSION 'width '(wide narrow))

            (make-DIMENSION 'shape '(star circle rectangle))

      )

)

(defun classes () '(yes no))

 

Examples are represented as follows:

 

(setq examples

      (list (make-example '1 '((height tall)(width wide)(shape star)) 'yes)

             ...

      )

)

 

;; The initial general boundary contains only the empty conjunction

;; (the most general concept); the specific boundary contains the

;; feature lists of the examples. (In the first step of the algorithm

;; the first example will eliminate all but its own conjuncts.)

(setq boundaries

      (list (list '())

            (mapcar #'EXAMPLE-features examples)

      )

)

 

;; Utility: return all items from a given list on which

;; a given function returns non-nil

(defun findall (list test)

      (cond

            ((null list) nil)

            ;; funcall passes the item as a single argument; apply would spread it
            ((funcall test (first list)) (cons (first list) (findall (rest list) test)))

            (t (findall (rest list) test))

      )

)
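
For comparison, Common Lisp's standard remove-if-not performs the same kind of filtering, with the predicate as the first argument. A brief sketch (the feature list is made up for illustration):

```lisp
;; Keep the items on which the predicate returns non-nil.
(remove-if-not #'evenp '(1 2 3 4 5 6))
;; => (2 4 6)

;; Keep the features whose attribute is height.
(remove-if-not #'(lambda (feature) (eq (first feature) 'height))
               '((height tall) (width wide) (height short)))
;; => ((HEIGHT TALL) (HEIGHT SHORT))
```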

 

;; Utility: get the list of all features for a given problem:

;; ((height (tall short))) => ((height tall) (height short))

;; mapcan is like mapcar, but splices the resulting lists together:

;; (mapcan #'list '(a b c) '(1 2 3))

;; (a 1 b 2 c 3)

(defun features ()

      (mapcan

            #'(lambda(dim)

                  (mapcar

                        #'(lambda (value)

                              (list (DIMENSION-attribute dim) value))

                        (DIMENSION-values dim)

                  ))

            (dimensions)

      )

)
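
To show the expansion in isolation, here is a self-contained sketch that takes the dimension list as an argument instead of calling dimensions. The name expand-dimensions is hypothetical; first and second stand in for the DIMENSION accessors.

```lisp
;; Turn each (attribute (value ...)) dimension into a flat list of
;; (attribute value) features.
(defun expand-dimensions (dims)
  (mapcan #'(lambda (dim)
              (mapcar #'(lambda (value) (list (first dim) value))
                      (second dim)))
          dims))

(expand-dimensions '((height (tall short)) (width (wide narrow))))
;; => ((HEIGHT TALL) (HEIGHT SHORT) (WIDTH WIDE) (WIDTH NARROW))
```

Because mapcan concatenates destructively, the inner mapcar matters: it builds a fresh list on every call, so no shared or constant list structure is modified.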

 

;; Refine: the main test on an example to apply different strategies

;; in case of positive or negative

(defun refine (example general specific)

      (if (eq (EXAMPLE-class example) 'yes)

            (list general (generalize-specific example specific))

            (list (specialize-general example general) specific)

      )

)
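
The listing does not show how refine is applied across the whole training set. One plausible driver, given here as an assumption and not as part of the original program, threads the two boundaries through the examples one at a time. The name learn is hypothetical, and the refinement step is passed in as a function so that the sketch runs on its own.

```lisp
;; Apply REFINE-FN to each example in turn, carrying the updated
;; boundaries forward. REFINE-FN takes (example general specific) and
;; returns a two-element list (new-general new-specific).
(defun learn (examples general specific refine-fn)
  (dolist (example examples (list general specific))
    (let ((result (funcall refine-fn example general specific)))
      (setf general (first result)
            specific (second result)))))

;; Demonstration with a stand-in refinement step that only records the
;; order in which examples are processed.
(learn '(1 2 3) '() '()
       #'(lambda (example general specific)
           (list (cons example general) specific)))
;; => ((3 2 1) NIL)
```

With the definitions in these notes, the call would pass #'refine as the last argument.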

 

;; Generalize specific: generalizing each concept until it is consistent

;; with the positive example

(defun generalize-specific (example boundary)

      (mapcan

            #'(lambda (concept) (run-generalize-specific example concept))

            boundary

      )

)

(defun run-generalize-specific (example concept)

      (if (consistent (EXAMPLE-features example) concept)

            (list concept)

            (generalize-specific example (generalize concept))

      )

)

 

;; Specialize general: specializing each concept until it is not consistent

(defun specialize-general (example boundary)

      (mapcan

            #'(lambda (concept) (run-specialize-general example concept))

            boundary

      )

)

(defun run-specialize-general (example concept)

      (if (not (consistent (EXAMPLE-features example) concept))

            (list concept)

            (specialize-general example (specialize concept))

      )

)

 

;; Consistency check: for each feature(conjunct) of the concept the attribute is either:

;;    not mentioned in the example;

;;    mentioned in the example and has the same value

;;

(defun consistent (features concept)

      ;; every tests if each element of the list satisfies the predicate

      (every

            #'(lambda (f)

                  (or

                        ;;assoc tests a list like ((key1 val1) (key2 val2)...)

                        (not (assoc (FEATURE-attribute f) features))

                        (member f features :test #'equal)

                  ))

            concept

      )

)
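
A few sample calls make the consistency test concrete. The sketch below restates the same check under the hypothetical name consistent-sketch so that it runs on its own; first stands in for FEATURE-attribute.

```lisp
;; A concept is consistent with a feature list when each of its
;; conjuncts is either about an attribute the list does not mention
;; or appears in the list with the same value.
(defun consistent-sketch (features concept)
  (every #'(lambda (f)
             (or (not (assoc (first f) features))
                 (member f features :test #'equal)))
         concept))

(defparameter *tall-wide-star* '((height tall) (width wide) (shape star)))

(consistent-sketch *tall-wide-star* '((height tall)))                 ; true
(consistent-sketch *tall-wide-star* '((height tall) (shape circle)))  ; false: shape differs
(consistent-sketch *tall-wide-star* '())                              ; true: no constraints
(consistent-sketch '((height tall)) '((width wide)))                  ; true: width not mentioned
```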

 

;; Generalization: simply remove features

(defun generalize (concept)

      (mapcar

            #'(lambda (feature) (remove feature concept :test #'equal))

            concept

      )

)
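
For instance, generalizing a three-conjunct concept yields the three two-conjunct concepts directly above it in the graph. The sketch restates the function under the hypothetical name generalize-sketch so that it runs on its own.

```lisp
;; Each result drops exactly one conjunct from the concept.
(defun generalize-sketch (concept)
  (mapcar #'(lambda (feature) (remove feature concept :test #'equal))
          concept))

(generalize-sketch '((height tall) (width wide) (shape star)))
;; => (((WIDTH WIDE) (SHAPE STAR))
;;     ((HEIGHT TALL) (SHAPE STAR))
;;     ((HEIGHT TALL) (WIDTH WIDE)))

(generalize-sketch '())
;; => NIL
```

The empty concept has no generalizations, which is what stops the recursion in run-generalize-specific: the empty concept is consistent with every example.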

 

;; Specialize: add features that are consistent with concept

(defun specialize (concept)

      (mapcar

            #'(lambda (feature) (cons feature concept))

            (findall

                  (features)

                  #'(lambda (feature)

                        (and

                              (consistent (list feature) concept)

                              (not (member feature concept :test #'equal))

                        )

                  )

            )

      )

)
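
The effect of specialize is to add one feature for each attribute the concept does not yet constrain. The following self-contained sketch shows the same behavior in a simplified form; the names *all-features* and specialize-sketch are hypothetical, and the feature list is written out by hand instead of being computed by features.

```lisp
;; All features of the obstacle problem, written out explicitly.
(defparameter *all-features*
  '((height tall) (height short) (width wide) (width narrow)
    (shape star) (shape circle) (shape rectangle)))

;; Add, in turn, every feature whose attribute is not already
;; constrained by the concept.
(defun specialize-sketch (concept)
  (mapcar #'(lambda (feature) (cons feature concept))
          (remove-if #'(lambda (feature) (assoc (first feature) concept))
                     *all-features*)))

(specialize-sketch '((height tall)))
;; => (((WIDTH WIDE) (HEIGHT TALL)) ((WIDTH NARROW) (HEIGHT TALL))
;;     ((SHAPE STAR) (HEIGHT TALL)) ((SHAPE CIRCLE) (HEIGHT TALL))
;;     ((SHAPE RECTANGLE) (HEIGHT TALL)))
```

A concept that already constrains height, width, and shape has no specializations, so the result is NIL.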