Professor Harry Wechsler

Department of Computer Science

George Mason University

Fairfax, VA 22030

e-mail : wechsler@cs.gmu.edu

web : http://cs.gmu.edu/~wechsler/

(703) 993-1533 (office)

(703) 993-1530 (sec)

(703) 993-1710 (fax)

 

GEORGE MASON UNIVERSITY

SPRING 2005

CS 844 --- PATTERN RECOGNITION

001   13474   T   4:30 p.m. – 7:10 p.m.   IN 327

[except on March 29 and April 5, when the class is held in ST2 260]

IT 844 --- PATTERN RECOGNITION

001   11307   T   4:30 p.m. – 7:10 p.m.   IN 327

[except on March 29 and April 5, when the class is held in ST2 260]

Office Hours

T   3:15 p.m. - 4:00 p.m. or by appointment (SITE II - Rm. 461)

Textbook

1. A. Webb, Statistical Pattern Recognition (2nd ed.), Wiley, 2004.

References

1. C. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995.

2. V. Cherkassky and F. Mulier, Learning from Data: Concepts, Theory, and Methods, Wiley, 1999.

3. N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, 2001.

4. R. Duda, P. Hart and D. Stork, Pattern Classification, Wiley, 2002.

5. S. Haykin, Neural Networks, Prentice-Hall, 1999.

6. B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.

7. J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, 2004.

8. V. Vapnik, Statistical Learning Theory, Wiley, 1998.

 

       

Course Description

The course covers the Statistical Pattern Recognition (SPR), the Neural Network (NN), and the Statistical Learning Theory (SLT) approaches to Pattern Recognition (PR). Topics include decision theory and Bayes’ theorem, density estimation (parametric and non-parametric), linear (MSE and LMS) and non-linear discriminant analysis, SVM (support vector machines) and kernel methods, SRM (structural risk minimization) and model selection, performance evaluation, mixture of experts (AdaBoost), feature selection and extraction, and clustering. Experimental design, applications, and performance evaluation are emphasized throughout the course.

 

Schedule

First Day of Classes: January 25, 2005

Spring Break: March 15, 2005

Last Day of Classes: May 3, 2005

          

Grading

1. Homework Assignments: 50% (#1: 12.5%; #2: 12.5%; #3: 25%)

assignment #1 -- due February 15 -- elementary decision theory (see Sect. 1.5.1 in the textbook, pp. 6 – 16), including {Bayes minimum error and risk, reject option, and Neyman – Pearson decision rule}. Task: “roll” (enough times) two pairs of dice, one pair fair (faces 1 – 6) and one pair not fair (faces 3 – 8); roll one pair at a time and report the sum x = s of the faces of the two dice. For the fair pair s = 2 .. 12, while for the non-fair pair s = 6 .. 16. The decision required is to guess, in an “optimal” fashion, which pair was rolled. Simulate the task on a computer, duplicate the derivations found in Sect. 1.5.1 to minimize error and/or risk, and graph the results accordingly.
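As a starting point for assignment #1, the minimum-error part of the task can be sketched as follows. This is only an illustrative sketch, not a reference solution: it assumes each face of the non-fair dice (3 to 8) is equally likely, and the function names (`sum_pmf`, `bayes_decide`, `simulate`, `bayes_error`) and the equal default priors are choices made here for illustration. Risk minimization, the reject option, and the Neyman-Pearson rule are left out.

```python
import random
from collections import Counter

def sum_pmf(lo, hi):
    """Exact pmf of the sum of two independent dice with faces lo..hi (assumed uniform)."""
    n = hi - lo + 1
    counts = Counter(a + b for a in range(lo, hi + 1) for b in range(lo, hi + 1))
    return {s: c / (n * n) for s, c in counts.items()}

FAIR = sum_pmf(1, 6)      # class-conditional pmf, sums 2..12
NOT_FAIR = sum_pmf(3, 8)  # class-conditional pmf, sums 6..16

def bayes_decide(s, prior_fair=0.5):
    """Minimum-error rule: choose the class with the larger (unnormalized) posterior."""
    post_fair = FAIR.get(s, 0.0) * prior_fair
    post_not_fair = NOT_FAIR.get(s, 0.0) * (1.0 - prior_fair)
    return "fair" if post_fair >= post_not_fair else "not fair"

def bayes_error(prior_fair=0.5):
    """Theoretical minimum error: sum over s of the smaller joint probability."""
    return sum(
        min(FAIR.get(s, 0.0) * prior_fair, NOT_FAIR.get(s, 0.0) * (1.0 - prior_fair))
        for s in range(2, 17)
    )

def simulate(n_rolls=100_000, prior_fair=0.5, seed=0):
    """Roll one pair at a time, classify the sum, and return the empirical error rate."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_rolls):
        is_fair = rng.random() < prior_fair
        lo, hi = (1, 6) if is_fair else (3, 8)
        s = rng.randint(lo, hi) + rng.randint(lo, hi)
        truth = "fair" if is_fair else "not fair"
        errors += bayes_decide(s, prior_fair) != truth
    return errors / n_rolls

if __name__ == "__main__":
    print("theoretical Bayes error:", bayes_error())
    print("simulated error rate:   ", simulate())
```

Varying `prior_fair` and plotting the simulated error against `bayes_error` is one way to produce the graphs the assignment asks for.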

assignment #2 -- due February 22 -- EM. Consider three Gaussian pdfs N(1.0, 0.1), N(1.5, 0.1), and N(2.0, 0.2). Generate 500 samples according to the following rule: the first two samples are generated from the 2nd Gaussian, the 3rd sample from the 1st Gaussian, and the 4th sample from the last Gaussian. This rule repeats until all 500 samples have been generated. The pdf underlying the random samples is modeled as a mixture SUM (i = 1..3) N(m(i), sigma**2(i)) P(i). Use the EM algorithm on the generated samples to estimate the unknown parameters {m(i), sigma**2(i), and P(i)}. Display/graph the three (3) original Gaussians and the mixture approximation. Discuss your results in terms of accuracy and convergence (number of steps).
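The sampling rule and the EM iteration for assignment #2 might be sketched as below. This is a hedged illustration rather than a prescribed solution: it assumes the second argument of N(., .) is the variance (the text does not say whether it is a variance or a standard deviation), and the initial guesses, the stopping tolerance, the variance floor, and the names `generate_samples` and `em_gmm` are all choices made here. Under the stated rule the true mixing weights would be 0.25, 0.5, and 0.25.

```python
import math
import random

def generate_samples(n=500, seed=0):
    """Cycle through the 2nd, 2nd, 1st, 3rd Gaussian until n samples exist."""
    rng = random.Random(seed)
    params = [(1.0, 0.1), (1.5, 0.1), (2.0, 0.2)]  # (mean, variance): assumed reading
    order = [1, 1, 0, 2]                           # component index for each slot of the rule
    samples = []
    for k in range(n):
        mean, var = params[order[k % 4]]
        samples.append(rng.gauss(mean, math.sqrt(var)))
    return samples

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm(xs, means, variances, priors, max_iter=500, tol=1e-8):
    """EM for a 1-D Gaussian mixture; returns estimates and the log-likelihood history."""
    means, variances, priors = list(means), list(variances), list(priors)
    n, k = len(xs), len(means)
    history = []
    for _ in range(max_iter):
        # E-step: responsibilities of each component for each sample.
        resp, loglik = [], 0.0
        for x in xs:
            w = [priors[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(w)
            loglik += math.log(total)
            resp.append([wj / total for wj in w])
        history.append(loglik)
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break  # log-likelihood has stopped improving
        # M-step: re-estimate means, variances, and mixing weights.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # floor guards against a collapsing component
            priors[j] = nj / n
    return means, variances, priors, history

if __name__ == "__main__":
    xs = generate_samples()
    m, v, p, hist = em_gmm(xs, [0.8, 1.4, 2.2], [0.2, 0.2, 0.2], [1 / 3, 1 / 3, 1 / 3])
    print("means:", m)
    print("variances:", v)
    print("priors:", p)
    print("steps:", len(hist))
```

Because the three components overlap heavily, the estimates can depend noticeably on the initial guesses; the length of the returned log-likelihood history gives the number of steps to report.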

assignment #3 -- due March 29 -- (Data) Structure <Representation> and Algorithm <Classifier>. Use two (2) data sets to assess different {representation, classifier} combinations in terms of overall performance for multi-class (c > 2) classification. The first data set should be 2D to allow for display and visualization, while the second data set should be multidimensional. Representations include {PCA <principal component analysis>, LDA <linear discriminant analysis>, NPLDA <non-parametric LDA>, EPP <exploratory projection pursuit> / EP <evolutionary pursuit> [Liu and Wechsler, IEEE Trans. on PAMI, 2000]}. Classifiers include {RBF <radial basis functions>, BP <back propagation>}. You can include additional representations, e.g., MDS <multidimensional scaling>, and/or classifiers, e.g., SVM <support vector machines>. Briefly define each of the methods used, and describe the experimental design (data acquisition and software used), the comparative performance observed (raw data vs. transformed data), limitations and sensitivity {to parameter choice and/or noise}, and conclusions. A PowerPoint presentation is expected and should be made available to the other students on the web.
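For the performance comparison in assignment #3, one common way to summarize multi-class results is a confusion matrix with overall accuracy and per-class recall. The sketch below is only an illustration of that bookkeeping; the helper names and the tiny label vectors in the usage section are hypothetical, and the representations and classifiers themselves are not implemented here.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def per_class_recall(matrix):
    """For each true class, the fraction of its samples that were recovered."""
    return [
        row[i] / sum(row) if sum(row) else float("nan")
        for i, row in enumerate(matrix)
    ]

if __name__ == "__main__":
    # Hypothetical labels for a 3-class problem, for illustration only.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
    print(cm)
    print(accuracy(cm))
    print(per_class_recall(cm))
```

Computing the same summary once on raw data and once on each transformed representation gives a like-for-like table of {representation, classifier} combinations.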

2. PROJECT: 50% - (2.1) Literature Survey, (2.2) Method, (2.3) Experimental Setup {data acquisition + software}, Results, and Performance Evaluation, (2.4) Analysis and Future Work, and (2.5) Class Presentation [April 26 and May 3].
The topic and scope of the project are to be agreed upon with the instructor.