CS 750 / INFS 755 Theory and Applications of Data Mining

Summer 2009

Prerequisite: CS 688 or permission of instructor

A01 41083 / 41088 5/18 MWF 7:00 pm - 10:05 pm SITE I 206

Office Hours: TBD (ENGR - 4448)

Catalog description: Concepts and techniques in data mining and multidisciplinary applications. Topics include databases; data cleaning and transformation; concept description; association and correlation rules; data classification and predictive modeling; performance analysis and scalability; data mining in advanced database systems, including text, audio, and images; and emerging themes and trends. A term team project and a topical review are required.

Instructor:  Prof. Harry Wechsler http://cs.gmu.edu/~wechsler/

Textbook:

Tan, Steinbach and Kumar, Introduction to Data Mining, Pearson / Addison Wesley, 2006

Textbook slides: http://www-users.cs.umn.edu/~kumar/dmbook/

References

WEKA web site for data mining software

http://www.togaware.com/datamining/survivor/Weka.html

UCI Machine Learning Repository Content Summary

http://www.ics.uci.edu/~mlearn/MLSummary.html

Syllabus:

        databases, data warehousing, data mining, knowledge discovery, and the Semantic Web (W3C) http://www.w3.org/2001/sw

        prediction, model selection, validation, and performance evaluation

        data exploration

        data reduction and transformation

        classification

        association

        clustering

        anomaly detection

        applications: biometrics

Grading:

        Homework Assignments 10%

        Quizzes (5/26 & 6/5) 15%

        Mid Term June 9 20%

        Term TEAM Project 25%

        Final June 19 30%

Honor Code

You are expected to abide by the honor code. Homework assignments and exams are individual efforts. Information on the university honor code can be found at: http://jiju.gmu.edu/catalog/apolicies/honor.html