CS 750 / INFS 755 Theory and Applications of Data Mining

Summer 2009

Prerequisite: CS 688 or permission of instructor

A01   41083 / 41088   5/18 MWF  7:00 pm – 10:05 pm   SITE I 206

Office Hours: TBD (ENGR - 4448)

Catalog description: Concepts and techniques in data mining and multidisciplinary applications. Topics include databases; data cleaning and transformation; concept description; association and correlation rules; data classification and predictive modeling; performance analysis and scalability; data mining in advanced database systems, including text, audio, and images; and emerging themes and trends. A term team project and a topical review are required.

Instructor: Prof. Harry Wechsler http://cs.gmu.edu/~wechsler/

Textbook:

Tan, Steinbach and Kumar, Introduction to Data Mining, Pearson / Addison Wesley, 2006

Textbook slides: http://www-users.cs.umn.edu/~kumar/dmbook/

References

WEKA web site for data mining software

http://www.togaware.com/datamining/survivor/Weka.html

UCI Machine Learning Repository Content Summary

http://www.ics.uci.edu/~mlearn/MLSummary.html

Syllabus:

·        databases, data warehousing, data mining, knowledge discovery, and the Semantic Web (W3C) http://www.w3.org/2001/sw

·        prediction, model selection, validation, and performance evaluation

·        data exploration

·        data reduction and transformation

·        classification

·        association

·        clustering

·        anomaly detection

·        applications: biometrics

Grading:

·        Homework Assignments 10%

·        Quizzes (5/26 & 6/5) 15%

·        Midterm (June 9) 20%

·        Term TEAM Project 25%

·        Final (June 19) 30%

Honor Code

You are expected to abide by the honor code. Homework assignments and exams are individual efforts. Information on the university honor code can be found at: http://jiju.gmu.edu/catalog/apolicies/honor.html