CS 750 Theory and Applications of Data Mining

Spring 2009

Prerequisite: CS 688 or permission of instructor

001   14007   R  7:20 pm – 10:00 pm   IN 136

Office Hours: R 6:15 pm – 7:15 pm (ST II – rm. 461)

Catalog description: Concepts and techniques in data mining and multidisciplinary applications. Topics include databases; data cleaning and transformation; concept description; association and correlation rules; data classification and predictive modeling; performance analysis and scalability; data mining in advanced database systems, including text, audio, and images; and emerging themes and trends. A term team project and a topical review are required.

Instructor:  Prof. Harry Wechsler http://cs.gmu.edu/~wechsler/

Textbook:

Tan, Steinbach and Kumar, Introduction to Data Mining, Pearson / Addison Wesley, 2006

Textbook slides: http://www-users.cs.umn.edu/~kumar/dmbook/

References

WEKA web site for data mining software

http://www.togaware.com/datamining/survivor/Weka.html

UCI Machine Learning Repository Content Summary

http://www.ics.uci.edu/~mlearn/MLSummary.html

Syllabus:

•      databases, data warehousing, data mining, knowledge discovery, and the Semantic Web http://www.w3.org/2001/sw

•      data exploration

•      data reduction and transformation

•      classification

•      association

•      clustering

•      anomaly detection

•      applications: web mining

Grading:

•      Homework Assignments 15%

•      Midterm 25%

•      Term Team Project 30%

•      Final (Wed. 5/7) 30%

Honor Code

You are expected to abide by the honor code. Homework assignments and exams are individual efforts. Information on the university honor code can be found at: http://jiju.gmu.edu/catalog/apolicies/honor.html