Spring 2022: Data Mining [CS484]

Professor: Carlotta Domeniconi, Rm 4424 ENG, carlotta\AT\cs.gmu.edu. Office hours: TBD
Teaching Assistant: TBD
Prerequisites: CS310 and STAT344 (C or better in both).
Students should be familiar with basic probability and statistics concepts and with linear algebra. Programming experience in Python is preferred.
Experience in Java or C will work as well, but the assignments will use the Python framework. Expect a lot of programming in every assignment.
Location and Time: We meet in Horizon Hall 2009, MW 12:00pm - 1:15pm
Textbook: P. N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson. Book's companion website
Course Web Page

General Description and Preliminary List of Topics

Data mining is the process of automatically discovering useful information in large data repositories. The course covers key concepts and algorithms at the core of data mining.

Topics include: classification, clustering, association analysis, and anomaly detection.

Outcomes

The ability to apply computing principles, probability and statistics relevant to the data mining discipline to analyze data.
A thorough understanding of model programming with data mining tools and of algorithms for estimation, prediction, and pattern discovery.
The ability to analyze a problem, identifying and defining the computing requirements appropriate to its solution: data collection and preparation, functional requirements, selection of models and prediction algorithms, software, and performance evaluation.
The ability to understand performance metrics used in the data mining field to interpret the results of applying an algorithm or model, to compare methods and to reach conclusions about data.
The ability to communicate effectively to an audience the steps followed and the results obtained in solving a data mining problem (through a term project).

Grading

Assignments: 50%
Midterm: 25%
Final: 25%
Extra credit: participation; winners of homework competitions

Exams are held in class and are closed book. A missed exam cannot be made up. All assignments must be completed individually unless otherwise specified.

Honor Code Statement

The GMU Honor Code is in effect at all times. In addition, the CS department has its own Honor Code policies regarding programming assignments. Any deviation from the GMU or the CS department Honor Code is considered an Honor Code violation.

Disabilities

If you have a documented learning disability or other condition that may affect academic performance, make sure this documentation is on file with the Office of Disability Services, and come talk to me about accommodations.