Course Description
Basic principles and methods for data analysis and
knowledge discovery. Emphasizes developing basic skills for modeling
and prediction, on one side, and performance evaluation, on the other.
Topics include system design; data quality, preprocessing, and
association; event classification; clustering; biometrics; business
intelligence; and mining complex types of data.
Instructor: Dr. Jessica Lin
Office: Engineering Building 4419
Phone: 703-993-4693
Email: jessica [AT] cs [DOT] gmu [DOT] edu
Office Hours: TBA
TA: Jatin Mistry
Email: jmistry2 [AT] gmu [DOT] edu
Office Hours: TBA
Classes
Tuesday/Thursday, 12:00-1:15pm
Art & Design Building 2026
Course Outcomes
- The ability to apply computing principles, probability, and statistics relevant to the data mining discipline to analyze data.
- A thorough understanding of model programming with data mining tools and of algorithms for estimation, prediction, and pattern discovery.
- The ability to analyze a problem, identifying and defining the computing requirements appropriate to its solution: data collection and preparation, functional requirements, selection of models and prediction algorithms, software, and performance evaluation.
- The ability to understand performance metrics used in the data mining field in order to interpret the results of applying an algorithm or model, to compare methods, and to reach conclusions about data.
- The ability to communicate effectively to an audience the steps followed and the results obtained in solving a data mining problem (through a term project).
Prerequisites: Grade of C or better in CS 310 and STAT 344
Grading
Assignments: 20%
Project: 20%
Midterms: 30%
Final: 30%
Exams
There will be two midterm exams and one final exam covering lectures and
readings. All exams will be in class, closed book. The final exam is
comprehensive. Exams must be taken at the scheduled time and place, unless
prior arrangement has been made with the instructor. Missed exams cannot be
made up.
Honor Code Statement
The GMU Honor Code is in effect at all times. In addition, the CS Department
has further honor code policies regarding programming projects, which are
detailed here. Any deviation from the GMU or the CS Department Honor Code is
considered an Honor Code violation.
Textbooks
Required: Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar
Recommended: Data Mining and Analysis by Mohammed Zaki (an online PDF version is available)
Topics
Ch.1: Introduction
Ch.2: Data
Ch.4: Classification
Ch.5: Classification: Alternative Techniques
Ch.6: Association Analysis: Basic Concepts and Algorithms
Ch.7: Association Analysis: Advanced Concepts
Ch.8: Cluster Analysis: Basic Concepts and Algorithms
Ch.9: Cluster Analysis: Additional Issues and Algorithms
Ch.10: Anomaly Detection