Course Description
Basic principles and methods for data analysis and
knowledge discovery. Emphasizes developing basic skills for modeling
and prediction, on one side, and performance evaluation, on the other.
Topics include system design; data quality, preprocessing, and
association; event classification; clustering; biometrics; business
intelligence; and mining complex types of data.
Instructor: Dr. Jessica Lin
Office: Engineering Building 4419
Phone: 703-993-4693
Email: jessica [AT] cs [DOT] gmu [DOT] edu
Office Hours: Thursday 2-4pm
TA
Tanwistha Saha
Office: TBA
Office Hours: TBA
Email: tsaha [AT] cs [DOT] gmu [DOT] edu
Classes
Monday/Wednesday 1:30-2:45pm, Innovation Hall 206
Course Outcomes
- The ability to apply computing principles, probability, and statistics relevant to the data mining discipline to analyze data.
- A thorough understanding of model programming with data mining tools, and of algorithms for estimation, prediction, and pattern discovery.
- The ability to analyze a problem, identifying and defining the computing requirements appropriate to its solution: data collection and preparation, functional requirements, selection of models and prediction algorithms, software, and performance evaluation.
- The ability to understand performance metrics used in the data mining field to interpret the results of applying an algorithm or model, to compare methods, and to reach conclusions about data.
- The ability to communicate effectively to an audience the steps followed and the results obtained in solving a data mining problem (through a term project).
Prerequisites: Grade of C or better in CS 310 and STAT 344
Grading
Assignments: 15%
Class Participation: 5%
Project: 30%
Midterm: 20%
Final: 30%
Exams
Exams are open-book and open-note. If you cannot attend an exam, you must make arrangements with the instructor in advance. Missed exams cannot be made up.
Honor Code Statement
Please be familiar with the GMU Honor Code. Any deviation from it is considered an Honor Code violation.
Textbooks
Required: Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar
Additional handouts and reading materials may be given in class.
Topics
Ch.1: Introduction
Ch.2: Data
Ch.4: Classification
Ch.5: Classification: Alternative Techniques
Ch.6: Association Analysis: Basic Concepts and Algorithms
Ch.7: Association Analysis: Advanced Concepts
Ch.8: Cluster Analysis: Basic Concepts and Algorithms
Ch.9: Cluster Analysis: Additional Issues and Algorithms
Ch.10: Anomaly Detection
Class Website