CS 484: Data Mining (Syllabus)

Class Information
Class/Sec: CS 484: Data Mining
Instructor: Huzefa Rangwala, Room #4423 Engineering Building, rangwala@cs.gmu.edu
Class Time & Location: Tue/Thu 10:30 - 11:45 am Art and Design Building L008
Text Book: Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006. Book's companion website
Teaching Assistant: TBD
Office Hours: Instructor: Tuesday 1:30-2:30 pm in Engineering 4423.
Communication and Class Link: Piazza
Automated Data Mining Hackathon Host: Miner (Only ON Campus or VPN)

Please note that the syllabus is subject to change to enrich the students' learning experience :). Feel free to email rangwala@cs.gmu.edu with questions or concerns, or even just to say hi.

About the Course
Course Description
Over the past decade there has been an exponential increase in the amount of data. This has led to the development of techniques to discover useful and interesting information in large collections of data. This course aims to provide an overview of the key data mining methods and techniques, such as classification, clustering, and association rule mining. The course will also provide interesting application examples of data mining, especially in the fields of social media analysis, text analysis, and learning analytics.
Course Prerequisites
Programming experience in Python is preferred. Java or C will work as well, but assignments will use the Python framework. Students should be familiar with basic probability and statistics concepts and with linear algebra. Please expect lots of programming in all the assignments and class projects.
Course Format
Lectures will be given by the instructor. Besides material from the textbook, topics not discussed in the book may also be covered. Research papers and handouts of material not covered in the book will be made available. Grading will be based on homework assignments, an exam, and a project. Homework assignments will require intensive programming and will use an automated, competition-style process for developing solutions to data mining challenges. Exams and homework assignments must be done on an individual basis unless stated otherwise. Any deviation from this policy will be considered a violation of the GMU Honor Code.
Course Outcomes
As an outcome of taking this class, a student will be able to
  • Understand the theory behind, and implement, various classification, clustering, and association rule mining algorithms.
  • Apply the data mining techniques learned to real-world scientific and/or industrial applications.

Topics

Introduction
Data and Its Various Forms
Classification: Models, Methods and Applications
Clustering: Methods and Applications
Association Rule Mining
Applications: Biological Data Mining
Applications: Recommender Systems
Applications: Learning Analytics
Applications: Advanced Supervised Learning
Anomalies, Outliers
Assignments/Exams
Deliverable            Deadline                  Grade Weight
HW0                    Sep 7                     0%
HW1                    Sep 21                    10%
HW2                    Oct 5                     10%
HW3                    Oct 19                    10%
HW4                    Nov 9                     10%
Exam 1                 Nov 14 [May Change]       15%
Project Pitch          Oct 12                    0%
Project Proposal       Oct 26                    5%
Project Presentation   Nov 30, Dec 5, Dec 7      10%
Project Report         Dec 19 (No Final)         30%
Extra Credits: Participation, Competition Winners. If you win a competition (HW-1 to HW-4), you are eligible to skip an exam. More details later.
Grade Distribution
Grade Score Range
A >96
A- 92-96
B+ 88-92
B 84-88
B- 80-84
C+ 76-80
C 72-76
C- 68-72
F < 68
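As a rough illustration of how the grade weights and score ranges above combine, here is a minimal Python sketch. It is not an official grade calculator. The function names are hypothetical, and because the listed ranges share their endpoints (92 appears in both A- and B+), the sketch assumes a score exactly on a boundary earns the higher grade, except that an A requires a score strictly above 96. It also ignores extra credit.

```python
# Weights from the Assignments/Exams table (percent of the final grade).
# HW0 and the Project Pitch carry 0% and are omitted.
WEIGHTS = {
    "HW1": 10, "HW2": 10, "HW3": 10, "HW4": 10,
    "Exam 1": 15,
    "Project Proposal": 5,
    "Project Presentation": 10,
    "Project Report": 30,
}

# Lower bound of each letter grade below A.
# Assumption (not stated in the syllabus): boundary scores round up,
# so 92 counts as A- and 88 counts as B+.
CUTOFFS = [(92, "A-"), (88, "B+"), (84, "B"), (80, "B-"),
           (76, "C+"), (72, "C"), (68, "C-")]


def weighted_total(scores):
    """Combine per-deliverable percentages (0-100) into a final score."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


def letter_grade(total):
    """Map a final score to a letter using the Grade Distribution table."""
    if total > 96:
        return "A"
    for lower, letter in CUTOFFS:
        if total >= lower:
            return letter
    return "F"


# Hypothetical student who earns 90% on every graded deliverable.
example_scores = {name: 90 for name in WEIGHTS}
example_total = weighted_total(example_scores)
print(example_total, letter_grade(example_total))
```

Since the weights sum to 100, a student with the same percentage on every deliverable ends up with that percentage as the final score.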
Policies:
Attendance
Attendance is not compulsory but highly recommended for doing well in the class. This class has lots of active learning exercises, and they will be a lot of fun.
Assignment Submission
Please ensure that the assignments are submitted on time. No late submissions are allowed. There will be several assignments, and there may be dependencies among consecutive assignments. The assignments are structured so that you can make multiple attempts at a solution, and there is no single correct solution to these challenging real-world problems. They are designed to simulate real-world data analytics.
Make-Up Exams & Incompletes
Make-up exams and incompletes will not be given for this class.
Academic Honesty and GMU Honor Code
Please read the GMU Honor Code, and do not copy assignment solutions from your peers, the internet, or any other source unless the assignment description states otherwise.
Disability Statement
If you have a documented learning disability or other condition that may affect academic performance, you should: 1) make sure this documentation is on file with the Office of Disability Services (SUB I, Rm. 222; 993-2474; www.gmu.edu/student/drc) to determine the accommodations you need; and 2) talk with me to discuss your accommodation needs.