MLBio+Laboratory Machine Learning in Biomedical Informatics



INFS 755: Data Mining

Class Information
Instructor: Huzefa Rangwala, Room #4423 EB, rangwala@cs.gmu.edu
Class Time & Location: Arts Building 2003, M 4:30pm - 7:10pm
Text Book: Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006. Book's companion website
Teaching Assistant: Tanwishta Saha (tsaha@gmu.edu)
Office Hours: Instructor: M 2:00-4:00 pm, TA: W 1:00-3:00 pm, Room #4456

Please note that the syllabus is subject to change to enrich the student's learning experience :). Feel free to email rangwala@cs.gmu.edu with questions or concerns, or even just to say hi.

If you have taken CS 750, then you will not receive credit for INFS 755.

About the Course
Course Description
Over the past decade there has been an exponential increase in the amount of data. This has led to the development of techniques to discover useful and interesting information from large collections of data. This course aims to provide an overview of the key data mining methods and techniques, such as classification, clustering, and association rule mining. The course will also provide interesting application examples of data mining, especially in the fields of bioinformatics and spatial data mining.
Course Prerequisites
Some programming experience is expected. Students should be familiar with basic probability and statistics concepts, and with linear algebra. Please expect some programming in the assignments and class projects. If you are not sure about the prerequisites, send me an email.
Course Format
Lectures will be given by the instructor. Besides material from the textbook, topics not discussed in the book may also be covered. Research papers and handouts of material not covered in the book will be made available. Grading will be based on homework assignments, exams, and a project. Homework assignments will require some programming.
Course Outcomes
As an outcome of taking this class, a student will be able to
  • Understand the various classification, clustering, and association rule mining algorithms.
  • Apply the data mining techniques learned to real-world applications.
  • Read research papers pertaining to data mining and cloud computing.

Schedule

08.29.2011 Welcome, Introduction to Data Mining (Chapter 1)
09.12.2011 Data (Chapter 2) [HW1 out]
09.19.2011 Classification (Chapters 4 & 5)
09.26.2011 Classification (Contd.) [HW1 due]/[HW2 out]
10.03.2011 Clustering (Chapters 7 & 8)
10.10.2011 Clustering (Contd.) [HW2 due]
10.17.2011 Association Rule Mining (Chapter 6)
10.24.2011 Association Rule Mining (Contd.) [HW3 out]
10.31.2011 Jigsaw Activity: Clustering [Proposal Due]
11.07.2011 Outlier Detection (Chapter 10) [HW3 due]
11.14.2011 Exam [30%]
11.21.2011 Web Mining Example
11.28.2011 Project Presentations.
12.05.2011 Project Presentations.
12.12.2011 Project Reports Due / No Final.
Assignments/Exams
Deadline    Type                          % Weight
09.26.2011  HW 1                          10
10.10.2011  HW 2                          10
11.07.2011  HW 3                          10
11.14.2011  Exam                          30
-           Class Participation           5
-           Project                       35
10.31.2011    Project Proposal (2 pages)  5
-             Project Presentations       10
12.12.2011    Final Report                20
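As a rough illustration of how the weights above combine into an overall course percentage, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale, which the syllabus does not state; the function name `weighted_total` and the example scores are hypothetical and not part of the course materials.

```python
# Weights (percent of the final grade) copied from the table above.
# The three project sub-items together make up the 35% project share.
WEIGHTS = {
    "HW 1": 10,
    "HW 2": 10,
    "HW 3": 10,
    "Exam": 30,
    "Class Participation": 5,
    "Project Proposal": 5,
    "Project Presentations": 10,
    "Final Report": 20,
}


def weighted_total(scores):
    """Return the overall percentage given per-component scores.

    Assumption: every score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


# Hypothetical scores, for illustration only.
example_scores = {
    "HW 1": 90,
    "HW 2": 80,
    "HW 3": 100,
    "Exam": 85,
    "Class Participation": 100,
    "Project Proposal": 90,
    "Project Presentations": 95,
    "Final Report": 88,
}

print(f"Overall: {weighted_total(example_scores):.1f}%")
```

The sketch stops at the overall percentage and does not map it to a letter grade, since that mapping is set by the instructor's grade distribution.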
Grade Distribution
Grade Score Range
A+ >98
A 94-98
A- 90-94
B+ 86-90
B 82-86
B- 78-82
C+ 74-78
C 70-74
C- 66-70
F < 66
Policies:
Attendance
Attendance is not compulsory but highly recommended for doing well in the class. This class has lots of active learning exercises, and they will be a lot of fun.
Assignment Submission
Please ensure that assignments are submitted on time. Late submissions will not be accepted.
Make-Up Exams & Incompletes
Make-up exams and incompletes will not be given for this class.
Academic Honesty and GMU Honor Code
Please visit the University's Academic Honesty Page and GMU Honor Code.
Disability Statement
If you have a documented learning disability or other condition that may affect academic performance, you should: 1) make sure this documentation is on file with the Office of Disability Services (SUB I, Rm. 222; 993-2474; www.gmu.edu/student/drc) to determine the accommodations you need; and 2) talk with me to discuss your accommodation needs.

