George Mason University


CS 757 Mining Massive Datasets – Spring 2014

Instructor: Dr. Daniel Barbará


Description: This course covers the principles of MapReduce and other technologies, and their use in scaling data mining algorithms.

Prerequisites: An undergraduate (or graduate) data mining course and a solid background in Java and Python programming. To work on the programming projects, students must be comfortable with the Java programming language.

Meeting Times and Locations:



Office Hours: By appointment (Office: Engineering Building, Room 4420)


Course Web Page:

Course Outcomes: At the end of this course, you will





No early exams will be given and make-up exams are strongly discouraged.

Assignments will be collected on the date indicated in class. Late submissions will be penalized 15% per day and will not be accepted more than 3 days after the due date.
The GMU Honor Code will be enforced. Students are expected to work individually on the assignments and projects unless told otherwise. We reserve the right to use MOSS to detect plagiarism. Violations of the GMU Honor Code or a total score of 49 or less will result in an F.

No smartphones, laptops, or recorders are allowed in class. Lectures cannot be recorded without special permission from the instructor.

Computer Accounts: All students should have accounts on the central Mason Unix system and on the IT&E Unix cluster. Students can work in the IT&E computer labs on programming projects during the specified hours.

Students with Disabilities: If you have a documented learning disability or other condition that may affect academic performance, you should: 1) make sure this documentation is on file with the Office of Disability Services (SUB I, Rm. 222; 993-2474) to determine the accommodations you need; and 2) talk with me to discuss your accommodation needs.
