CS 482
Computer Vision

Time/Location: Tuesday 4:30-7:10pm, Lecture Hall 3
Instructor: Dr. Jana Kosecka
Office hours: Wednesday 2-3pm or by appointment
TA office hours: tbd
Office: Engineering Building 4444
email: kosecka@gmu.edu
Course communications: Piazza, Canvas

This course will cover the essentials of Computer Vision, a discipline that strives to develop techniques to help computers "see" and understand images. The course is of interest to anyone seeking to process images and acquire a general background in problems related to real-world perception, object and scene recognition, and 3D reconstruction. The geometric aspects of the course will focus on camera modeling, calibration and techniques for extracting 3D metric information from 2D images. This will be followed by techniques for image classification, object detection (e.g. detecting people, cars or other objects of interest) and activity recognition, using both traditional machine learning based approaches and deep learning based methods. Applications to 3D modeling, video analysis, video surveillance, image based retrieval, object detection and recognition, image captioning and vision based control will be discussed.

Prerequisites: linear algebra, calculus, probability and statistics
Lecture Materials: lecture slides and lecture notes provided by the instructor

Recommended Textbooks, Resources

[1] Foundations of Computer Vision: A. Torralba, P. Isola and B. Freeman, 2024 web site
[2] Invitation to 3D Vision: From Images to Geometric Models: Y. Ma, S. Soatto, J. Kosecka and S. Sastry web site
[3] Computer Vision: Algorithms and Applications. R. Szeliski, 2010, Springer online version of the book
[4] Computer Vision: A Modern Approach: D. Forsyth and J. Ponce, Prentice-Hall, 2003
[5] Image Processing, Analysis, and Machine Vision. Sonka, Hlavac, and Boyle. Thomson.
[6] Computer Vision. Ballard and Brown web site

Grading:

Assignments: 40%
Exam: 30%
Final project: 20%
Quizzes, Participation: 10%

Late policy:

Each student has a late submission budget of 2 days, which can be applied to late homework submissions.

Required Software

Python, OpenCV

Course Outcomes

Basic knowledge of image formation process
Basic knowledge of image processing techniques for color and gray level images: edge detection, corner detection, segmentation
Basics of video processing, motion computation and 3D vision and geometry
Basics of image classification, object detection, recognition and video processing
Ability to implement basic vision algorithms in Python/OpenCV (open source computer vision library)
Ability to implement image classification and object detection with convolutional neural networks using the PyTorch library
Ability to apply the appropriate technique to a problem, write a project report and present the results in class.

Grading scale:

A+   98%
A   92%
A-   90%
B+   87%
B   82%
B-   80%
C+   77%
C   72%
C-   67%
D   60%
F   < 60%
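
As a rough illustration of how the grading weights and the scale above might combine, here is a minimal Python sketch. It assumes that each component is scored as a percentage and that the listed percentages are minimum cutoffs for each letter; neither is stated explicitly in this syllabus. The names WEIGHTS, SCALE, final_score and letter_grade are hypothetical, and the example scores are made up.

```python
# Component weights in percent, taken from the Grading section.
WEIGHTS = {"assignments": 40, "exam": 30, "project": 20, "quizzes": 10}

# Grading scale, read here as minimum percentages (an assumption).
SCALE = [(98, "A+"), (92, "A"), (90, "A-"),
         (87, "B+"), (82, "B"), (80, "B-"),
         (77, "C+"), (72, "C"), (67, "C-"),
         (60, "D")]


def final_score(scores):
    """Weighted average of per-component percentages (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percent):
    """Return the first letter whose cutoff the score meets, else F."""
    for cutoff, letter in SCALE:
        if percent >= cutoff:
            return letter
    return "F"


# Made-up scores for illustration only.
example = {"assignments": 90, "exam": 80, "project": 95, "quizzes": 100}
total = final_score(example)  # (3600 + 2400 + 1900 + 1000) / 100 = 89.0
print(total, letter_grade(total))  # 89.0 B+
```

The instructor's actual computation (rounding, curving, handling of late days) may differ from this sketch.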

CS department Honor Code can be found here

Disability Statement

If you have a documented learning disability or other condition that may affect academic performance you should:
  • Make sure this documentation is on file with the Office of Disability Services (SUB I, Rm. 222; www.gmu.edu/student/drc) to determine the accommodations
  • Talk with me to discuss your accommodation needs