This course covers the essentials of Computer Vision, a discipline that strives to develop techniques to help computers "see" and understand images. The course is of interest to anyone seeking to process images and acquire a general background in problems related to real-world perception, object and scene recognition, and 3D reconstruction. The geometric part of the course will focus on camera modeling, calibration, and techniques for extracting 3D metric information from 2D images. This will be followed by techniques for image classification, object detection (e.g. detecting people, cars or other objects of interest) and activity recognition, using both traditional machine learning approaches and deep learning methods. Applications to 3D modeling, video analysis, video surveillance, image-based retrieval, object detection and recognition, image captioning and vision-based control will be discussed.
Prerequisites: linear algebra, calculus, probability and statistics
Lecture Materials: lecture slides and lecture notes provided by the instructor
Recommended Textbooks, Resources
[1] Foundations of Computer Vision. A. Torralba, P. Isola and B. Freeman, 2024 (web site)
[2] Invitation to 3D Vision: From Images to Geometric Models. Y. Ma, S. Soatto, J. Kosecka and S. Sastry (web site)
[3] Computer Vision: Algorithms and Applications. R. Szeliski, Springer, 2010 (online version of the book)
[4] Computer Vision: A Modern Approach. D. Forsyth and J. Ponce, Prentice-Hall, 2003
[5] Image Processing, Analysis, and Machine Vision. Sonka, Hlavac, and Boyle, Thomson
[6] Computer Vision. Ballard and Brown (web site)
Grading:
Late policy:
Required Software
Python, OpenCV
Course Outcomes
Basic knowledge of image formation process
Basic knowledge of image processing techniques for color and gray level images: edge detection, corner detection, segmentation
Basics of video processing, motion computation and 3D vision and geometry
Basics of image classification, object detection and recognition, and video processing
Ability to implement basic vision algorithms in Python/OpenCV (open source computer vision library)
Ability to implement image classification and object detection with convolutional neural networks using the PyTorch library
Ability to apply the appropriate technique to a problem, write a project report and present the results in class.
Grading scale:
A+   | 98% |
A   | 92% |
A-   | 90% |
B+   | 87% |
B   | 82% |
B-   | 80% |
C+   | 77% |
C   | 72% |
C-   | 67% |
D   | 60% |
F   | < 60% |
The CS department Honor Code can be found here.
Disability Statement