Homework 2, due in 2 weeks.

The goal of this homework is to familiarize yourself with sliding-window detection schemes using a single HOG descriptor. Your task is to train a single object detector for all objects of aspect ratio 0.7, using the HOG descriptor and SVM implementations available in the vlfeat library.

The data and some scripts for this exercise can be found in the course web page subdirectory cs884/object_detector/. The starting point of your work should be the script compute_descriptors_v2.m, which at this point only contains visualization of the bounding boxes. At the moment, only the bounding boxes of objects whose aspect ratio is close to 0.7 are visualized. There are 239 examples of such bounding boxes (variable 'groups'), with 78 different names/IDs. One big mat file contains the whole kitchen dataset; the script loads it at startup. The subdirectory sceneGT contains bounding boxes of various objects, and the groups.mat file groups all the objects based on their aspect ratio and also has information about their names and labels. The individual frames can be found in the kitchen directory as a tar file. The current script does not use the tar file, but it is provided for your convenience (tar_kitchen_mat.tar.gz).

There are a few caveats to this project. Some of the bounding boxes are too small to be considered, due to the resolution of the images. The bounding boxes with the same aspect ratio contain many different types of objects, and their HOG descriptors will likely look very different. You may need to split the data into groups whose HOG descriptors resemble each other. If the number of bounding boxes after splitting is too small, you can generate additional bounding boxes. You will need to experiment with various methods of generating negative bounding boxes, and possibly increase the number of positive bounding boxes by sampling around the true bounding boxes.
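One possible way to generate the extra boxes is sketched below. It is only an illustration, written in plain Python rather than in the MATLAB of the provided script, and every name and threshold in it is an assumption: boxes are taken to be (x, y, w, h) tuples, the aspect ratio 0.7 is taken to mean width/height, and the overlap thresholds, shift range, and candidate heights are arbitrary starting values to experiment with. Positives are produced by small random shifts of a true box that keep a high overlap with it; negatives are random windows of the target aspect ratio that overlap no true box by more than a small amount.

```python
import random

def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)

def jitter_positives(box, n, max_shift=0.1, min_iou=0.7, rng=random):
    """Sample up to n shifted copies of a true box that still overlap it well.

    max_shift is a fraction of the box width/height (assumed value).
    """
    x, y, w, h = box
    out = []
    for _ in range(20 * n):  # bounded number of attempts
        if len(out) == n:
            break
        dx = rng.uniform(-max_shift, max_shift) * w
        dy = rng.uniform(-max_shift, max_shift) * h
        cand = (x + dx, y + dy, w, h)
        if iou(cand, box) >= min_iou:
            out.append(cand)
    return out

def sample_negatives(gt_boxes, img_w, img_h, n, aspect=0.7,
                     heights=(64, 96, 128), max_iou=0.3, rng=random):
    """Sample up to n random windows that barely overlap any true box.

    aspect is assumed to be width/height; heights are assumed window sizes.
    """
    out = []
    for _ in range(50 * n):  # bounded number of attempts
        if len(out) == n:
            break
        h = rng.choice(heights)
        w = aspect * h
        if w > img_w or h > img_h:
            continue
        x = rng.uniform(0, img_w - w)
        y = rng.uniform(0, img_h - h)
        cand = (x, y, w, h)
        if all(iou(cand, g) <= max_iou for g in gt_boxes):
            out.append(cand)
    return out
```

Since some objects in the images are not annotated, a window sampled this way may still contain an object, so it is worth inspecting a few of the sampled negatives visually before training.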
When evaluating the performance of the classifier, note that some of the objects appearing in the images are not marked with bounding boxes. This is due to errors in the annotation.

Hand in the code and the precision/recall curve of the best results you can achieve, along with the configuration of the parameters. For the classification I would suggest using a linear SVM or a linear SVM with a histogram intersection kernel, but you can also try other classifiers. I will provide additional information about testing shortly.
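For the precision/recall curve, a minimal sketch of the usual computation is given below, again in plain Python and under stated assumptions: each detection has already been matched against the annotated boxes and flagged as a true or false positive (the matching rule is not specified here and is up to you), and num_gt is the number of annotated objects, assumed to be greater than zero. The function name and arguments are hypothetical.

```python
def precision_recall(scores, is_true_positive, num_gt):
    """Precision and recall after each detection, in descending score order.

    scores: detector score of each detection.
    is_true_positive: flag per detection, True if it matched an annotated box.
    num_gt: total number of annotated objects (assumed > 0).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    fp = 0
    precision = []
    recall = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precision.append(tp / (tp + fp))
        recall.append(tp / num_gt)
    return precision, recall
```

Plotting precision against recall gives the curve. Keep in mind that a correct detection of an unannotated object is counted as a false positive under this scheme, so the measured precision can be lower than the true one.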