PhD Dissertation Defense Abstract: Wei Zhang

Sampling Based Methods for Robust Motion Estimation and Image Based Localization

With recent advances in imaging technology, cameras are becoming less costly and more widely available. Many mobile devices, such as PDAs and cell phones, are equipped with cameras, making it feasible to integrate computer vision techniques into people's everyday lives. In this thesis, I address the localization problem in urban environments. In general, there are two solutions to this problem. The Global Positioning System (GPS) can be used to obtain absolute position information. Alternatively, local coordinates with respect to chosen locations or landmarks can be obtained using image based localization. Localization based on images can provide location information complementary to GPS. If the GPS position, which typically has an error of around 2 meters, is augmented with the position obtained by image based localization, better accuracy can be expected. In addition, image based localization can provide orientation information. Moreover, GPS positioning is sometimes unavailable in big cities because GPS signals can be blocked by buildings; image based localization is even more important in this case.

To obtain the relative coordinates, the camera pose (rotation and translation) with respect to the chosen reference locations or landmarks needs to be recovered. In other words, we need to solve the problem of motion estimation between the current view and the reference view. Motion estimation between widely separated views is therefore essential for image based localization. Motion estimation usually requires correspondences between views, which are obtained by wide baseline matching. The wide baseline matching problem is typically addressed by matching local descriptors associated with local features. Even though many techniques have been proposed for finding reliable features across views, matching features across widely separated views is still a difficult problem, and many false correspondences are usually introduced. From the estimation perspective, these erroneous matches are outliers with respect to the true motion model. Therefore, robust estimation techniques that can tolerate a large proportion of outliers are indispensable for our task.

The first part of my talk will be devoted to robust estimation techniques. The approach is motivated by popular sampling based techniques such as the RANSAC algorithm, in that a number of hypotheses are generated by sampling. Instead of evaluating each hypothesis and trying to find the best one, we propose to investigate the distribution of the errors of all the hypotheses with respect to each data point. By studying the statistics of these distributions, we can classify data points as inliers or outliers directly. This brings several important benefits. First, we avoid the need for a predefined inlier threshold. Second, we do not need outlier-free samples, so the number of required samples is considerably smaller than for the RANSAC algorithm. In addition, we do not need to know the outlier ratio a priori. I will then present a new framework for multiple model estimation based on the analysis of residual distributions. First, a number of hypotheses are generated, and we compute the error distribution of those hypotheses with respect to each point. We show that each mode in those distributions corresponds to a model, so the number of modes in a distribution gives an estimate of the number of models in the data. By surveying the estimates from all the data points, we can obtain an accurate estimate of the number of models. The parameters of each model can then be determined. Finally, I will describe an image based localization system built upon the proposed robust technique. It demonstrated the best performance in the ICCV'05 Computer Vision Contest.
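To make the idea of per-point residual distributions concrete, the following is a minimal sketch on a toy 2D line-fitting problem rather than on camera motion. It is not the method of the thesis: the abstract does not say which statistics of the distributions are used, so the statistic here (a low quantile of a point's residuals divided by their median) and the two-group split are stand-ins chosen only for illustration. All function names, the synthetic data, and the parameter values are assumptions.

```python
import numpy as np

def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    d = q - p
    normal = np.array([-d[1], d[0]])
    normal = normal / np.linalg.norm(normal)
    return normal[0], normal[1], -float(normal @ p)

def residual_matrix(points, n_hyp, rng):
    """Residual of every point with respect to every sampled hypothesis.

    Each hypothesis is the line through two randomly chosen points; the
    samples are NOT required to be outlier free.
    """
    n = len(points)
    res = np.empty((n, n_hyp))
    for j in range(n_hyp):
        i1, i2 = rng.choice(n, size=2, replace=False)
        a, b, c = line_through(points[i1], points[i2])
        res[:, j] = np.abs(a * points[:, 0] + b * points[:, 1] + c)
    return res

def peak_statistic(res):
    """Stand-in statistic (an assumption, not taken from the thesis).

    A point on the dominant model sees many hypotheses with tiny
    residuals, so its low quantile is small relative to its median.
    """
    low = np.quantile(res, 0.2, axis=1)
    med = np.median(res, axis=1)
    return low / med

def split_two_groups(values, n_iter=20):
    """1-D two-means split; True marks the lower-valued group."""
    lo, hi = values.min(), values.max()
    for _ in range(n_iter):
        is_low = np.abs(values - lo) <= np.abs(values - hi)
        lo, hi = values[is_low].mean(), values[~is_low].mean()
    return is_low

# Synthetic data (assumed): 60 points near y = 2x + 1, 40 uniform outliers.
rng = np.random.default_rng(0)
x_in = rng.uniform(0, 10, 60)
inliers = np.column_stack([x_in, 2 * x_in + 1 + rng.normal(0, 0.05, 60)])
outliers = np.column_stack([rng.uniform(0, 10, 40), rng.uniform(-10, 30, 40)])
points = np.vstack([inliers, outliers])

res = residual_matrix(points, n_hyp=500, rng=rng)  # shape (100, 500)
stat = peak_statistic(res)                         # one value per point
is_inlier = split_two_groups(np.log(stat))         # no residual threshold given

truth = np.arange(len(points)) < 60
print("agreement with ground truth:", np.mean(is_inlier == truth))
```

The sketch never sets a residual threshold and never needs to know the outlier ratio; the decision is made from the shape of each point's residual distribution alone. The number of hypotheses (500) and the quantile level (0.2) are still choices made for this toy example.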