•   When: Monday, June 04, 2018 from 02:00 PM to 04:00 PM
  •   Speakers: Md. Alimoor Reza
  •   Location: ENGR 4201

Scene understanding from images opens the door to solving complex tasks such as visual navigation, object manipulation, and reasoning about the world in 3D. These challenging tasks can be tackled more effectively when depth information is available. This thesis covers techniques that exploit both image data and depth data for scene understanding. First, it introduces an alternative approach to semantic segmentation, a task that entails simultaneously labeling and segmenting every pixel in a scene. The proposed method offers simplicity and modularity by learning to combine multiple binary segmentations within a reinforcement learning framework. Second, a semantic label propagation framework is presented that takes a few annotated keyframes and propagates their labels to a large number of unlabeled video frames. This technique alleviates the need for labor-intensive manual annotation, and the propagated labels have been used to improve a deep neural network model for semantic segmentation. Third, a novel method is developed for instance detection of small hand-held objects (e.g., cups and soda cans on a tabletop) in a realistic environment. The proposed 3D multi-view approach is shown to outperform an alternative 3D single-view approach. Fourth and finally, the thesis concludes with an approach to generating training data for distant objects in outdoor scenes and proposes two deep-learning solutions for recovering the depth of those objects.
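As a rough illustration of the first contribution's setting (not the thesis's actual method), the sketch below combines several per-class binary segmentations into one multi-class labeling. Each hypothetical binary segmenter outputs a per-pixel foreground probability for its class; here a simple per-pixel argmax stands in for the learned reinforcement-learning combination policy described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, num_classes = 4, 4, 3

# Stand-in for the outputs of num_classes binary segmenters:
# one per-pixel foreground-probability map per class.
binary_probs = rng.random((num_classes, H, W))

# Combine: assign each pixel the class whose binary segmenter
# is most confident (a naive proxy for the learned policy).
labels = np.argmax(binary_probs, axis=0)  # shape (H, W), values in {0, 1, 2}

print(labels.shape)
```

This toy combination ignores spatial context and combination order, which is precisely what a learned policy would account for.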
