- When: Monday, March 07, 2022 from 02:00 PM to 03:00 PM
- Speakers: Alex Wong
- Location: ZOOM only
Abstract:
Embodied cognition is the ability to adapt one’s understanding of the physical world through interactions with the surrounding space. Amongst the many aspects of intelligence encompassed by embodied cognition, we focus on depth perception to support agents performing spatial tasks. While deep learning has seen a number of empirical successes, many of the existing works are not suitable for realizing embodied cognition due to (i) their computational requirements i.e. model sizes up to trillions of parameters, (ii) their need for expensive human annotations as supervision, and (iii) their sensitivity to small perturbations in their inputs. To address these areas, we begin by proposing a method that enables an agent to learn to infer the structure of the 3-dimensional scene from multi-sensory data -- online and without human supervision. By leveraging priors about our physical world during the learning process or as an inductive bias, we show that it is not only possible to reduce the model size, but also gain performance to yield real-time depth perception systems with state-of-the-art accuracies. These priors can also improve the robustness of such systems against common perturbations of their input as well as adversarial perturbations that are designed to disrupt their normal operations. The culmination of our work is being realized in an interdepartmental collaboration at UCLA to build the first fully autonomous cataract surgery robot. My future research plans will take another step towards embodied cognition, by building on top of my progress in depth perception, to enable an agent to learn the semantics of objects populating the scene without human intervention.
Bio:
Alex Wong is a postdoctoral scholar at the University of California, Los Angeles (UCLA) under the guidance of Stefano Soatto. He received his Ph.D. from UCLA, where he was co-advised by Stefano Soatto and Alan Yuille. His research lies at the intersection of machine learning, computer vision, and robotics. His work has received the outstanding student paper award at the Conference on Neural Information Processing Systems (NeurIPS) 2011 and the best paper award in robot vision at the International Conference on Robotics and Automation (ICRA) 2019.