Querying Similar (Tropical Cyclone) Events via Metric Learning on Multivariate Spatial-Temporal Data Sequences

GRAND Seminar: 12:00 noon, Tuesday, Nov. 30, 2010, ENGR 4201

Shen-Shyang Ho
Assistant research scientist
Center for Automation Research (CfAR)
Institute for Advanced Computer Studies (UMIACS)
University of Maryland

Abstract:

In this talk, I will first give an overview of my projects that apply advances in computer science research to develop technology in support of hurricane research. In particular, I will briefly discuss two projects: (1) hurricane tracking using heterogeneous satellite data sources, and (2) moving-objects database technology to support ad hoc spatio-temporal queries and hurricane data analysis.

Then, I will describe our solution for ad hoc similarity queries over tropical cyclone events, driven by user-defined instance-level constraints, where each event is represented by a multivariate trajectory data sequence of arbitrary length. A critical component of any solution to this problem is the similarity (or metric) function used to compare the data sequences. Our solution is a novel Longest Common Subsequence (LCSS) parameter learning approach driven by nonlinear dimensionality reduction and distance metric learning. Intuitively, arbitrary-length multivariate data sequences are projected onto a fixed-dimensional manifold for LCSS parameter learning. Similarity search is achieved through consensus among the (similar) instance-level constraints, based on ranking orders computed using the LCSS-based similarity measure.
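
As a rough illustration of the kind of LCSS-based similarity measure mentioned above, here is a minimal Python sketch of the widely used LCSS formulation for multivariate sequences. It is not the speaker's implementation: the function names, the single matching threshold `epsilon` shared by all variables, and the index window `delta` are assumptions made for this example, and both parameters are fixed by hand here rather than learned as in the approach described in the talk.

```python
from typing import Sequence

Point = Sequence[float]  # one multivariate observation, e.g. (lat, lon)


def lcss_length(a: Sequence[Point], b: Sequence[Point],
                epsilon: float, delta: int) -> int:
    """Length of the longest common subsequence of a and b.

    Two observations match when every variable differs by at most
    `epsilon` and their positions differ by at most `delta` steps.
    Both parameters are hand-picked assumptions in this sketch.
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close_in_time = abs(i - j) <= delta
            close_in_value = all(
                abs(x - y) <= epsilon for x, y in zip(a[i - 1], b[j - 1])
            )
            if close_in_time and close_in_value:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]


def lcss_similarity(a: Sequence[Point], b: Sequence[Point],
                    epsilon: float, delta: int) -> float:
    """LCSS length normalized by the shorter sequence, in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, epsilon, delta) / min(len(a), len(b))


# Two made-up toy tracks of different lengths (not real cyclone data).
track_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
track_b = [(0.1, 0.0), (1.1, 0.9), (5.0, 5.0)]
similarity = lcss_similarity(track_a, track_b, epsilon=0.2, delta=1)
```

Because the measure normalizes by the shorter sequence, it can compare sequences of different lengths, which is what makes LCSS a natural fit for arbitrary-length trajectories. The result depends heavily on `epsilon` and `delta`, which is presumably why learning these parameters is the focus of the talk.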

Experimental results on a combination of synthetic and real tropical cyclone event data sequences are presented to demonstrate the feasibility of our parameter learning approach and its robustness to variability in the instance constraints. I will use a similarity query example on real tropical cyclone events from 2000 to 2008 to discuss (i) a problem of scientific interest, and (ii) challenges and issues related to the weather event similarity search and query problem.

Bio:

Dr. Shen-Shyang Ho received his PhD in Computer Science from George Mason University in 2007 and his Bachelor of Science (Honors) in Mathematics and Computational Science from the National University of Singapore in 1999. From 2007 to 2010, he was a NASA Postdoctoral Fellow and a Caltech Postdoctoral Scholar working at the Jet Propulsion Laboratory (JPL) at the California Institute of Technology. His research interests include artificial intelligence, machine learning, pattern recognition, and data mining, both for streaming data and on mobile devices. Currently, he is a researcher in the Center for Automation Research (CfAR) of the Institute for Advanced Computer Studies (UMIACS) at the University of Maryland. His current research is a collaboration with JPL and the University of Florida, Gainesville, and is funded by NASA.