Re-Projection of Terabyte-sized 3D Images for Interactive Visualization

GRAND Seminar: Tuesday, Feb. 18, 2014, 12 PM, ENGR 4207

Peter Bajcsy
Information Technology Laboratory
NIST

Abstract:

This talk will present the computational challenges and approaches to knowledge discovery from terabyte-sized images. The motivation comes from experimental systems for imaging and analyzing human pluripotent stem cell cultures and material science specimens at a spatial and temporal coverage that leads to terabyte-sized image data. The objective of such an unprecedented cell study and material imaging is to characterize specimens at high statistical significance in order to guide the repeatable growth of high-quality stem cell colonies and to understand metallurgical processes. To pursue this objective, multiple computer and computational science problems have to be overcome, including image correction (flat-field, dark current, and background), stitching, segmentation, tracking, re-projection, feature extraction, and the representation of large images for interactive visualization and sampling in a web browser.

In this presentation, we will focus on the problem of re-projecting terabyte-sized 3D images for interactive visualization from multiple orthogonal viewpoints. Current solutions are limited to gigabyte-sized images, rely on specialized hardware to achieve interactivity, and lack the ability to share data for collaborative research. We overcome these limitations by pre-computing re-projected views of terabyte-sized images and by using the Deep Zoom framework to access multiple orthogonal views. Our approach is based on researching extensions to Amdahl’s law for Map-Reduce computations, establishing benchmarks for image processing on a Hadoop platform, and introducing a node utilization coefficient for re-projection computations running on a computer cluster/cloud. The theoretical models of algorithmic complexity and cluster utilization at terabyte scale are applied to selecting an optimal computer cluster configuration. Additional interactive measurement capabilities are added as plugins to the open-source OpenSeadragon project with Deep Zoom support. The presentation will conclude with illustrations of enabled scientific discoveries, as well as several collaboration opportunities to create reference resources for scientific discovery from terabyte-sized images.
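
As an illustration of the re-projection step described above, the following minimal sketch (not the speaker's implementation) shows how maximum-intensity projections along the three orthogonal axes might be pre-computed slab by slab, so that the full terabyte-sized volume never has to reside in memory at once. The slab reader read_slab and the slab depth are hypothetical placeholders for whatever chunked storage the pipeline uses.

import numpy as np

def reproject_orthogonal(read_slab, z_extent, slab_depth=64):
    """Accumulate XY, XZ, and YZ maximum-intensity projections over Z-slabs."""
    xy_parts, xz_parts, yz_parts = [], [], []
    for z0 in range(0, z_extent, slab_depth):
        # read_slab returns a NumPy array of shape (dz, ny, nx) for slices [z0, z1).
        slab = read_slab(z0, min(z0 + slab_depth, z_extent))
        xy_parts.append(slab.max(axis=0))   # collapse Z -> one (ny, nx) view per slab
        xz_parts.append(slab.max(axis=1))   # collapse Y -> (dz, nx) strip
        yz_parts.append(slab.max(axis=2))   # collapse X -> (dz, ny) strip
    # XY partial views are merged element-wise; XZ/YZ strips are stacked back along Z.
    xy = np.maximum.reduce(xy_parts)
    xz = np.concatenate(xz_parts, axis=0)
    yz = np.concatenate(yz_parts, axis=0)
    return xy, xz, yz

Each pre-computed orthogonal view could then be tiled into a Deep Zoom image pyramid and browsed interactively through OpenSeadragon, which is the sharing model the abstract describes.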

Short Bio:

Peter Bajcsy received his Ph.D. in Electrical and Computer Engineering in 1997 from the University of Illinois at Urbana-Champaign (UIUC) and an M.S. in Electrical and Computer Engineering in 1994 from the University of Pennsylvania (UPENN). He worked in machine vision, government contracting, and research and educational institutions before joining the National Institute of Standards and Technology (NIST) in 2011. At NIST, he has been leading a project focused on the application of computational science in biological metrology, specifically stem cell characterization at very large scales. Peter’s area of research is large-scale image-based analyses and syntheses using mathematical, statistical, and computational models while leveraging computer science fields such as image processing, machine learning, computer vision, and pattern recognition. He has co-authored more than 24 journal papers, eight book chapters, and close to 100 conference papers.