Faculty Recruitment Seminar

Thursday, April 03, 2008
11:00 AM to 12:00 noon, Johnson Center, Gold Room

Detecting Functional Modules in Heterogeneous Biological Data

Alexander Schliep

PhD
Department of Computational Molecular Biology
Max Planck Institute for Molecular Genetics

Abstract

Statistical models are widely used in computational molecular biology, primarily to analyze large data sets from high-throughput experiments. For example, gene expression levels can be measured for several different cell types during the development of the lymphoid system. Functional modules, that is, groups of interacting genes, can be found in these data with a clustering approach, because similarity of expression levels indicates co-regulation. This aids the understanding of lymphoid development and is a prerequisite for finer-grained causal modeling.

The biology, the dimensionality, and the quality of the data suggest statistical approaches to clustering that take the inherent dependencies between measurements into account. Several heterogeneous sources of data, often of variable abundance and quality, can be combined in the analysis using semi-supervised learning. We will show several case studies, for example the detection of groups of syn-expressed genes from in-situ images and gene expression time-courses during embryogenesis.
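
For readers unfamiliar with semi-supervised, model-based clustering, the general idea can be sketched in a few lines of Python. This is an illustration only and not the speaker's method: it assumes a mixture of spherical Gaussians with a fixed variance, treats the measurements in a profile as independent (so it ignores the time-course dependencies mentioned above), and uses hypothetical names (`semi_supervised_em`, `profiles`, `labels`) and made-up toy values. Genes with known module membership have their cluster assignment clamped during EM, while the remaining genes are assigned probabilistically.

```python
import math


def semi_supervised_em(profiles, labels, k, sigma2=1.0, iters=50):
    """Cluster expression profiles with a spherical Gaussian mixture.

    profiles: list of equal-length lists of floats, one profile per gene
    labels:   dict {gene index: cluster index} for genes with known membership
    sigma2:   fixed, shared variance (a simplifying assumption of this sketch)
    Returns (means, resp), where resp[i][c] is the posterior of gene i in cluster c.
    """
    n, d = len(profiles), len(profiles[0])

    # Initialise each cluster mean from its labeled genes;
    # fall back to a data point if a cluster has no labeled gene.
    means = []
    for c in range(k):
        members = [profiles[i] for i, lab in labels.items() if lab == c]
        if not members:
            members = [profiles[c]]
        means.append([sum(col) / len(members) for col in zip(*members)])
    weights = [1.0 / k] * k

    resp = []
    for _ in range(iters):
        # E-step: posterior cluster memberships; labeled genes are clamped.
        resp = []
        for i, x in enumerate(profiles):
            if i in labels:
                resp.append([1.0 if c == labels[i] else 0.0 for c in range(k)])
                continue
            logp = [
                math.log(weights[c])
                - sum((a - b) ** 2 for a, b in zip(x, means[c])) / (2.0 * sigma2)
                for c in range(k)
            ]
            top = max(logp)  # subtract the maximum for numerical stability
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])

        # M-step: re-estimate mixing weights and cluster means.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            if mass == 0.0:
                continue  # keep the previous parameters for an empty cluster
            weights[c] = max(mass / n, 1e-12)
            means[c] = [
                sum(r[c] * x[j] for r, x in zip(resp, profiles)) / mass
                for j in range(d)
            ]
    return means, resp


# Toy data (invented for illustration): three rising and three falling profiles,
# with one gene of each kind labeled in advance.
profiles = [
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.0, 2.9],
    [3.0, 2.0, 1.0, 0.0],
    [2.9, 2.1, 1.0, 0.1],
    [0.2, 0.9, 2.1, 3.1],
    [3.1, 1.9, 0.9, -0.1],
]
labels = {0: 0, 2: 1}
means, resp = semi_supervised_em(profiles, labels, k=2)
assignments = [max(range(2), key=lambda c: r[c]) for r in resp]
```

Clamping the labeled genes is only one simple way to bring prior knowledge into a mixture model; softer constraints or additional data sources would need a richer model than this sketch provides.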