Computer Science Department Seminars

2003-2004 Academic Year

  Text Mining: Exploring Ideas using Text Collections
                Padmini Srinivasan
        School of Library and Information Science,
        Department of Management Sciences
                The University of Iowa
                Iowa City, IA, 52242
        padmini-srinivasan@uiowa.edu

Hypothesis generation, a crucial initial step for making scientific discoveries, 
relies on prior knowledge, experience and intuition.  Connections made  
serendipitously between seemingly distinct subareas sometimes turn out to 
be fruitful.  The goal in text mining is to assist in this process by 
automatically discovering a small set of interesting hypotheses from a 
suitable text collection.  In this talk we present our research on text 
mining algorithms and highlight some of the challenges that we face.  Our aim is 
to explore functions and capabilities to support text-based knowledge discovery.  We 
seek to design domain-independent methods that may be applied to a variety 
of problem contexts.  Our overall goal is to build a working text mining system 
while also investigating research questions related to such efforts.  We have used our 
system to explore, for example, the global distribution of disease research and 
its correlation with the prevalence of these diseases.  This analysis identified 
interesting trends in disease research.  The application area that will be 
emphasized in this talk is the mining of relationships between concepts 
such as genes and diseases in the bioscience domain, a specialized text 
mining problem that has recently been termed 'conceptual biology'.  We 
will also present our experiments designed around this theme.