MLBio+ Laboratory - Machine Learning in Biomedical Informatics

Contact Me

Office: 4423 Engr Building
Office Hours: T 4:00-5:00 pm
rangwala@cs.gmu.edu
703-993-3826

Bioinformatics & Data Mining

PrePrint: Skewed Rotation Symmetry Group Detection

TPAMI - Tue, 10/06/2009 - 11:01
We present a novel and effective algorithm for affinely skewed rotation symmetry group detection from real-world images. We define a complete skewed rotation symmetry detection problem as discovering five independent properties of a rotation symmetry group: (1) the center of rotation; (2) the affine deformation; (3) the type of the symmetry group; (4) the cardinality of the symmetry group; and (5) the supporting region of the symmetry group in the image. We propose a frieze-expansion (FE) method that transforms rotation symmetry group detection into a simple one-dimensional translation symmetry detection problem. We define and construct a pair of rotational symmetry saliency maps, complemented by a local feature method. Frequency analysis, using the Discrete Fourier Transform (DFT), is applied to the frieze-expansion patterns (FEPs) to uncover the types (cyclic, dihedral and O(2)), the cardinalities and the corresponding supporting regions of multiple rotation symmetry groups in an image, concentric or otherwise. The phase information of the FEP is used to rectify affinely skewed rotation symmetry groups. Our result advances the state of the art in symmetry detection by offering a unique combination of region-based, feature-based and frequency-based approaches. Experimental results on 170 synthetic and natural images demonstrate superior performance of our rotation symmetry detection algorithm over existing methods.
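The frequency-analysis idea can be illustrated with a toy example. The sketch below assumes the frieze expansion has already been collapsed to a single 1-D intensity profile sampled at equal angles around a candidate rotation center. The function name `dominant_frequency`, the synthetic profile, and that simplification are assumptions made for illustration, not the authors' implementation.

```python
import cmath
import math


def dominant_frequency(profile):
    """Return the nonzero DFT frequency index with the largest magnitude."""
    n = len(profile)
    mean = sum(profile) / n  # subtract the mean so the DC term cannot dominate
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            (value - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(profile)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k


# Hypothetical angular profile that repeats 5 times per full turn.
profile = [math.cos(2 * math.pi * 5 * t / 360) for t in range(360)]
print(dominant_frequency(profile))
```

For a pattern that repeats k times per turn, the strongest nonzero frequency of a clean profile is k, which is how a translation-symmetry period in the unrolled pattern hints at the cardinality of the rotation group. Real images are noisier and rich in harmonics, so this direct O(n^2) DFT is only a starting point.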

Categories: Bioinformatics & Data Mining

PrePrint: Object Detection with Discriminatively Trained Part Based Models

TPAMI - Tue, 10/06/2009 - 11:01
We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL datasets. Our system relies on new methods for discriminative training with partially labeled data. We combine a margin-sensitive approach for data-mining hard negative examples with a formalism we call latent SVM. A latent SVM is a reformulation of MI-SVM in terms of latent variables. A latent SVM is semi-convex and the training problem becomes convex once latent information is specified for the positive examples. This leads to an iterative training algorithm that alternates between fixing latent values for positive examples and optimizing the latent SVM objective function.

Categories: Bioinformatics & Data Mining

PrePrint: Large Scale Discovery of Spatially Related Images

TPAMI - Tue, 10/06/2009 - 11:01
We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of pairs of images with spatial overlap, the so-called cluster seeds. The seeds are then used as visual queries to obtain clusters which are formed as transitive closures of sets of partially overlapping images that include the seed. We show that the probability of finding a seed for an image cluster rapidly increases with the size of the cluster. The properties and performance of the algorithm are demonstrated on datasets with 10^4, 10^5, and 5 · 10^6 images. The speed of the method depends on the size of the database and on the number of clusters. The first stage of seed generation is close to linear for database sizes up to approximately 2^34 (10^10) images. On a single 2.4GHz PC, the clustering process took only 24 minutes for a standard database of more than one hundred thousand images, i.e. only 0.014 seconds per image.
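As a rough illustration of the min-Hash idea, the sketch below treats each image as a set of integer visual-word ids and estimates set overlap from the fraction of matching min-hash values. The linear hash family, the constant `PRIME`, and all function names are assumptions chosen for brevity; the paper's actual seed-generation scheme is not reproduced here.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1; word ids are assumed to be smaller than this


def make_hash_functions(count, seed=0):
    """Draw `count` random linear hash functions h(w) = (a*w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(count)]


def minhash_signature(words, hash_functions):
    """One min-hash value per hash function for a non-empty set of word ids."""
    return [min((a * w + b) % PRIME for w in words) for a, b in hash_functions]


def estimated_similarity(sig_a, sig_b):
    """Fraction of agreeing signature entries, an estimate of set overlap."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


hashes = make_hash_functions(200)
image_a = set(range(1, 101))    # hypothetical visual words of image A
image_b = set(range(51, 151))   # image B shares words 51..100 with A
print(estimated_similarity(minhash_signature(image_a, hashes),
                           minhash_signature(image_b, hashes)))
```

Because signatures are short and fixed-length, candidate pairs can be found by comparing signatures rather than full word sets, which is what makes this family of methods attractive for large collections.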

Categories: Bioinformatics & Data Mining

PrePrint: Epitomic Location Recognition

TPAMI - Tue, 10/06/2009 - 11:01
This paper presents a novel method for location recognition, which exploits an epitomic representation to achieve both high efficiency and good generalization. A generative model based on epitomic image analysis captures the appearance and geometric structure of an environment while allowing for variations due to motion, occlusions and non-Lambertian effects. The ability to model translation and scale invariance together with the fusion of diverse visual features yields enhanced generalization with economical training. Experiments on both existing and new labelled image databases result in recognition accuracy superior to state of the art with real-time computational performance.

Categories: Bioinformatics & Data Mining

PrePrint: Class Conditional Nearest Neighbor for Large Margin Instance Selection

TPAMI - Tue, 10/06/2009 - 11:01
This paper presents a relational framework for studying properties of labeled data points related to proximity and labeling information in order to improve the performance of the 1NN rule. Specifically, the class conditional nearest neighbor (ccnn) relation over pairs of points in a labeled training set is introduced. For a given class label c this relation associates to each point a its nearest neighbor computed among only those points with class label c (excluding a). A characterization of ccnn in terms of two graphs is given. These graphs are used for defining a novel scoring function over instances by means of an information-theoretic divergence measure applied to the degree distributions of these graphs. The scoring function is employed to develop an effective large margin instance selection method, which is empirically demonstrated to improve storage and accuracy performance of the 1NN rule on artificial and real-life data sets.
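The ccnn relation itself is simple to state in code. The following sketch follows the definition in the abstract, using Euclidean distance as an assumed metric; the function name `ccnn` and the toy data are illustrative only, and the graph-based scoring function is not covered.

```python
import math


def ccnn(points, labels, c):
    """Map each point index to the index of its nearest neighbor among
    points labeled c, excluding the point itself (None if no candidate)."""
    relation = {}
    for i, p in enumerate(points):
        candidates = [j for j, label in enumerate(labels) if label == c and j != i]
        if not candidates:
            relation[i] = None
            continue
        relation[i] = min(candidates, key=lambda j: math.dist(p, points[j]))
    return relation


# Hypothetical 1-D training set with two classes.
points = [(0.0,), (1.0,), (5.0,), (6.0,)]
labels = [0, 0, 1, 1]
print(ccnn(points, labels, 0))  # nearest class-0 neighbor of every point
print(ccnn(points, labels, 1))  # nearest class-1 neighbor of every point
```

Running the relation once per class label yields, for every point, both a same-class and a different-class nearest neighbor, which is the raw material for the kind of graphs the abstract mentions.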

Categories: Bioinformatics & Data Mining

PrePrint: Visualization of Spatio-Temporal Behavior of Discrete Maps via Generation of Recursive Median Elements

TPAMI - Tue, 10/06/2009 - 11:01
Spatial interpolation is one of the demanding techniques in Geographic Information Science (GISci) to generate interpolated maps in a continuous manner by using two discrete spatial and/or temporal data sets. Noise-free data (thematic layers) depicting a specific theme at varied spatial or temporal resolutions consist of connected components either in aggregated or in disaggregated forms. This short paper provides a simple framework (i) to categorize the connected components of layered sets of two different time instants through their spatial relationships and the Hausdorff distances between the companion connected components, and (ii) to generate sequential maps (interpolations) between the discrete thematic maps. Development of the median set, using Hausdorff erosion and dilation distances to interpolate between temporal frames, is demonstrated on lake geometries mapped at two different times, and also on the bubonic plague epidemic spread data available for eleven consecutive years. We documented the significantly fair quality of the median sets generated for epidemic data between alternate years by visually comparing the interpolated maps with actual maps. They can be used to visualize (animate) the spatio-temporal behavior of a specific theme in a continuous sequence.
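For readers unfamiliar with the Hausdorff distance used to compare companion components, here is a minimal sketch for two finite point sets. It shows the classical symmetric Hausdorff distance only; the erosion and dilation variants and the median-set construction from the paper are not implemented, and the function names and sample coordinates are assumptions.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its closest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical outlines of one component at two time instants.
component_t1 = [(0.0, 0.0), (1.0, 0.0)]
component_t2 = [(0.0, 0.0), (4.0, 0.0)]
print(hausdorff(component_t1, component_t2))
```

The distance is asymmetric in its directed form, which is why both directions are evaluated and the larger one is kept.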

Categories: Bioinformatics & Data Mining

PrePrint: Range Flow in Varying Illumination: Algorithms and Comparisons

TPAMI - Tue, 10/06/2009 - 11:01
We extend estimation of range flow to handle brightness changes in image data caused by inhomogeneous illumination. Standard range flow computes 3D velocity fields using both range and intensity image sequences. Towards this end, range flow estimation combines a depth change model with a brightness constancy model. However, local brightness is generally not preserved when object surfaces rotate relative to the camera or the light sources, or when surfaces move in inhomogeneous illumination. We describe and investigate different approaches to handle such brightness changes. A straightforward approach is to prefilter the intensity data such that brightness changes are suppressed, for instance by a highpass or a homomorphic filter. Such prefiltering may, though, reduce the signal to noise ratio. An alternative novel approach is to replace the brightness constancy model by (1) a gradient constancy model, or (2) by a combination of gradient and brightness constancy constraints used earlier successfully for optical flow, or (3) by a physics-based brightness change model. In performance tests, the standard version and the novel versions of range flow estimation are investigated using prefiltered or non-prefiltered synthetic data with available ground truth. Furthermore, the influences of additive Gaussian noise and simulated shot noise are investigated. We finally compare all range flow estimators on real data.

Categories: Bioinformatics & Data Mining

PrePrint: Accurate, Dense, and Robust Multi-View Stereopsis

TPAMI - Tue, 10/06/2009 - 11:01
This article proposes a novel algorithm for multi-view stereopsis that outputs a dense set of small rectangular patches covering the surfaces visible in the images. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these before using visibility constraints to filter away false matches. The keys to the performance of the proposed algorithm are effective techniques for enforcing local photometric consistency and global visibility constraints. Simple but effective methods are also proposed to turn the resulting patch model into a mesh which can be further refined by an algorithm that enforces both photometric consistency and regularization constraints. The proposed approach automatically detects and discards outliers and obstacles, and does not require any initialization in the form of a visual hull, a bounding box, or valid depth ranges. We have tested our algorithm on various datasets including objects with fine surface details, deep concavities, and thin structures, outdoor scenes observed from a restricted set of viewpoints, and "crowded" scenes where moving obstacles appear in front of a static structure of interest. A quantitative evaluation on the Middlebury benchmark shows that the proposed method outperforms all others submitted so far for four out of the six datasets.

Categories: Bioinformatics & Data Mining

PrePrint: Responsive Knowledge Management for Public Administration: an Event-Driven Approach

Intelligent Systems - Tue, 10/06/2009 - 11:01
The dynamic landscape of public administration raises the issue of responsiveness in supporting knowledge management systems. Responsiveness refers to the ability of a system to respond to changing circumstances, such as alterations in information resources and contexts of work and collaboration, delivering relevant, timely and apposite information correctly and consistently. Responsiveness implies several challenges for knowledge management systems including proactivity, personalization and context-awareness. SAKE represents a responsive knowledge management system for public administration: one which detects changes in information resources and contexts of work and delivers relevant resources proactively and in a personalized manner. We follow an ontology-based approach for modeling information resources, context and preferences, implemented on an event-driven architecture. The main contribution of our approach is a unified, event-based representation and processing of changes in information resources and contexts of work and collaboration that ameliorates the process of defining and detecting relevant changes and increases system responsiveness.

Categories: Bioinformatics & Data Mining

PrePrint: Digital Intuition: Applying Common Sense Using Dimensionality Reduction

Intelligent Systems - Tue, 10/06/2009 - 11:01
Understanding the world we live in requires access to a large amount of background knowledge: the common sense knowledge that most people know and most computer systems don't. Many of the limitations of artificial intelligence today relate to the problem of acquiring and understanding common sense. The Open Mind Common Sense project began to collect common sense from volunteers on the Internet starting in 2000. The collected information is converted to a semantic network called ConceptNet. Reducing the dimensionality of ConceptNet's graph structure gives a matrix representation called AnalogySpace, which reveals large-scale patterns in the data, smooths over noise, and predicts new knowledge. Extending this work, we have created a method that uses singular value decomposition to aid in the integration of systems or representations. This technique, called blending, can be harnessed to find and exploit correlations between different resources, enabling common sense reasoning over a broader domain.

Categories: Bioinformatics & Data Mining

PrePrint: Will intelligent assets take off? Towards self-serving aircrafts

Intelligent Systems - Tue, 10/06/2009 - 11:01
In this article we present the self-serving asset, developed as part of a research project at the Boeing Company and the University of Cambridge. The self-serving asset is self-aware, and has the goal to maximise its life in service by contacting, selecting and procuring service providers autonomously. The result is an open, consistent service chain where complex database transactions are eliminated, and an emergent, yet rather self-capable system starts to materialise. Among various supporting technologies, multi-agent systems provide the backbone for the “intelligence” characteristic required from the self-serving asset. Intelligent asset agents monitor assets, contact suppliers, use multi-criteria decision making to select among proposals, and handle competition. In this paper we aim to outline the self-serving asset concept, describe the multi-agent platform designed to support the asset, and present experimental results on the preliminary agent architecture in terms of decision optimality, scalability and stability.

Categories: Bioinformatics & Data Mining

PrePrint: Flexible Inference with Structured Knowledge through Reasoned Unification

Intelligent Systems - Tue, 10/06/2009 - 11:01
Systems with human-level intelligence must both be flexible and be able to reason in an appropriate time scale. These two goals are in tension, as manifested by the contrasting properties of structured knowledge-based systems (e.g., involving scripts and frames) and general inference algorithms. The problem of resolving ambiguous, implicit and non-literal references exemplifies many of these difficulties. We describe an approach, called reasoned unification, for dealing with these challenges by representing and jointly reasoning over linguistic and non-linguistic knowledge (including structures such as scripts and frames) within the same inference framework. Reasoned unification enables a treatment of several reference resolution phenomena that to our knowledge have not previously been the subject of a unified analysis. This analysis illustrates how reasoned unification can resolve many difficult problems with using complex knowledge structures while maintaining their benefits.

Categories: Bioinformatics & Data Mining

PrePrint: Companion Cognitive Systems: Design Goals and Some Lessons Learned

Intelligent Systems - Tue, 10/06/2009 - 11:01
The Companion cognitive architecture is designed to support experiments in achieving human-level intelligence. This paper describes seven key design goals of Companions, relating them to properties of human reasoning and learning, and to engineering concerns raised by attempting to build large-scale cognitive systems. We summarize our experiences to date with Companions in two kinds of domains, test taking and game playing. We close by summarizing some of the challenges that remain.

Categories: Bioinformatics & Data Mining

PrePrint: Reference Resolution Challenges for an Intelligent Agent: The Need for Knowledge

Intelligent Systems - Tue, 10/06/2009 - 11:01
This paper presents a vision of how language-endowed, next-generation intelligent agents might resolve – i.e., fully interpret – references to objects and events in language input. It describes some of the more difficult reference phenomena that are not being sufficiently treated by practical systems and suggests what kinds of knowledge must be available to intelligent agents to enable them to reach human competence in reference resolution.

Categories: Bioinformatics & Data Mining

PrePrint: Converting a Historical Encyclopedia of Architecture into a Semantic Knowledge Base

Intelligent Systems - Tue, 10/06/2009 - 11:01
The historic Encyclopedia of Architecture, written in German between 1880 and 1943, was one of the largest projects aiming at conserving all architectural knowledge available at that time. Today, its vast amount of content is mostly lost: few complete sets are available, and its complex structure does not lend itself easily to contemporary application. We show how modern semantic technologies can be applied to make these heritage documents accessible again. In particular, we demonstrate how to go beyond classical digitization projects by transforming the historical documents into a semantic knowledge base. Using techniques from natural language processing and the Semantic Web, we show how to automatically populate an ontology that can be used for various application scenarios: Building historians can use it to navigate and query the encyclopedia, while architects can directly integrate it into contemporary construction tools. Additionally, all content is made accessible in a user-friendly Wiki interface that combines original text with NLP-derived metadata and adds annotation capabilities for collaborative use.

Categories: Bioinformatics & Data Mining

IEEE Intelligent Systems - September/October 2009 (Vol. 24, No. 5)

Intelligent Systems - Tue, 10/06/2009 - 11:01
IEEE Intelligent Systems

Categories: Bioinformatics & Data Mining

PrePrint: WLD: A Robust Local Image Descriptor

TPAMI - Tue, 10/06/2009 - 11:01
Inspired by Weber's Law, this paper proposes a simple, yet very powerful and robust local descriptor, called the Weber Local Descriptor (WLD). It is based on the fact that human perception of a pattern depends not only on the change of a stimulus (such as sound, lighting) but also on the original intensity of the stimulus. Specifically, WLD consists of two components: differential excitation and orientation. The differential excitation component is a function of the ratio between two terms: one is the relative intensity differences of a current pixel against its neighbors; the other is the intensity of the current pixel. The orientation component is the gradient orientation of the current pixel. For a given image, we use the two components to construct a concatenated WLD histogram. Experimental results on the Brodatz and KTH-TIPS2-a texture databases show that WLD impressively outperforms the other widely used descriptors (e.g., Gabor and SIFT). In addition, experimental results on human face detection also show a promising performance comparable to the best known results on the MIT+CMU frontal face test set, the AR face dataset and the CMU profile test set.
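The differential excitation component can be sketched for a single 3x3 neighborhood as follows. The abstract only says the component is a function of the ratio of the two terms; squashing that ratio with an arctangent is an assumption made here, as are the function name and the sample patches.

```python
import math


def differential_excitation(patch):
    """Differential excitation of the center pixel of a 3x3 patch.

    Computes arctan(sum(neighbor - center) / center). The arctangent is an
    assumed squashing function; the abstract only specifies the ratio.
    """
    center = patch[1][1]
    # The center's own difference is zero, so summing all nine cells is safe.
    diff_sum = sum(value - center for row in patch for value in row)
    # atan2 equals atan(diff_sum / center) for center > 0 and avoids
    # dividing by zero when the center pixel is black.
    return math.atan2(diff_sum, center)


flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]             # hypothetical uniform patch
bright_ring = [[20, 20, 20], [20, 10, 20], [20, 20, 20]]
print(differential_excitation(flat))
print(differential_excitation(bright_ring))
```

Dividing by the center intensity is what ties the descriptor to Weber's Law: the same absolute change counts for less on a bright pixel than on a dark one.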

Categories: Bioinformatics & Data Mining

PrePrint: Evaluating Color Descriptors for Object and Scene Recognition

TPAMI - Tue, 10/06/2009 - 11:01
Image category recognition is important to access visual information on the level of objects and scene types. So far, intensity-based descriptors have been widely used for feature extraction at salient points. To increase illumination invariance and discriminative power, color descriptors have been proposed. Because many different descriptors exist, a structured overview is required of color invariant descriptors in the context of category recognition. Therefore, this paper studies the invariance properties and the distinctiveness of color descriptors in a structured way. The analytical invariance properties of color descriptors are explored, using a taxonomy based on invariance properties with respect to photometric transformations, and tested experimentally using a dataset with known illumination conditions. In addition, the distinctiveness of color descriptors is assessed experimentally using two category recognition benchmarks. From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition. Results reveal further that, for light intensity changes, the usefulness of invariance is category-specific. Overall, when choosing a single descriptor and no prior knowledge about the categories and the dataset is available, OpponentSIFT is recommended. Furthermore, a combined set of color descriptors outperforms intensity-based SIFT and improves category recognition by 8% on PASCAL VOC2007 and by 10% on the Mediamill Challenge.

Categories: Bioinformatics & Data Mining

PrePrint: Training-free, Generic Object Detection using Locally Adaptive Regression Kernels

TPAMI - Tue, 10/06/2009 - 11:01
We present a generic detection/localization algorithm capable of searching for a visual object of interest without training. The proposed method operates using a single example of an object of interest to find similar matches; does not require prior knowledge (learning) about objects being sought; and does not require any pre-processing step or segmentation of a target image. Our method is based on the computation of local regression kernels as descriptors from a query, which measure the likeness of a pixel to its surroundings. Salient features are extracted from said descriptors and compared against analogous features from the target image. This comparison is done using a matrix generalization of the cosine similarity measure. We illustrate optimality properties of the algorithm using a naive-Bayes framework. The algorithm yields a scalar resemblance map, indicating the likelihood of similarity between the query and all patches in the target image. By employing nonparametric significance tests and non-maxima suppression, we detect the presence and location of objects similar to the given query. The approach is extended to account for large variations in scale and rotation. High performance is demonstrated on several challenging datasets, indicating successful detection of objects in diverse contexts and under different imaging conditions.
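One common way to generalize cosine similarity to matrices is to normalize the Frobenius inner product by the two Frobenius norms. The sketch below assumes that form; the function name and the toy feature matrices are hypothetical, and the paper's feature extraction and significance testing are not shown.

```python
import math


def matrix_cosine_similarity(a, b):
    """Frobenius inner product of two equally shaped matrices, divided by
    the product of their Frobenius norms."""
    dot = sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    norm_a = math.sqrt(sum(x * x for row in a for x in row))
    norm_b = math.sqrt(sum(y * y for row in b for y in row))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("similarity is undefined for an all-zero matrix")
    return dot / (norm_a * norm_b)


query = [[1.0, 2.0], [3.0, 4.0]]    # hypothetical query feature matrix
scaled = [[2.0, 4.0], [6.0, 8.0]]   # same pattern at twice the magnitude
print(matrix_cosine_similarity(query, scaled))
```

Because the measure ignores overall magnitude, a target patch with the same structure but different contrast still scores close to 1.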

Categories: Bioinformatics & Data Mining

PrePrint: Geometric Feature Extraction by a Multi-Marked Point Process

TPAMI - Tue, 10/06/2009 - 11:01
This paper presents a new stochastic marked point process for describing images in terms of a finite library of geometric objects. Image analysis based on conventional marked point processes has already produced convincing results, but at the cost of difficult parameter tuning, long computing times, and overly specific models. Our more general multi-marked point process has a simpler parametric setting, yields notably shorter computing times and can be applied to a variety of applications. Both linear and areal primitives extracted from a library of geometric objects are matched to a given image using a probabilistic Gibbs model, and a Jump-Diffusion process is performed to search for the optimal object configuration. Experiments with remotely sensed images and natural textures show the proposed approach has good potential. We conclude with a discussion about the insertion of more complex object interactions in the model by studying the compromise between model complexity and efficiency.

Categories: Bioinformatics & Data Mining

News Highlights

  • Syed F to join the Lab.
  • Paper Accepted at Journal of Chemical Information & Modeling
  • Huzefa to serve on program committee for SIAM Data Mining Conference 2010 (SDM 2010)
  • Huzefa to serve on program committee for HiCOMB 2010
  • New funding received from NSF IIS for bridging chemical and biological spaces.
  • Two open positions for graduate students (MLBio+ Laboratory)
  • Ammar submits his 1st paper!
  • Salman's paper accepted at WISM-AICI 2009.
  • Huzefa presents 2 posters at ISMB 2009
  • Sheng Li and Anveshi join the lab this Fall

(c) Rangwala 2008, George Mason University, Fairfax, VA