MLBio+ Laboratory: Machine Learning in Biomedical Informatics



Welcome to the MLBio+ Laboratory

The Machine Learning in Biomedical Informatics (MLBio+) Laboratory performs interdisciplinary research aimed at developing machine learning and data mining methods for problems in biology, medicine, and social media analysis.
We emphasize developing novel algorithms and engineering effective solutions to discover knowledge from emerging data. Our research has resulted in useful and efficient software tools that help biologists make key discoveries and that have also advanced the field of computer science. Please refer to the publications page.

Our current projects relate to structural bioinformatics, next-generation sequencing (genomics and metagenomics), cyber security, and social media analysis.


Protein Interaction and Sequence Analysis.
  • Represent proteins using a combination of graphical and discriminative models.
  • Develop and improve protein structure prediction methods to participate in CASP, a protein structure prediction competition.
  • Develop fast and accurate biological string kernels for protein sequence classification.
  • Develop software packages like svmPRAT, TOPTMH and MONSTER for protein residue annotation.
  • Develop multi-task and transfer learning approaches for analyzing protein-ligand interactions.
  • Develop co-clustering type approaches for analyzing protein-ligand interactions.
Funding: NSF
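
As an illustration of the string kernels mentioned above, here is a minimal sketch of a k-mer spectrum kernel, one classic kernel for sequence classification. It is not the lab's own implementation; the function names `kmer_counts` and `spectrum_kernel` and the default `k=3` are assumptions chosen for this example.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of sequences a and b.

    Hypothetical helper for illustration; k=3 is an arbitrary default.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Iterate over the smaller table; k-mers absent from cb contribute 0.
    if len(ca) > len(cb):
        ca, cb = cb, ca
    return sum(n * cb[kmer] for kmer, n in ca.items())
```

The resulting value can be used as a precomputed kernel entry in a kernel-based classifier such as a support vector machine; sequences shorter than `k` simply yield a kernel value of 0.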
Next-generation Sequencing and Metagenome Analysis.
  • Develop LSH-based methods for clustering metagenome sequences.
  • Develop species diversity and richness estimators from metagenome samples.
  • Develop hierarchical classification approaches for annotating metagenomes.
  • Develop biologist-friendly computational pipelines using Drupal and Galaxy for analyzing human microbiome data (Microbiome Analysis Center).
  • Develop sequence assembly and mapping approaches for next-generation sequencing technologies.
Funding: NIH, NVIDIA
Collaborator: Patrick Gillevet
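
To make the idea of a richness estimator concrete, the sketch below computes the bias-corrected Chao1 estimate, a standard estimator from the ecology literature, from per-taxon read counts. It is shown only as an example of the estimator family and is not necessarily one of the estimators developed in the lab; the function name `chao1` and the input format (a list of counts, one per taxon) are assumptions.

```python
def chao1(abundances):
    """Bias-corrected Chao1 richness estimate from per-taxon counts.

    S_chao1 = S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are
    the numbers of taxa observed exactly once and exactly twice.
    """
    counts = [n for n in abundances if n > 0]   # ignore unobserved taxa
    s_obs = len(counts)                         # observed richness
    f1 = sum(1 for n in counts if n == 1)       # singletons
    f2 = sum(1 for n in counts if n == 2)       # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

For a sample with counts `[10, 5, 1, 1, 2]`, five taxa are observed, two of them once and one twice, so the estimate is 5 + 2·1 / (2·2) = 5.5.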
Social Media and Network Analysis.
  • Mine comment information across social media sites (e.g., Digg) to create implicit networks.
  • Develop community detection, network classification and dynamic network analysis algorithms.
  • Develop sentiment and buzz detection approaches for monitoring social media forums.
  • Develop adaptive social systems to improve network response times and user experience.
Cyber Security.
  • Represent malware application behaviors as a "cyber"-genome and extract patterns and signatures for classification and clustering.
  • Use hash-based data structures to store, retrieve and analyze long cyber-genomes efficiently and accurately.
  • Develop implementations of the analysis methods on distributed computing platforms.
  • Apply bioinformatics approaches to malware datasets for provenance analysis.
Funding: DARPA
Collaborator: Angelos Stavrou

