Journal Articles

Interactive analysis of large cancer copy number studies with Copy Number Explorer

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: Copy number abnormalities (CNAs) such as somatically-acquired chromosomal deletions and duplications drive the development of cancer. As individual tumor genomes can contain tens or even hundreds of large and/or focal CNAs, a major difficulty is differentiating between important, recurrent pathogenic changes and benign changes unrelated to the subject’s phenotype. Here we present Copy Number Explorer, an interactive tool for mining large copy number datasets. Copy Number Explorer facilitates rapid visual and statistical identification of recurrent regions of gain or loss, identifies the genes most likely to drive CNA formation using the cghMCR method and identifies recurrently broken genes that may be disrupted or fused. The software also allows users to identify recurrent CNA regions that may be associated with differential survival.

Availability and Implementation: Copy Number Explorer is available under the GNU General Public License (GPL-3). Source code is available at: https://sourceforge.net/projects/copynumberexplorer/

Contact: scott.newman@emory.edu

Categories: Journal Articles

kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2.

Availability and implementation: kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/

Contact: barryghall@gmail.com

Categories: Journal Articles

Identification of C2H2-ZF binding preferences from ChIP-seq data using RCADE

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify non-targeted transcription factor (TF) motifs, and are even further limited when peak sequences are similar due to common ancestry rather than common binding factors. The latter aspect particularly affects a large number of proteins from the Cys2His2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail.

Availability and implementation: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Contact: t.hughes@utoronto.ca

Categories: Journal Articles

Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Motivation: The characterization of phylogenetic and functional diversity is a key element in the analysis of microbial communities. Amplicon-based sequencing of marker genes, such as 16S rRNA, is a powerful tool for assessing and comparing the structure of microbial communities at a high phylogenetic resolution. Because 16S rRNA sequencing is more cost-effective than whole metagenome shotgun sequencing, marker gene analysis is frequently used for broad studies that involve a large number of different samples. However, in comparison to shotgun sequencing approaches, insights into the functional capabilities of the community get lost when restricting the analysis to taxonomic assignment of 16S rRNA data.

Results: Tax4Fun is a software package that predicts the functional capabilities of microbial communities based on 16S rRNA datasets. We evaluated Tax4Fun on a range of paired metagenome/16S rRNA datasets to assess its performance. Our results indicate that Tax4Fun provides a good approximation to functional profiles obtained from metagenomic shotgun sequencing approaches.

Availability and implementation: Tax4Fun is an open-source R package that can be applied to output from the SILVAngs web server or from QIIME with a SILVA database extension. Tax4Fun is freely available for download at http://tax4fun.gobics.de/.

Contact: kasshau@gwdg.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

BFC: correcting Illumina sequencing errors

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: BFC is a free, fast and easy-to-use sequencing error corrector designed for Illumina short reads. It uses a non-greedy algorithm but still maintains a speed comparable to implementations based on greedy methods. In evaluations on real data, BFC appears to correct more errors with fewer overcorrections in comparison to existing tools. It particularly does well in suppressing systematic sequencing errors, which helps to improve the base accuracy of de novo assemblies.

Availability and implementation: https://github.com/lh3/bfc

Contact: hengli@broadinstitute.org

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

pez: phylogenetics for the environmental sciences

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: pez is an R package that permits measurement, modelling and simulation of phylogenetic structure in ecological data. pez contains the first implementation of many methods in R, and aggregates existing data structures and methods into a single, coherent package.

Availability and implementation: pez is released under the GPL v3 open-source license, available on the Internet from CRAN (http://cran.r-project.org). The package is under active development, and the authors welcome contributions (see http://github.com/willpearse/pez).

Contact: will.pearse@gmail.com

Categories: Journal Articles

iFoldRNA v2: folding RNA with constraints

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: A key to understanding RNA function is to uncover its complex 3D structure. Experimental methods used for determining RNA 3D structures are technologically challenging and laborious, which makes the development of computational prediction methods of substantial interest. Previously, we developed the iFoldRNA server that allows accurate prediction of short (<50 nt) tertiary RNA structures starting from primary sequences. Here, we present a new version of the iFoldRNA server that permits the prediction of tertiary structure of RNAs as long as a few hundred nucleotides. This substantial increase in the server capacity is achieved by utilization of experimental information such as base-pairing and hydroxyl-radical probing. We demonstrate a significant benefit provided by integration of experimental data and computational methods.

Availability and implementation: http://ifoldrna.dokhlab.org

Contact: dokh@unc.edu

Categories: Journal Articles

PDBest: a user-friendly platform for manipulating and enhancing protein structures

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: PDBest (PDB Enhanced Structures Toolkit) is a user-friendly, freely available platform for acquiring, manipulating and normalizing protein structures in a high-throughput and seamless fashion. With an intuitive graphical interface it allows users with no programming background to download and manipulate their files. The platform also exports protocols, enabling users to easily share PDB searching and filtering criteria, enhancing analysis reproducibility.

Availability and implementation: PDBest installation packages are freely available for several platforms at http://www.pdbest.dcc.ufmg.br

Contact: wellisson@dcc.ufmg.br, dpires@dcc.ufmg.br, raquelcm@dcc.ufmg.br

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

MemGen: a general web server for the setup of lipid membrane simulation systems

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Motivation: Molecular dynamics simulations provide atomic insight into the physicochemical characteristics of lipid membranes and hence, a wide range of force field families capable of modelling various lipid types have been developed in recent years. To model membranes in a biologically realistic lipid composition, simulation systems containing multiple different lipids must be assembled.

Results: We present a new web service called MemGen that is capable of setting up simulation systems of heterogeneous lipid membranes. MemGen is not restricted to certain lipid force fields or lipid types, but instead builds membranes from uploaded structure files, which may contain any kind of amphiphilic molecule. MemGen works with any all-atom or united-atom lipid representation.

Availability and implementation: MemGen is freely available without registration at http://memgen.uni-goettingen.de.

Contact: jhub@gwdg.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

chipPCR: an R package to pre-process raw data of amplification curves

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Motivation: Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only a few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments.

Results: chipPCR is an R package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R’s S4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R, such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation.
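The 5-point stencil mentioned above can be illustrated with a short sketch. This is not chipPCR's code: it is written in Python rather than R, the function name `five_point_derivative` is hypothetical, and it assumes equally spaced readings (one per cycle) and only estimates the derivative at interior points.

```python
def five_point_derivative(y, h=1.0):
    """Estimate dy/dx at the interior points of equally spaced samples.

    Uses the central five-point stencil
        f'(x) ~ (f(x-2h) - 8 f(x-h) + 8 f(x+h) - f(x+2h)) / (12 h).
    The first two and last two samples have no full stencil, so the
    result is four elements shorter than the input (an assumption of
    this sketch; a real package may treat the edges differently).
    """
    if len(y) < 5:
        raise ValueError("need at least five samples")
    return [
        (y[i - 2] - 8.0 * y[i - 1] + 8.0 * y[i + 1] - y[i + 2]) / (12.0 * h)
        for i in range(2, len(y) - 2)
    ]


if __name__ == "__main__":
    # Toy "amplification curve": fluorescence = cycle ** 3 for cycles 0..6.
    fluorescence = [float(c ** 3) for c in range(7)]
    print(five_point_derivative(fluorescence))
```

The stencil is exact for polynomials up to degree four, so a cubic toy curve makes the result easy to check by hand against 3x².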

Availability and implementation: http://cran.r-project.org/web/packages/chipPCR. Source code: https://github.com/michbur/chipPCR.

Contact: stefan.roediger@b-tu.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

ms-data-core-api: an open-source, metadata-oriented library for computational proteomics

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the simplicity of developing applications using the library.

Availability and implementation: The software is freely available at https://github.com/PRIDE-Utilities/ms-data-core-api.

Supplementary information: Supplementary data are available at Bioinformatics online.

Contact: juan@ebi.ac.uk

Categories: Journal Articles

Gener: a minimal programming module for chemical controllers based on DNA strand displacement

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX.

Availability and implementation: Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a Windows executable that can be run on Mac OS X and Linux by using Mono.

Contact: ozan@cosbi.eu

Categories: Journal Articles

phylogeo: an R package for geographic analysis and visualization of microbiome data

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Motivation: We have created an R package named phylogeo that provides a set of geographic utilities for sequencing-based microbial ecology studies. Although the geographic location of samples is an important aspect of environmental microbiology, none of the major software packages used in processing microbiome data include utilities that allow users to map and explore the spatial dimension of their data. phylogeo solves this problem by providing a set of plotting and mapping functions that can be used to visualize the geographic distribution of samples, to look at the relatedness of microbiomes using ecological distance, and to map the geographic distribution of particular sequences. By extending the popular phyloseq package and using the same data structures and command formats, phylogeo allows users to easily map and explore the geographic dimensions of their data from the R programming language.

Availability and Implementation: phylogeo is documented and freely available at http://zachcp.github.io/phylogeo

Contact: zcharlop@rockefeller.edu

Categories: Journal Articles

GOplot: an R package for visually combining expression data with functional analysis

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: Despite the plethora of methods available for the functional analysis of omics data, obtaining a comprehensive yet detailed understanding of the results remains challenging. This is mainly due to the lack of publicly available tools for the visualization of this type of information. Here we present an R package called GOplot, based on ggplot2, for enhanced graphical representation. Our package takes the output of any general enrichment analysis and generates plots at different levels of detail: from a general overview to identify the most enriched categories (bar plot, bubble plot) to a more detailed view displaying different types of information for molecules in a given set of categories (circle plot, chord plot, cluster plot). The package provides a deeper insight into omics data and allows scientists to generate insightful plots with only a few lines of code to easily communicate the findings.

Availability and Implementation: The R package GOplot is available via CRAN (the Comprehensive R Archive Network): http://cran.r-project.org/web/packages/GOplot. The shiny web application of the Venn diagram can be found at: https://wwalter.shinyapps.io/Venn/. A detailed manual of the package with sample figures can be found at https://wencke.github.io/

Contact: fscabo@cnic.es or mricote@cnic.es

Categories: Journal Articles

HTT-DB: Horizontally transferred transposable elements database

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Motivation: Horizontal transfer of transposable elements (HTT) among eukaryotes was discovered in the mid-1980s. Since then, >300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up-to-date, interactive and expandable database for such events, we developed the HTT-DB database.

Results: HTT-DB allows easy access to most of the HTT cases reported, along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using transposable element and/or host species classification and export them in several formats.

Availability and implementation: This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase. HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects.

Contact: bdotto82@gmail.com or gabriel.wallau@gmail.com

Categories: Journal Articles

'Flatten plus': a recent implementation in WSxM for biological research

Bioinformatics Journal - Mon, 08/24/2015 - 09:21

Summary: Scanning probe microscopy (SPM) is already a relevant tool in biological research at the nanoscale. We present ‘Flatten plus’, a recent and helpful implementation in the well-known WSxM free software package. ‘Flatten plus’ reduces low-frequency noise in SPM images in a semi-automated way, preventing the appearance of typical artifacts associated with such filters.

Availability and implementation: WSxM is free software implemented in C++ and supported on MS Windows, but it can also be run under Mac or Linux using emulators such as Wine or Parallels. WSxM can be downloaded from http://www.wsxmsolutions.com/.

Contact: ignacio.horcas@wsxmsolutions.com

Categories: Journal Articles

Hurricane Katrina’s psychological scars revealed

Nature - Sun, 08/23/2015 - 23:00

Nature 524, 7566 (2015). http://www.nature.com/doifinder/10.1038/524395a

Author: Sara Reardon

Mental health worsened in the disaster’s aftermath, but survivors also showed resilience.

Categories: Journal Articles

Changes in Postural Syntax Characterize Sensory Modulation and Natural Variation of C. elegans Locomotion

PLoS Computational Biology - Fri, 08/21/2015 - 16:00

by Roland F. Schwarz, Robyn Branicky, Laura J. Grundy, William R. Schafer, André E. X. Brown

Locomotion is driven by shape changes coordinated by the nervous system through time; thus, enumerating an animal's complete repertoire of shape transitions would provide a basis for a comprehensive understanding of locomotor behaviour. Here we introduce a discrete representation of behaviour in the nematode C. elegans. At each point in time, the worm’s posture is approximated by its closest matching template from a set of 90 postures and locomotion is represented as sequences of postures. The frequency distribution of postural sequences is heavy-tailed with a core of frequent behaviours and a much larger set of rarely used behaviours. Responses to optogenetic and environmental stimuli can be quantified as changes in postural syntax: worms show different preferences for different sequences of postures drawn from the same set of templates. A discrete representation of behaviour will enable methods developed for other kinds of discrete data in bioinformatics and language processing to be harnessed for the study of behaviour.
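The discretization described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: postures are assumed to be fixed-length numeric vectors, "closest" is taken to mean smallest squared Euclidean distance, and the names `nearest_template`, `discretize` and `ngram_counts` are hypothetical.

```python
from collections import Counter


def nearest_template(posture, templates):
    """Return the index of the template closest to `posture`.

    Distance is squared Euclidean (an assumption of this sketch).
    """
    best_index, best_dist = -1, float("inf")
    for index, template in enumerate(templates):
        dist = sum((a - b) ** 2 for a, b in zip(posture, template))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def discretize(frames, templates):
    """Map each frame's posture to the label of its nearest template."""
    return [nearest_template(frame, templates) for frame in frames]


def ngram_counts(labels, n):
    """Count every run of n consecutive posture labels."""
    return Counter(tuple(labels[i:i + n]) for i in range(len(labels) - n + 1))


if __name__ == "__main__":
    # Three toy templates stand in for the 90 postures used in the paper.
    templates = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    frames = [(0.1, 0.0), (0.9, 1.2), (2.1, 1.9), (0.0, 0.2), (1.1, 1.0)]
    labels = discretize(frames, templates)
    print(labels)
    print(ngram_counts(labels, 2))
```

Comparing the n-gram counts from two conditions is one simple way to look for the kind of change in sequence preference that the abstract calls a change in postural syntax.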

Categories: Journal Articles

Tipping the Scale from Disorder to Alpha-helix: Folding of Amphiphilic Peptides in the Presence of Macroscopic and Molecular Interfaces

PLoS Computational Biology - Fri, 08/21/2015 - 16:00

by Cahit Dalgicdir, Christoph Globisch, Christine Peter, Mehmet Sayar

Secondary amphiphilicity is inherent to the secondary structural elements of proteins. By forming energetically favorable contacts with each other these amphiphilic building blocks give rise to the formation of a tertiary structure. Small proteins and peptides, on the other hand, are usually too short to form multiple structural elements and cannot stabilize them internally. Therefore, these molecules are often found to be structurally ambiguous up to the point of a large degree of intrinsic disorder in solution. Consequently, their conformational preference is particularly susceptible to environmental conditions such as pH, salts, or presence of interfaces. In this study we use molecular dynamics simulations to analyze the conformational behavior of two synthetic peptides, LKKLLKLLKKLLKL (LK) and EAALAEALAEALAE (EALA), with built-in secondary amphiphilicity upon forming an alpha-helix. We use these model peptides to systematically study their aggregation and the influence of macroscopic and molecular interfaces on their conformational preferences. We show that the peptides are neither random coils in bulk water nor fully formed alpha helices, but adopt multiple conformations and secondary structure elements with short lifetimes. These provide a basis for conformation-selection and population-shift upon environmental changes. Differences in these peptides’ response to macroscopic and molecular interfaces (presented by an aggregation partner) can be linked to their inherent alpha-helical tendencies in bulk water. We find that the peptides’ aggregation behavior is also strongly affected by presence or absence of an interface, and rather subtly depends on their surface charge and hydrophobicity.

Categories: Journal Articles

Formation and Dynamics of Waves in a Cortical Model of Cholinergic Modulation

PLoS Computational Biology - Fri, 08/21/2015 - 16:00

by James P. Roach, Eshel Ben-Jacob, Leonard M. Sander, Michal R. Zochowski

Acetylcholine (ACh) is a regulator of neural excitability and one of the neurochemical substrates of sleep. Amongst the cellular effects induced by cholinergic modulation are a reduction in spike-frequency adaptation (SFA) and a shift in the phase response curve (PRC). We demonstrate in a biophysical model how changes in neural excitability and network structure interact to create three distinct functional regimes: localized asynchronous, traveling asynchronous, and traveling synchronous. Our results qualitatively match those observed experimentally. Cortical activity during slow wave sleep (SWS) differs from that during REM sleep or waking states. During SWS there are traveling patterns of activity in the cortex; in other states stationary patterns occur. Our model is a network composed of Hodgkin-Huxley type neurons with an M-current regulated by ACh. Regulation of ACh level can account for dynamical changes between functional regimes. Reduction of the magnitude of this current recreates the reduction in SFA and the shift from a type 2 to a type 1 PRC observed in the presence of ACh. When SFA is minimal (in waking or REM sleep state, high ACh) patterns of activity are localized and easily pinned by network inhomogeneities. When SFA is present (decreasing ACh), traveling waves of activity naturally arise. A further decrease in ACh leads to a high degree of synchrony within traveling waves. We also show that the level of ACh determines how sensitive network activity is to synaptic heterogeneity. These regimes may have a profound functional significance as stationary patterns may play a role in the proper encoding of external input as memory and traveling waves could lead to synaptic regularization, giving unique insights into the role and significance of ACh in determining patterns of cortical activity and functional differences arising from the patterns.

Categories: Journal Articles

