Journal Articles

  • DREAM: a webserver for the identification of editing sites in mature miRNAs using deep sequencing data
    [Jul 2015]

    Summary: DREAM (detecting RNA editing associated with microRNAs) is a webserver for the identification of mature microRNA editing events using deep sequencing data. Raw microRNA sequencing reads are provided as input; the reads are aligned against the genome, and custom scripts process the data, search for potential editing sites and assess the statistical significance of the findings. The output is a text file with the location and the statistical description of all the putative editing sites detected.
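
    The summary above does not say which statistical test DREAM applies. As a minimal sketch of one common way to assess a candidate editing site, the Python below computes a one-sided binomial p-value for seeing at least the observed number of mismatching reads if mismatches arose from sequencing error alone. The function names and the default error rate of 0.001 are assumptions for illustration, not part of DREAM.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def editing_p_value(mismatch_reads: int, total_reads: int,
                    error_rate: float = 0.001) -> float:
    """One-sided p-value that the mismatches are explained by sequencing error.

    error_rate is a hypothetical per-base error rate (an assumption here).
    The exact sum is fine for modest read depths; very deep coverage would
    need a log-space or library implementation to avoid float overflow.
    """
    return binomial_tail(mismatch_reads, total_reads, error_rate)


# Example: 12 of 200 reads carry the same mismatch at one position.
p_value = editing_p_value(12, 200)
```

    In practice the p-values from all tested positions would also need a multiple-testing correction, which this sketch leaves out.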

    Availability and implementation: DREAM is freely available on the web at http://www.cs.tau.ac.il/~mirnaed/.

    Contact: elieis@post.tau.ac.il

    Categories: Journal Articles
  • MSA-PAD: DNA multiple sequence alignment framework based on PFAM accessed domain information
    [Jul 2015]

    Summary: Here we present the MSA-PAD application, a DNA multiple sequence alignment framework that uses PFAM protein domain information to align DNA sequences encoding either single or multiple protein domains. MSA-PAD has two alignment options: gene and genome mode.

    Availability and Implementation: MSA-PAD is available as a web application (https://recasgateway.ba.infn.it/) and as two Taverna workflows corresponding to two alignment modes (Gene mode: http://www.myexperiment.org/workflows/4549.html; Genome Mode: http://www.myexperiment.org/workflows/4551.html).

    Contact: g.pesole@ibbe.cnr.it

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • KeBABS: an R package for kernel-based analysis of biological sequences
    [Jul 2015]

    Summary: KeBABS provides a powerful, flexible and easy to use framework for kernel-based analysis of biological sequences in R. It includes efficient implementations of the most important sequence kernels, also including variants that allow for taking sequence annotations and positional information into account. KeBABS seamlessly integrates three common support vector machine (SVM) implementations with a unified interface. It allows for hyperparameter selection by cross validation, nested cross validation and also features grouped cross validation. The biological interpretation of SVM models is supported by (1) the computation of weights of sequence patterns and (2) prediction profiles that highlight the contributions of individual sequence positions or sections.
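
    To make the idea of a sequence kernel concrete, here is a minimal Python sketch of the spectrum kernel, one of the standard sequence kernels: two sequences are compared by the dot product of their k-mer count vectors. This is not the KeBABS API (KeBABS is an R package); the function names and the cosine-style normalization are illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3, normalize: bool = True) -> float:
    """Dot product of the k-mer count vectors of a and b.

    With normalize=True the value is divided by the geometric mean of the
    two self-similarities, so identical sequences score 1.0.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(count * cb[kmer] for kmer, count in ca.items())
    if not normalize:
        return float(dot)
    norm = sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

    A matrix of such kernel values over all sequence pairs is what a kernel-based SVM would consume. The position-aware and annotation-aware variants mentioned in the summary are not covered by this sketch.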

    Availability and implementation: The R package kebabs is available via the Bioconductor project: http://bioconductor.org/packages/release/bioc/html/kebabs.html. Further information and the R code of the example in this paper are available at http://www.bioinf.jku.at/software/kebabs/.

    Contact: kebabs@bioinf.jku.at or bodenhofer@bioinf.jku.at

  • ExaML version 3: a tool for phylogenomic analyses on supercomputers
    [Jul 2015]

    Motivation: Phylogenies are increasingly used in all fields of medical and biological research. Because of the next generation sequencing revolution, datasets used for conducting phylogenetic analyses grow at an unprecedented pace. We present ExaML version 3, a dedicated production-level code for inferring phylogenies on whole-transcriptome and whole-genome alignments using supercomputers.

    Results: We introduce several improvements and extensions to ExaML: Extensions of substitution models and supported data types, the integration of a novel load balance algorithm as well as a parallel I/O optimization that significantly improve parallel efficiency, and a production-level implementation for Intel MIC-based hardware platforms.

    Availability and implementation: The code is available under GNU GPL at https://github.com/stamatak/ExaML.

    Contact: Alexandros.Stamatakis@h-its.org

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • Solubis: optimize your protein
    [Jul 2015]

    Motivation: Protein aggregation is associated with a number of protein misfolding diseases and is a major concern for therapeutic proteins. Aggregation is caused by the presence of aggregation-prone regions (APRs) in the amino acid sequence of the protein. The lower the aggregation propensity of APRs and the better they are protected by native interactions within the folded structure of the protein, the more aggregation is prevented. Therefore, both the local thermodynamic stability of APRs in the native structure and their intrinsic aggregation propensity are key parameters that need to be optimized to prevent protein aggregation.

    Results: The Solubis method presented here automates the process of carefully selecting point mutations that minimize the intrinsic aggregation propensity while improving local protein stability.

    Availability and implementation: All information about the Solubis plugin is available at http://solubisyasara.switchlab.org/.

    Contact: joost.schymkowitz@switch.vib-kuleuven.be or Frederic.Rousseau@switch.vib-kuleuven.be

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • do_x3dna: a tool to analyze structural fluctuations of dsDNA or dsRNA from molecular dynamics simulations
    [Jul 2015]

    Summary: The do_x3dna package has been developed to analyze the structural fluctuations of DNA or RNA during molecular dynamics simulations. It extends the capability of the 3DNA package to GROMACS MD trajectories and includes new methods to calculate the global-helical axis of DNA and bending fluctuations during simulations. The package also includes a Python module dnaMD to perform and visualize statistical analyses of complex data obtained from the trajectories.

    Availability and Implementation: The source code of do_x3dna is available at https://github.com/rjdkmr/do_x3dna under the GNU GPLv3 license. Detailed documentation, including tutorials and required input data, is freely available at http://rjdkmr.github.io/do_x3dna/.

    Contact: rjdkmr@gmail.com

  • RiboTools: a Galaxy toolbox for qualitative ribosome profiling analysis
    [Jul 2015]

    Motivation: Ribosome profiling provides genome-wide information about translational regulation. However, there is currently no standard tool for the qualitative analysis of Ribo-seq data. We present here RiboTools, a Galaxy toolbox for the analysis of ribosome profiling (Ribo-seq) data. It can be used to detect translational ambiguities, stop codon readthrough events and codon occupancy. It provides a large number of plots for the visualisation of these events.

    Availability and implementation: RiboTools is available from https://testtoolshed.g2.bx.psu.edu/view/rlegendre/ribo_tools as part of the Galaxy Project, under the GPLv3 licence. It is written in Python 2.7 and uses standard Python libraries, such as matplotlib and numpy.

    Contact: olivier.namy@igmors.u-psud.fr

    Supplementary Information: Supplementary data are available from Bioinformatics online.

  • edgeRun: an R package for sensitive, functionally relevant differential expression discovery using an unconditional exact test
    [Jul 2015]

    Summary: Next-generation sequencing platforms for measuring digital expression such as RNA-Seq are displacing traditional microarray-based methods in biological experiments. The detection of differentially expressed genes between groups of biological conditions has led to the development of numerous bioinformatics tools, but so far, few exploit the expanded dynamic range afforded by the new technologies. We present edgeRun, an R package that implements an unconditional exact test that is a more powerful version of the exact test in edgeR. This increase in power is especially pronounced for experiments with as few as two replicates per condition, for genes with low total expression and with large biological coefficient of variation. In comparison with a panel of other tools, edgeRun consistently captures functionally similar differentially expressed genes.

    Availability and implementation: The package is freely available under the MIT license from CRAN (http://cran.r-project.org/web/packages/edgeRun).

    Contact: edimont@mail.harvard.edu

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • EW_dmGWAS: edge-weighted dense module search for genome-wide association studies and gene expression profiles
    [Jul 2015]

    Summary: We previously developed dmGWAS to search for dense modules in a human protein–protein interaction (PPI) network; it has since become a popular tool for network-assisted analysis of genome-wide association studies (GWAS). dmGWAS weights nodes by using GWAS signals. Here, we introduce an upgraded algorithm, EW_dmGWAS, to boost GWAS signals in a node- and edge-weighted PPI network. In EW_dmGWAS, we utilize condition-specific gene expression profiles for edge weights. Specifically, differential gene co-expression is used to infer the edge weights. We applied EW_dmGWAS to two diseases and compared it with other relevant methods. The results suggest that EW_dmGWAS is more powerful in detecting disease-associated signals.

    Availability and implementation: The algorithm of EW_dmGWAS is implemented in the R package dmGWAS_3.0 and is available at http://bioinfo.mc.vanderbilt.edu/dmGWAS.

    Contact: zhongming.zhao@vanderbilt.edu or peilin.jia@vanderbilt.edu

    Supplementary information: Supplementary materials are available at Bioinformatics online.

  • PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R
    [Jul 2015]

    Summary: Precision-recall (PR) and receiver operating characteristic (ROC) curves are valuable measures of classifier performance. Here, we present the R-package PRROC, which allows for computing and visualizing both PR and ROC curves. In contrast to available R-packages, PRROC allows for computing PR and ROC curves and areas under these curves for soft-labeled data using a continuous interpolation between the points of PR curves. In addition, PRROC provides a generic plot function for generating publication-quality graphics of PR and ROC curves.
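
    For readers unfamiliar with how an ROC curve is built, the Python sketch below computes ROC points and the trapezoidal area under the curve for ordinary hard-labeled data. It is not PRROC's implementation or API (PRROC is an R package), and it does not cover the soft-labeled case or the continuous interpolation of PR curves described above; the function names are illustrative assumptions.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from high to low.

    labels are 1 for positives and 0 for negatives. Items with tied scores
    are treated as one group, so a tie yields a single diagonal step.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a tied score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

    Linear interpolation between points is appropriate for ROC curves but not for PR curves, which is one reason a dedicated treatment of PR interpolation is useful.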

    Availability and implementation: PRROC is available from CRAN and is licensed under GPL 3.

    Contact: grau@informatik.uni-halle.de

  • ASBench: benchmarking sets for allosteric discovery
    [Jul 2015]

    Summary: Allostery allows for the fine-tuning of protein function. Targeting allosteric sites is gaining increasing recognition as a novel strategy in drug design. The key challenge in the discovery of allosteric sites has strongly motivated the development of computational methods and thus high-quality, publicly accessible standard data have become indispensable. Here, we report benchmarking data for experimentally determined allosteric sites through a complex process, including a ‘Core set’ with 235 unique allosteric sites and a ‘Core-Diversity set’ with 147 structurally diverse allosteric sites. These benchmarking sets can be exploited to develop efficient computational methods to predict unknown allosteric sites in proteins and reveal unique allosteric ligand–protein interactions to guide allosteric drug design.

    Availability and implementation: The benchmarking sets are freely available at http://mdl.shsmu.edu.cn/asbench.

    Contact: jian.zhang@sjtu.edu.cn

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • Object-based representation and analysis of light and electron microscopic volume data using Blender
    [Jul 2015]

    Background: Rapid improvements in light and electron microscopy imaging techniques and the development of 3D anatomical atlases necessitate new approaches for the visualization and analysis of image data. Pixel-based representations of raw light microscopy data suffer from limitations in the number of channels that can be visualized simultaneously. Complex electron microscopic reconstructions from large tissue volumes are also challenging to visualize and analyze.

    Results: Here we exploit the advanced visualization capabilities and flexibility of the open-source platform Blender to visualize and analyze anatomical atlases. We use light-microscopy-based gene expression atlases and electron microscopy connectome volume data from larval stages of the marine annelid Platynereis dumerilii. We build object-based larval gene expression atlases in Blender and develop tools for annotation and coexpression analysis. We also represent and analyze connectome data including neuronal reconstructions and underlying synaptic connectivity.

    Conclusions: We demonstrate the power and flexibility of Blender for visualizing and exploring complex anatomical atlases. The resources we have developed for Platynereis will facilitate data sharing and the standardization of anatomical atlases for this species. The flexibility of Blender, particularly its embedded Python application programming interface, means that our methods can be easily extended to other organisms.
  • GESPA: classifying nsSNPs to predict disease association
    [Jul 2015]

    Background: Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus determining the clinical significance of each nsSNP is of great importance. Potentially detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time consuming. Existing computational methods lack accuracy and features to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs.

    Results: GESPA is a user-friendly software package for classifying disease association of nsSNPs. It allows flexibility in acceptable input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from medical literature. The development and testing of GESPA were performed using the humsavar, ClinVar and humvar datasets. GESPA also predicts the disease phenotype associated with an nsSNP with high accuracy, a feature unavailable in existing software. GESPA’s overall accuracy exceeds that of existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data.

    Conclusions: GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa.
  • Stepwise Motion in a Multivalent [2](3)Catenane
    [Jul 2015]

    Journal of the American Chemical Society. DOI: 10.1021/jacs.5b05758
  • Iron-Catalyzed Reduction of CO2 into Methylene: Formation of C–N, C–O, and C–C Bonds
    [Jul 2015]

    Journal of the American Chemical Society. DOI: 10.1021/jacs.5b06077
  • A set of powerful negative selection systems for unmodified Enterobacteriaceae
    [Jul 2015]

    Creation of defined genetic mutations is a powerful method for dissecting mechanisms of bacterial disease; however, many genetic tools are only developed for laboratory strains. We have designed a modular and general negative selection strategy based on inducible toxins that provides high selection stringency in clinical Escherichia coli and Salmonella isolates. No strain- or species-specific optimization is needed, yet this system achieves better selection stringency than all previously reported negative selection systems usable in unmodified E. coli strains. The high stringency enables use of negative instead of positive selection in phage-mediated generalized transduction and also allows transfer of alleles between arbitrary strains of E. coli without requiring phage. The modular design should also allow further extension to other bacteria. This negative selection system thus overcomes disadvantages of existing systems, enabling definitive genetic experiments in both lab and clinical isolates of E. coli and other Enterobacteriaceae.

  • A new approach for annotation of transposable elements using small RNA mapping
    [Jul 2015]

    Transposable elements (TEs) are mobile genomic DNA sequences found in most organisms. They so densely populate the genomes of many eukaryotic species that they are often the major constituents. With the rapid generation of many plant genome sequencing projects over the past few decades, there is an urgent need for improved TE annotation as a prerequisite for genome-wide studies. Analogous to the use of RNA-seq for gene annotation, we propose a new method for de novo TE annotation that uses as a guide 24 nt-siRNAs that are a part of TE silencing pathways. We use this new approach, called TASR (for Transposon Annotation using Small RNAs), for de novo annotation of TEs in Arabidopsis, rice and soybean and demonstrate that this strategy can be successfully applied for de novo TE annotation in plants.

    Executable PERL is available for download from: http://tasr-pipeline.sourceforge.net/

  • High-throughput assay and engineering of self-cleaving ribozymes by sequencing
    [Jul 2015]

    Self-cleaving ribozymes are found in all domains of life and are believed to play important roles in biology. Additionally, self-cleaving ribozymes have been the subject of extensive engineering efforts for applications in synthetic biology. These studies often involve laborious assays of multiple individual variants that are either designed rationally or discovered through selection or screening. However, these assays provide only a limited view of the large sequence space relevant to the ribozyme function. Here, we report a strategy that allows quantitative characterization of greater than 1000 ribozyme variants in a single experiment. We generated a library of predefined ribozyme variants that were converted to DNA and analyzed by high-throughput sequencing. By counting the number of cleaved and uncleaved reads of every variant in the library, we obtained a complete activity profile of the ribozyme pool which was used to both analyze and engineer allosteric ribozymes.
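
    The counting step described above can be sketched in a few lines of Python. The abstract states only that cleaved and uncleaved reads are counted per variant; taking the fraction of cleaved reads as the activity measure is an assumption made here, as are the function name and the input layout.

```python
def fraction_cleaved(counts):
    """Map each variant to cleaved / (cleaved + uncleaved).

    counts: dict of variant id -> (cleaved_reads, uncleaved_reads).
    Variants with no reads at all are reported as None rather than 0.0,
    so missing data is not mistaken for an inactive ribozyme.
    """
    activity = {}
    for variant, (cleaved, uncleaved) in counts.items():
        total = cleaved + uncleaved
        activity[variant] = cleaved / total if total else None
    return activity


# Hypothetical read counts for three library members.
example = fraction_cleaved({"wt": (90, 10), "mut1": (5, 95), "mut2": (0, 0)})
```

    Low-coverage variants give noisy fractions, so a real analysis would likely also apply a minimum read-count filter.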
