Bioinformatics Journal

Bioinformatics - RSS feed of current issue
Updated: 1 year 50 weeks ago

CiVi: circular genome visualization with unique features to analyze sequence elements

Mon, 08/24/2015 - 09:21

Summary: We have developed CiVi, a user-friendly web-based tool to create custom circular maps to aid the analysis of microbial genomes and sequence elements. Sequence related data such as gene-name, COG class, PFAM domain, GC%, and subcellular location can be comprehensively viewed. Quantitative gene-related data (e.g. expression ratios or read counts) as well as predicted sequence elements (e.g. regulatory sequences) can be uploaded and visualized. CiVi accommodates the analysis of genomic elements by allowing a visual interpretation in the context of: (i) their genome-wide distribution, (ii) provided experimental data and (iii) the local orientation and location with respect to neighboring genes. CiVi thus enables both experts and non-experts to conveniently integrate public genome data with the results of genome analyses in circular genome maps suitable for publication.
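The abstract does not say how CiVi derives its GC% track. As a rough illustration only, GC% along a circular genome can be sketched with a sliding window that wraps around the origin. The function names, window size and step below are assumptions made for this example and are not taken from CiVi.

```python
from typing import List, Tuple


def gc_percent(seq: str) -> float:
    """Return the GC percentage of a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def circular_gc_windows(genome: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (window start, GC%) pairs for a circular genome.

    Hypothetical helper: windows that run past the end of the sequence
    wrap around to the start, as they would on a circular map.
    """
    n = len(genome)
    if n == 0 or window <= 0 or step <= 0:
        return []
    result = []
    for start in range(0, n, step):
        # Modular indexing makes the last windows wrap around the origin.
        chunk = "".join(genome[(start + i) % n] for i in range(window))
        result.append((start, gc_percent(chunk)))
    return result
```

Calling `circular_gc_windows("GGCCAATT", window=4, step=2)` yields one value per window start, with the final window spanning the end and the beginning of the sequence.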

Contact: L.Overmars@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.

Availability and implementation: CiVi is freely available at http://civi.cmbi.ru.nl

Categories: Journal Articles

IonGAP: integrative bacterial genome analysis for Ion Torrent sequence data

Mon, 08/24/2015 - 09:21

Summary: We introduce IonGAP, a publicly available Web platform designed for the analysis of whole bacterial genomes using Ion Torrent sequence data. Besides assembly, it integrates a variety of comparative genomics, annotation and bacterial classification routines, based on the widely used FASTQ, BAM and SRA file formats. Benchmarking with different datasets evidenced that IonGAP is a fast, powerful and simple-to-use bioinformatics tool. By releasing this platform, we aim to translate low-cost bacterial genome analysis for microbiological prevention and control in healthcare, agroalimentary and pharmaceutical industry applications.

Availability and implementation: IonGAP is hosted by the ITER’s Teide-HPC supercomputer and is freely available on the Web for non-commercial use at http://iongap.hpc.iter.es.

Contact: mcolesan@ull.edu.es or cflores@ull.edu.es

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

Interactive analysis of large cancer copy number studies with Copy Number Explorer

Mon, 08/24/2015 - 09:21

Summary: Copy number abnormalities (CNAs) such as somatically-acquired chromosomal deletions and duplications drive the development of cancer. As individual tumor genomes can contain tens or even hundreds of large and/or focal CNAs, a major difficulty is differentiating between important, recurrent pathogenic changes and benign changes unrelated to the subject’s phenotype. Here we present Copy Number Explorer, an interactive tool for mining large copy number datasets. Copy Number Explorer facilitates rapid visual and statistical identification of recurrent regions of gain or loss, identifies the genes most likely to drive CNA formation using the cghMCR method and identifies recurrently broken genes that may be disrupted or fused. The software also allows users to identify recurrent CNA regions that may be associated with differential survival.
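The idea of a recurrent region of gain or loss can be illustrated with a simple per-bin frequency count across samples. This is a minimal sketch under stated assumptions, not the cghMCR method and not Copy Number Explorer's own code: the segment format, bin size, threshold and function names are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical segment format: (start, end, call), end exclusive,
# where call is either "gain" or "loss".
Segment = Tuple[int, int, str]


def cna_frequencies(
    samples: List[List[Segment]], chrom_length: int, bin_size: int
) -> Dict[str, List[float]]:
    """Fraction of samples showing a gain or a loss in each fixed-size bin."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = {"gain": [0] * n_bins, "loss": [0] * n_bins}
    for segments in samples:
        # Collect bins per sample so overlapping segments count only once.
        hit = {"gain": set(), "loss": set()}
        for start, end, call in segments:
            start = max(start, 0)
            end = min(end, chrom_length)
            if call not in hit or end <= start:
                continue
            first = start // bin_size
            last = (end - 1) // bin_size
            hit[call].update(range(first, last + 1))
        for call, bins in hit.items():
            for b in bins:
                counts[call][b] += 1
    n = len(samples)
    return {
        call: [c / n if n else 0.0 for c in row] for call, row in counts.items()
    }


def recurrent_bins(freqs: List[float], threshold: float) -> List[int]:
    """Indices of bins altered in at least `threshold` of the samples."""
    return [i for i, f in enumerate(freqs) if f >= threshold]
```

Bins whose frequency exceeds a chosen threshold are candidates for recurrent regions; deciding which of them are statistically meaningful requires a dedicated method such as the one the authors cite.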

Availability and Implementation: Copy Number Explorer is available under the GNU General Public License (GPL-3). Source code is available at: https://sourceforge.net/projects/copynumberexplorer/

Contact: scott.newman@emory.edu

Categories: Journal Articles

kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome

Mon, 08/24/2015 - 09:21

Summary: We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2.

Availability and implementation: kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/

Contact: barryghall@gmail.com

Categories: Journal Articles

Identification of C2H2-ZF binding preferences from ChIP-seq data using RCADE

Mon, 08/24/2015 - 09:21

Summary: Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify non-targeted transcription factor (TF) motifs, and are even further limited when peak sequences are similar due to common ancestry rather than common binding factors. The latter aspect particularly affects a large number of proteins from the Cys2His2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail.

Availability and implementation: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Contact: t.hughes@utoronto.ca

Categories: Journal Articles

Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data

Mon, 08/24/2015 - 09:21

Motivation: The characterization of phylogenetic and functional diversity is a key element in the analysis of microbial communities. Amplicon-based sequencing of marker genes, such as 16S rRNA, is a powerful tool for assessing and comparing the structure of microbial communities at a high phylogenetic resolution. Because 16S rRNA sequencing is more cost-effective than whole metagenome shotgun sequencing, marker gene analysis is frequently used for broad studies that involve a large number of different samples. However, in comparison to shotgun sequencing approaches, insights into the functional capabilities of the community get lost when restricting the analysis to taxonomic assignment of 16S rRNA data.

Results: Tax4Fun is a software package that predicts the functional capabilities of microbial communities based on 16S rRNA datasets. We evaluated Tax4Fun on a range of paired metagenome/16S rRNA datasets to assess its performance. Our results indicate that Tax4Fun provides a good approximation to functional profiles obtained from metagenomic shotgun sequencing approaches.

Availability and implementation: Tax4Fun is an open-source R package and applicable to output as obtained from the SILVAngs web server or the application of QIIME with a SILVA database extension. Tax4Fun is freely available for download at http://tax4fun.gobics.de/.

Contact: kasshau@gwdg.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

BFC: correcting Illumina sequencing errors

Mon, 08/24/2015 - 09:21

Summary: BFC is a free, fast and easy-to-use sequencing error corrector designed for Illumina short reads. It uses a non-greedy algorithm but still maintains a speed comparable to implementations based on greedy methods. In evaluations on real data, BFC appears to correct more errors with fewer overcorrections in comparison to existing tools. It particularly does well in suppressing systematic sequencing errors, which helps to improve the base accuracy of de novo assemblies.

Availability and implementation: https://github.com/lh3/bfc

Contact: hengli@broadinstitute.org

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

pez: phylogenetics for the environmental sciences

Mon, 08/24/2015 - 09:21

Summary: pez is an R package that permits measurement, modelling and simulation of phylogenetic structure in ecological data. pez contains the first implementation of many methods in R, and aggregates existing data structures and methods into a single, coherent package.

Availability and implementation: pez is released under the GPL v3 open-source license, available on the Internet from CRAN (http://cran.r-project.org). The package is under active development, and the authors welcome contributions (see http://github.com/willpearse/pez).

Contact: will.pearse@gmail.com

Categories: Journal Articles

iFoldRNA v2: folding RNA with constraints

Mon, 08/24/2015 - 09:21

Summary: A key to understanding RNA function is to uncover its complex 3D structure. Experimental methods used for determining RNA 3D structures are technologically challenging and laborious, which makes the development of computational prediction methods of substantial interest. Previously, we developed the iFoldRNA server that allows accurate prediction of short (<50 nt) tertiary RNA structures starting from primary sequences. Here, we present a new version of the iFoldRNA server that permits the prediction of tertiary structure of RNAs as long as a few hundred nucleotides. This substantial increase in the server capacity is achieved by utilization of experimental information such as base-pairing and hydroxyl-radical probing. We demonstrate a significant benefit provided by integration of experimental data and computational methods.

Availability and implementation: http://ifoldrna.dokhlab.org

Contact: dokh@unc.eu

Categories: Journal Articles

PDBest: a user-friendly platform for manipulating and enhancing protein structures

Mon, 08/24/2015 - 09:21

Summary: PDBest (PDB Enhanced Structures Toolkit) is a user-friendly, freely available platform for acquiring, manipulating and normalizing protein structures in a high-throughput and seamless fashion. With an intuitive graphical interface it allows users with no programming background to download and manipulate their files. The platform also exports protocols, enabling users to easily share PDB searching and filtering criteria, enhancing analysis reproducibility.

Availability and implementation: PDBest installation packages are freely available for several platforms at http://www.pdbest.dcc.ufmg.br

Contact: wellisson@dcc.ufmg.br, dpires@dcc.ufmg.br, raquelcm@dcc.ufmg.br

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

MemGen: a general web server for the setup of lipid membrane simulation systems

Mon, 08/24/2015 - 09:21

Motivation: Molecular dynamics simulations provide atomic insight into the physicochemical characteristics of lipid membranes and hence, a wide range of force field families capable of modelling various lipid types have been developed in recent years. To model membranes in a biologically realistic lipid composition, simulation systems containing multiple different lipids must be assembled.

Results: We present a new web service called MemGen that is capable of setting up simulation systems of heterogeneous lipid membranes. MemGen is not restricted to certain lipid force fields or lipid types, but instead builds membranes from uploaded structure files which may contain any kind of amphiphilic molecule. MemGen works with any all-atom or united-atom lipid representation.

Availability and implementation: MemGen is freely available without registration at http://memgen.uni-goettingen.de.

Contact: jhub@gwdg.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

chipPCR: an R package to pre-process raw data of amplification curves

Mon, 08/24/2015 - 09:21

Motivation: Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only a few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments.

Results: chipPCR is an R package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R’s S4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R, such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation.
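The 5-point stencil mentioned above is a standard finite-difference formula for the first derivative of evenly spaced samples: f'(x) ≈ [f(x-2h) - 8f(x-h) + 8f(x+h) - f(x+2h)] / (12h). The following is a minimal Python sketch of that formula applied to an amplification curve sampled once per cycle. It is not chipPCR's implementation; the function name and the lower-order fallback used at the curve ends are assumptions made for this example.

```python
from typing import List, Sequence


def five_point_derivative(y: Sequence[float], h: float = 1.0) -> List[float]:
    """Estimate dy/dx for evenly spaced samples using the 5-point stencil.

    y: signal values (e.g. fluorescence per cycle); h: spacing between samples.
    """
    n = len(y)
    if n < 5:
        raise ValueError("need at least 5 points for a 5-point stencil")
    if h <= 0:
        raise ValueError("spacing h must be positive")

    d = [0.0] * n
    # Interior points: the 5-point central difference.
    for i in range(2, n - 2):
        d[i] = (y[i - 2] - 8 * y[i - 1] + 8 * y[i + 1] - y[i + 2]) / (12 * h)

    # Edges (assumption for this sketch): fall back to lower-order
    # differences because the full stencil does not fit there.
    d[0] = (y[1] - y[0]) / h
    d[1] = (y[2] - y[0]) / (2 * h)
    d[n - 2] = (y[n - 1] - y[n - 3]) / (2 * h)
    d[n - 1] = (y[n - 1] - y[n - 2]) / h
    return d
```

With one reading per cycle, `h` is 1 and the cycle at which the estimated derivative peaks marks the steepest part of the amplification curve.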

Availability and implementation: http://cran.r-project.org/web/packages/chipPCR. Source code: https://github.com/michbur/chipPCR.

Contact: stefan.roediger@b-tu.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Categories: Journal Articles

ms-data-core-api: an open-source, metadata-oriented library for computational proteomics

Mon, 08/24/2015 - 09:21

Summary: The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the simplicity of developing applications using the library.

Availability and implementation: The software is freely available at https://github.com/PRIDE-Utilities/ms-data-core-api.

Supplementary information: Supplementary data are available at Bioinformatics online.

Contact: juan@ebi.ac.uk

Categories: Journal Articles

Gener: a minimal programming module for chemical controllers based on DNA strand displacement

Mon, 08/24/2015 - 09:21

Summary: Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX.

Availability and implementation: Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a Windows executable that can be run on Mac OS X and Linux by using Mono.

Contact: ozan@cosbi.eu

Categories: Journal Articles

phylogeo: an R package for geographic analysis and visualization of microbiome data

Mon, 08/24/2015 - 09:21

Motivation: We have created an R package named phylogeo that provides a set of geographic utilities for sequencing-based microbial ecology studies. Although the geographic location of samples is an important aspect of environmental microbiology, none of the major software packages used in processing microbiome data include utilities that allow users to map and explore the spatial dimension of their data. phylogeo solves this problem by providing a set of plotting and mapping functions that can be used to visualize the geographic distribution of samples, to look at the relatedness of microbiomes using ecological distance, and to map the geographic distribution of particular sequences. By extending the popular phyloseq package and using the same data structures and command formats, phylogeo allows users to easily map and explore the geographic dimensions of their data from the R programming language.

Availability and Implementation: phylogeo is documented and freely available at http://zachcp.github.io/phylogeo

Contact: zcharlop@rockefeller.edu

Categories: Journal Articles

GOplot: an R package for visually combining expression data with functional analysis

Mon, 08/24/2015 - 09:21

Summary: Despite the plethora of methods available for the functional analysis of omics data, obtaining a comprehensive yet detailed understanding of the results remains challenging. This is mainly due to the lack of publicly available tools for the visualization of this type of information. Here we present an R package called GOplot, based on ggplot2, for enhanced graphical representation. Our package takes the output of any general enrichment analysis and generates plots at different levels of detail: from a general overview to identify the most enriched categories (bar plot, bubble plot) to a more detailed view displaying different types of information for molecules in a given set of categories (circle plot, chord plot, cluster plot). The package provides a deeper insight into omics data and allows scientists to generate insightful plots with only a few lines of code to easily communicate the findings.

Availability and Implementation: The R package GOplot is available via CRAN (The Comprehensive R Archive Network): http://cran.r-project.org/web/packages/GOplot. The shiny web application of the Venn diagram can be found at: https://wwalter.shinyapps.io/Venn/. A detailed manual of the package with sample figures can be found at https://wencke.github.io/

Contact: fscabo@cnic.es or mricote@cnic.es

Categories: Journal Articles

HTT-DB: Horizontally transferred transposable elements database

Mon, 08/24/2015 - 09:21

Motivation: Horizontal transfer of transposable elements (HTT) among eukaryotes was discovered in the mid-1980s. Since then, >300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up-to-date, interactive and expandable database for such events, we developed the HTT-DB database.

Results: HTT-DB allows easy access to most reported HTT cases, along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using transposable element and/or host species classification and to export them in several formats.

Availability and implementation: This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase. HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects.

Contact: bdotto82@gmail.com or gabriel.wallau@gmail.com

Categories: Journal Articles

'Flatten plus': a recent implementation in WSxM for biological research

Mon, 08/24/2015 - 09:21

Summary: Scanning probe microscopy (SPM) is already a relevant tool in biological research at the nanoscale. We present ‘Flatten plus’, a recent and helpful implementation in the well-known WSxM free software package. ‘Flatten plus’ reduces low-frequency noise in SPM images in a semi-automated way while preventing the appearance of typical artifacts associated with such filters.

Availability and implementation: WSxM is free software implemented in C++ and supported on MS Windows, but it can also be run under Mac or Linux using emulators such as Wine or Parallels. WSxM can be downloaded from http://www.wsxmsolutions.com/.

Contact: ignacio.horcas@wsxmsolutions.com

Categories: Journal Articles