Journal Articles

SAAS-CNV: A Joint Segmentation Approach on Aggregated and Allele Specific Signals for the Identification of Somatic Copy Number Alterations with Next-Generation Sequencing Data

PLoS Computational Biology - Thu, 11/19/2015 - 17:00

by Zhongyang Zhang, Ke Hao

Cancer genomes exhibit profound somatic copy number alterations (SCNAs). Studying tumor SCNAs using massively parallel sequencing provides unprecedented resolution and meanwhile gives rise to new challenges in data analysis, complicated by tumor aneuploidy and heterogeneity as well as normal cell contamination. While the majority of read depth based methods utilize total sequencing depth alone for SCNA inference, the allele specific signals are undervalued. We proposed a joint segmentation and inference approach using both signals to meet some of the challenges. Our method consists of four major steps: 1) extracting read depth supporting reference and alternative alleles at each SNP/Indel locus and comparing the total read depth and alternative allele proportion between tumor and matched normal sample; 2) performing joint segmentation on the two signal dimensions; 3) correcting the copy number baseline from which the SCNA state is determined; 4) calling SCNA state for each segment based on both signal dimensions. The method is applicable to whole exome/genome sequencing (WES/WGS) as well as SNP array data in a tumor-control study. We applied the method to a dataset containing no SCNAs to test the specificity, created by pairing sequencing replicates of a single HapMap sample as normal/tumor pairs, as well as a large-scale WGS dataset consisting of 88 liver tumors along with adjacent normal tissues. Compared with representative methods, our method demonstrated improved accuracy, scalability to large cancer studies, capability in handling both sequencing and SNP array data, and the potential to improve the estimation of tumor ploidy and purity.
Categories: Journal Articles

Untangling Brain-Wide Dynamics in Consciousness by Cross-Embedding

PLoS Computational Biology - Thu, 11/19/2015 - 17:00

by Satohiro Tajima, Toru Yanagawa, Naotaka Fujii, Taro Toyoizumi

Brain-wide interactions generating complex neural dynamics are considered crucial for emergent cognitive functions. However, the irreducible nature of nonlinear and high-dimensional dynamical interactions challenges conventional reductionist approaches. We introduce a model-free method, based on embedding theorems in nonlinear state-space reconstruction, that permits a simultaneous characterization of complexity in local dynamics, directed interactions between brain areas, and how the complexity is produced by the interactions. We demonstrate this method in large-scale electrophysiological recordings from awake and anesthetized monkeys. The cross-embedding method captures structured interaction underlying cortex-wide dynamics that may be missed by conventional correlation-based analysis, demonstrating a critical role of time-series analysis in characterizing brain state. The method reveals a consciousness-related hierarchy of cortical areas, where dynamical complexity increases along with cross-area information flow. These findings demonstrate the advantages of the cross-embedding method in deciphering large-scale and heterogeneous neuronal systems, suggesting a crucial contribution by sensory-frontoparietal interactions to the emergence of complex brain dynamics during consciousness.
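The abstract above builds on embedding theorems from nonlinear state-space reconstruction. As a rough illustration of that building block only (not the authors' cross-embedding method), the sketch below forms delay-coordinate vectors from a scalar time series. The function name `delay_embed` and its parameters `dim` and `lag` are hypothetical names chosen for this example.

```python
def delay_embed(series, dim, lag):
    """Return delay-coordinate vectors (x[t], x[t-lag], ..., x[t-(dim-1)*lag]).

    Illustrative sketch of state-space reconstruction by time delays;
    `dim` is the embedding dimension and `lag` the delay in samples.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag  # first index with a full history available
    return [
        tuple(series[t - k * lag] for k in range(dim))
        for t in range(start, len(series))
    ]


if __name__ == "__main__":
    # Each printed tuple is one reconstructed state vector.
    for point in delay_embed([0, 1, 2, 3, 4], dim=3, lag=1):
        print(point)
```

A series shorter than `(dim - 1) * lag + 1` samples yields an empty list, since no complete delay vector can be formed.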
Categories: Journal Articles

“Broadband” Bioinformatics Skills Transfer with the Knowledge Transfer Programme (KTP): Educational Model for Upliftment and Sustainable Development

PLoS Computational Biology - Thu, 11/19/2015 - 17:00

by Emile R. Chimusa, Mamana Mbiyavanga, Velaphi Masilela, Judit Kumuthini

A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The “omics” fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.
Categories: Journal Articles

A Bio-inspired Collision Avoidance Model Based on Spatial Information Derived from Motion Detectors Leads to Common Routes

PLoS Computational Biology - Thu, 11/19/2015 - 17:00

by Olivier J. N. Bertrand, Jens P. Lindemann, Martin Egelhaaf

Avoiding collisions is one of the most basic needs of any mobile agent, both biological and technical, when searching around or aiming toward a goal. We propose a model of collision avoidance inspired by behavioral experiments on insects and by properties of optic flow on a spherical eye experienced during translation, and test the interaction of this model with goal-driven behavior. Insects, such as flies and bees, actively separate the rotational and translational optic flow components via behavior, i.e. by employing a saccadic strategy of flight and gaze control. Optic flow experienced during translation, i.e. during intersaccadic phases, contains information on the depth-structure of the environment, but this information is entangled with that on self-motion. Here, we propose a simple model to extract the depth structure from translational optic flow by using local properties of a spherical eye. On this basis, a motion direction of the agent is computed that ensures collision avoidance. Flying insects are thought to measure optic flow by correlation-type elementary motion detectors. Their responses depend, in addition to velocity, on the texture and contrast of objects and, thus, do not measure the velocity of objects veridically. Therefore, we initially used geometrically determined optic flow as input to a collision avoidance algorithm to show that depth information inferred from optic flow is sufficient to account for collision avoidance under closed-loop conditions. Then, the collision avoidance algorithm was tested with bio-inspired correlation-type elementary motion detectors in its input. Even then, the algorithm led successfully to collision avoidance and, in addition, replicated the characteristics of collision avoidance behavior of insects. Finally, the collision avoidance algorithm was combined with a goal direction and tested in cluttered environments. 
The simulated agent then showed goal-directed behavior reminiscent of components of the navigation behavior of insects.
Categories: Journal Articles

Electrochemical Imaging and Redox Interrogation of Surface Defects on Operating SrTiO3 Photoelectrodes

Journal of American Chemical Society - Thu, 11/19/2015 - 16:03

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b10256
Categories: Journal Articles

Complex Surface Diffusion Mechanisms of Cobalt Phthalocyanine Molecules on Ag(100)

Journal of American Chemical Society - Thu, 11/19/2015 - 15:59

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b08001
Categories: Journal Articles

In Situ Study of Fe3Pt–Fe2O3 Core–Shell Nanoparticle Formation

Journal of American Chemical Society - Thu, 11/19/2015 - 15:58

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b10076
Categories: Journal Articles

pH-Responsive Gas–Water–Solid Interface for Multiphase Catalysis

Journal of American Chemical Society - Thu, 11/19/2015 - 15:54

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b09790
Categories: Journal Articles

Electronically Stabilized Nonplanar Phenalenyl Radical and Its Planar Isomer

Journal of American Chemical Society - Thu, 11/19/2015 - 15:43

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b07959
Categories: Journal Articles

Substituent Effects in CH Hydrogen Bond Interactions: Linear Free Energy Relationships and Influence of Anions

Journal of American Chemical Society - Thu, 11/19/2015 - 15:42

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b08767
Categories: Journal Articles

Coordination-Driven Polymerization of Supramolecular Nanocages

Journal of American Chemical Society - Thu, 11/19/2015 - 08:09

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b10815
Categories: Journal Articles

Self-Assembled PbSe Nanowire:Perovskite Hybrids

Journal of American Chemical Society - Thu, 11/19/2015 - 08:08

Journal of the American Chemical Society. DOI: 10.1021/jacs.5b10641
Categories: Journal Articles

An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

BMC Bioinformatics - Thu, 11/19/2015 - 07:00
Background: Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions.
Description: We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis.
Conclusions: This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Categories: Journal Articles

Amino acid alphabet reduction preserves fold information contained in contact interactions in proteins

Abstract

To reduce complexity, understand generalized rules of protein folding, and facilitate de novo protein design, the 20-letter amino acid alphabet is commonly reduced to a smaller alphabet by clustering amino acids based on some measure of similarity. In this work, we seek the optimal alphabet that preserves as much as possible of the structural information found in long-range (contact) interactions among amino acids in natively-folded proteins. We employ the Information Maximization Device, based on information theory, to partition the amino acids into well-defined clusters. Numbering from 2 to 19 groups, these optimal clusters of amino acids, while generated automatically, embody well-known properties of amino acids such as hydrophobicity/polarity, charge, size, and aromaticity, and are demonstrated to maintain the discriminative power of long-range interactions with minimal loss of mutual information. Our measurements suggest that reduced alphabets (of fewer than 10 letters) are able to capture virtually all of the information residing in native contacts and may be sufficient for fold recognition, as demonstrated by extensive threading tests. In an expansive survey of the literature, we observe that alphabets derived from various approaches (including those derived from physicochemical intuition, local structure considerations, and sequence alignments of remote homologs) fare consistently well in preserving contact interaction information, highlighting a convergence in the various factors thought to be relevant to the folding code. Moreover, we find that alphabets commonly used in experimental protein design are nearly optimal and are largely coherent with observations that have arisen in this work. Proteins 2015; 83:2198–2216. © 2015 Wiley Periodicals, Inc.

Categories: Journal Articles

Exploring interaction mechanisms of the inhibitor binding to the VP35 IID region of Ebola virus by all atom molecular dynamics simulation method

Abstract

Ebola viruses (EBOVs) cause an acute and serious illness that is often fatal if untreated, and no effective vaccine is available to date. The multifunctional protein VP35 is critical for viral replication, RNA silencing suppression, and nucleocapsid formation, and it is considered a future target for molecular biology techniques. In the present work, the binding of the pyrrole-based inhibitor GA017 to wild-type (WT), single-mutant (K248A, K251A, and I295A), and double-mutant (K248A/I295A) VP35 was investigated by all-atom molecular dynamics (MD) simulations and Molecular Mechanics Generalized Born Surface Area (MM/GBSA) energy calculations. The calculated results indicate that binding of GA017 makes the binding pocket more stable and reduces the space of the pocket. Moreover, the electrostatic interactions (ΔEele) and van der Waals energy (ΔEvdw) are the major driving forces for binding affinity, and the single mutation I295A and the double mutation K248A/I295A strongly influence the conformation of the VP35 binding pocket. Interestingly, the residues R300-G301-D302 of I295A form a new helix, and the sheet formed by the residues V294-I295-H296-I297 disappears in the double mutant K248A/I295A as compared with WT. Moreover, the binding free energy calculations show that the I295A and K248A/I295A mutations decrease the absolute binding free energy, while the K248A and K251A mutations increase it. Our calculated results are in good agreement with the experimental finding that the K248A/I295A double mutant results in near-complete loss of compound binding. The obtained information will be useful for designing effective inhibitors for treating Ebola virus. Proteins 2015; 83:2263–2278. © 2015 Wiley Periodicals, Inc.

Categories: Journal Articles

Exact Algorithms for Minimum Weighted Dominating Induced Matching

Algorithmica - Thu, 11/19/2015 - 00:00
Abstract

Say that an edge of a graph G dominates itself and every other edge sharing a vertex with it. An edge dominating set of a graph \(G=(V,E)\) is a subset of edges \(E' \subseteq E\) which dominates all edges of G. In particular, if every edge of G is dominated by exactly one edge of \(E'\) then \(E'\) is a dominating induced matching. It is known that not every graph admits a dominating induced matching, and the problem of deciding whether a graph admits one is NP-complete. In this paper we consider the problems of counting the number of dominating induced matchings and finding a minimum weighted dominating induced matching, if any, of a graph with weighted edges. We describe three exact algorithms for general graphs. The first runs in linear time for a given vertex dominating set of fixed size of the graph. The second runs in polynomial time if the graph admits a polynomial number of maximal independent sets. The third is an \(O^*(1.1939^n)\)-time, polynomial (linear) space algorithm, which improves over the existing algorithms for exactly solving this problem in general graphs.
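The definition above can be checked directly on small graphs. The following is a minimal brute-force sketch, not any of the three algorithms described in the paper: it tests whether every edge is dominated by exactly one chosen edge, and enumerates all edge subsets to find a minimum-weight dominating induced matching. The function names and the dictionary-based graph representation are assumptions made for this illustration, and the search is exponential in the number of edges.

```python
from itertools import combinations


def is_dominating_induced_matching(edges, chosen):
    """True if every edge is dominated by exactly one edge of `chosen`.

    An edge dominates itself and every edge sharing a vertex with it.
    """
    chosen_sets = [frozenset(c) for c in chosen]
    for e in edges:
        e_set = frozenset(e)
        dominators = sum(1 for c in chosen_sets if e_set & c)
        if dominators != 1:
            return False
    return True


def min_weight_dim(weights):
    """Brute-force search over all edge subsets (tiny graphs only).

    `weights` maps each edge (a 2-tuple of vertices) to its weight.
    Returns (total_weight, set_of_edges), or None if no DIM exists.
    """
    edges = list(weights)
    best = None
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            if is_dominating_induced_matching(edges, subset):
                total = sum(weights[e] for e in subset)
                if best is None or total < best[0]:
                    best = (total, set(subset))
    return best


if __name__ == "__main__":
    triangle = {("a", "b"): 3, ("b", "c"): 1, ("c", "a"): 2}
    print(min_weight_dim(triangle))
    four_cycle = {(1, 2): 1, (2, 3): 1, (3, 4): 1, (4, 1): 1}
    print(min_weight_dim(four_cycle))  # a 4-cycle admits no DIM
```

In a triangle any single edge is a dominating induced matching, so the search returns the lightest edge; the 4-cycle illustrates the statement that not every graph admits one.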

Categories: Journal Articles

Evaluation of Monotone DNF Formulas

Algorithmica - Thu, 11/19/2015 - 00:00
Abstract

Stochastic boolean function evaluation (SBFE) is the problem of determining the value of a given boolean function f on an unknown input x, when each bit \(x_i\) of x can only be determined by paying a given associated cost \(c_i\). Further, x is drawn from a given product distribution: for each \(x_i\), \(\mathbf{Pr}[x_i=1] = p_i\) and the bits are independent. The goal is to minimize the expected cost of evaluation. In this paper, we study the complexity of the SBFE problem for classes of DNF formulas. We consider both exact and approximate versions of the problem for subclasses of DNF, for arbitrary costs and product distributions, and for unit costs and/or the uniform distribution.
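To make the objective concrete, the sketch below computes the expected cost of one simple evaluation strategy for a monotone DNF: test the bits in a fixed order and stop as soon as the value of f is determined. It is an illustration of the problem setup only, not an algorithm from the paper. The representation of a formula as a list of sets of variable indices, and the names `value_if_known` and `expected_cost`, are assumptions made for this example; the enumeration over all inputs is exponential and meant for tiny formulas.

```python
from itertools import product


def value_if_known(terms, known):
    """Return 1 or 0 if the tested bits in `known` determine f, else None.

    `terms` is a monotone DNF: a list of sets of variable indices,
    where f = OR over terms of (AND of the term's variables).
    """
    if any(all(known.get(i) == 1 for i in term) for term in terms):
        return 1  # some term is fully satisfied
    if all(any(known.get(i) == 0 for i in term) for term in terms):
        return 0  # every term contains a bit known to be 0
    return None


def expected_cost(terms, costs, probs, order):
    """Expected cost of testing bits in `order` until f is determined.

    costs[i] is the price of reading bit i; probs[i] = Pr[x_i = 1],
    with bits independent (a product distribution).
    """
    total = 0.0
    for x in product((0, 1), repeat=len(costs)):
        prob = 1.0
        for bit, p in zip(x, probs):
            prob *= p if bit else 1.0 - p
        known, paid = {}, 0.0
        for i in order:
            if value_if_known(terms, known) is not None:
                break
            known[i] = x[i]
            paid += costs[i]
        total += prob * paid
    return total


if __name__ == "__main__":
    # f = x0 AND x1, with x1 ten times as expensive to read as x0.
    terms, costs, probs = [{0, 1}], [1.0, 10.0], [0.5, 0.5]
    print(expected_cost(terms, costs, probs, order=[0, 1]))
    print(expected_cost(terms, costs, probs, order=[1, 0]))
```

Comparing the two orders shows why the testing order matters: reading the cheap bit first often settles the value before the expensive bit has to be paid for.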

Categories: Journal Articles

An heuristic filtering tool to identify phenotype-associated genetic variants applied to human intellectual disability and canine coat colors

BMC Bioinformatics - Wed, 11/18/2015 - 19:00
Background: Identification of one or several disease causing variant(s) from the large collection of variants present in an individual is often achieved by the sequential use of heuristic filters. The recent development of whole exome sequencing enrichment designs for several non-model species created the need for a species-independent, fast and versatile analysis tool, capable of tackling a wide variety of standard and more complex inheritance models. With this aim, we developed “Mendelian”, an R-package that can be used for heuristic variant filtering. Results: The R-package Mendelian offers fast and convenient filters to analyze putative variants for both recessive and dominant models of inheritance, with variable degrees of penetrance and detectance. Analysis of trios is supported. Filtering against variant databases and annotation of variants is also included. This package is not species specific and supports parallel computation. We validated this package by reanalyzing data from a whole exome sequencing experiment on intellectual disability in humans. In a second example, we identified the mutations responsible for coat color in the dog. This is the first example of whole exome sequencing without prior mapping in the dog. Conclusion: We developed an R-package that enables the identification of disease-causing variants from the long list of variants called in sequencing experiments. The software and a detailed manual are available at https://github.com/BartBroeckx/Mendelian.
Categories: Journal Articles

Distribution of single nucleotide variants on protein-protein interaction sites and its relation to minor allele frequency

Protein Science - Wed, 11/18/2015 - 18:44
Abstract

Recent advances in DNA sequencing techniques have identified rare single nucleotide variants with less than 1% minor allele frequency. Despite the growing interest in and physiological importance of rare variants in genome sciences, less attention has been paid to the allele frequency of variants in protein sciences. To elucidate the characteristics of genetic variants on protein interaction sites, from the viewpoints of the allele frequency and the structural position of variants, we mapped about 20,000 human SNVs onto protein complexes. We found that variants are less abundant in protein interfaces, and specifically in the core regions of interfaces. The tendency to “avoid” the interfacial core is stronger among common variants than rare variants. In terms of amino acid substitutions, the trend of mutating amino acids among rare variants is consistent across different interfacial regions, reflecting the fact that rare variants result from random mutations in DNA sequences, whereas amino acid changes of common variants vary between the interfacial core and rim regions, possibly due to functional constraints on proteins. This study illustrated how the allele frequency of variants relates to the protein structural regions and the functional sites in general, and will lead to a deeper understanding of the potential deleteriousness of rare variants at the structural level. Exceptions to the observed trends will shed light on the limitations of structural approaches to evaluating the functional impacts of variants.

Categories: Journal Articles