Nucleic Acids Research
The conserved 3'X terminal domain of hepatitis C virus genomic RNA forms a two-stem structure that promotes viral RNA dimerization
The 3'X domain of hepatitis C virus is a strongly conserved structure located at the 3' terminus of the viral genomic RNA. This domain modulates the replication and translation processes of the virus in conjunction with an upstream 5BSL3.2 stem–loop, and contains a palindromic sequence that facilitates RNA dimerization. Based on nuclear magnetic resonance spectroscopy and gel electrophoresis, we report here that domain 3'X adopts a structure composed of two stem–loops, and not three hairpins or a mixture of folds, as previously proposed. This structure exposes unpaired terminal nucleotides after a double-helical stem and palindromic bases in an apical loop, favoring genomic RNA replication and self-association. At higher ionic strength the domain forms homodimers comprising an intermolecular duplex of 110 nucleotides. The 3'X sequences can alternatively form heterodimers with 5BSL3.2. This contact, reported to favor translation, likely involves local melting of one of the 3'X stem–loops.
The FMRP/GRK4 mRNA interaction uncovers a new mode of binding of the Fragile X mental retardation protein in cerebellum
Fragile X syndrome (FXS), the most common form of inherited intellectual disability, is caused by the silencing of the FMR1 gene encoding an RNA-binding protein (FMRP) mainly involved in translational control. We characterized the interaction between FMRP and the mRNA of GRK4, a member of the guanine nucleotide-binding protein (G protein)-coupled receptor kinase super-family, both in vitro and in vivo. While the mRNA level of GRK4 is unchanged in the absence or in the presence of FMRP in different regions of the brain, the GRK4 protein level is increased in Fmr1-null cerebellum, suggesting that FMRP negatively modulates the expression of GRK4 at the translational level in this brain region. The C-terminal region of FMRP interacts with a domain of GRK4 mRNA, which we called G4RIF, that folds into four stem loops. The SL1 stem loop of G4RIF is protected by FMRP and is part of the S1/S2 sub-domain that directs translation repression of a reporter mRNA by FMRP. These data confirm the role of the G4RIF/FMRP complex in translational regulation. Considering the role of GRK4 in GABAB receptor desensitization, our results suggest that increased GRK4 levels in FXS might contribute to cerebellum-dependent phenotypes through deregulated desensitization of GABAB receptors.
Hexameric helicases are processive DNA unwinding machines, but how they engage with a replication fork during unwinding is unknown. Using electron microscopy and single particle analysis we determined structures of the intact hexameric helicase E1 from papillomavirus and two complexes of E1 bound to a DNA replication fork end-labelled with protein tags. By labelling a DNA replication fork with streptavidin (dsDNA end) and Fab (5' ssDNA) we located the positions of these labels on the helicase surface, showing that at least 10 bp of dsDNA enter the E1 helicase via a side tunnel. In the currently accepted ‘steric exclusion’ model for dsDNA unwinding, the active 3' ssDNA strand is pulled through a central tunnel of the helicase motor domain as the dsDNA strands are wedged apart outside the protein assembly. Our structural observations together with nuclease footprinting assays indicate otherwise: strand separation takes place inside E1 in a chamber above the helicase domain, and the 5' passive ssDNA strand exits the assembly through a separate tunnel opposite to the dsDNA entry point. Our data therefore suggest an alternative to the current general model for DNA unwinding by hexameric helicases.
Structure and primase-mediated activation of a bacterial dodecameric replicative helicase
Replicative helicases are essential ATPases that unwind DNA to initiate chromosomal replication. While bacterial replicative DnaB helicases are hexameric, Helicobacter pylori DnaB (HpDnaB) was found to form double hexamers, similar to some archaeal and eukaryotic replicative helicases. Here we present a structural and functional analysis of HpDnaB protein during primosome formation. The crystal structure of the HpDnaB at 6.7 Å resolution reveals a dodecameric organization consisting of two hexamers assembled via their N-terminal rings in a stack-twisted mode. Using fluorescence anisotropy we show that HpDnaB dodecamer interacts with single-stranded DNA in the presence of ATP but has a low DNA unwinding activity. Multi-angle light scattering and small angle X-ray scattering demonstrate that interaction with the DnaG primase helicase-binding domain dissociates the helicase dodecamer into single ringed primosomes. Functional assays on the proteins and associated complexes indicate that these single ringed primosomes are the most active form of the helicase for ATP hydrolysis, DNA binding and unwinding. These findings shed light onto an activation mechanism of HpDnaB by the primase that might be relevant in other bacteria and possibly other organisms exploiting dodecameric helicases for DNA replication.
Microcalorimetric studies of DNA duplexes and their component single strands showed that association enthalpies of unfolded complementary strands into completely folded duplexes increase linearly with temperature and do not depend on salt concentration, i.e. duplex formation results in a constant heat capacity decrement, identical for CG and AT pairs. Although duplex thermostability increases with CG content, the enthalpic and entropic contributions of an AT pair to duplex formation exceed those of a CG pair when compared at the same temperature. The reduced contribution of AT pairs to duplex stabilization comes not from their lower enthalpy, as previously supposed, but from their larger entropy contribution. This larger enthalpy, and particularly the greater entropy, result from water fixed by the AT pair in the minor groove. As the increased entropy of an AT pair exceeds that of melting ice, the water molecule fixed by this pair must affect those of its neighbors. Water in the minor groove is, thus, orchestrated by the arrangement of AT groups, i.e. is context dependent. In contrast, water hydrating exposed nonpolar surfaces of bases is responsible for the heat capacity increment on dissociation and, therefore, for the temperature dependence of all thermodynamic characteristics of the double helix.
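The temperature dependence described above follows from standard thermodynamics when the heat capacity change is constant. A minimal sketch for duplex dissociation, where the reference temperature T_0 is notation introduced here and not taken from the abstract, and \Delta C_p > 0 is the constant heat capacity increment on dissociation:

```latex
\begin{align}
\Delta H(T) &= \Delta H(T_0) + \Delta C_p\,(T - T_0) \\
\Delta S(T) &= \Delta S(T_0) + \Delta C_p \ln\frac{T}{T_0} \\
\Delta G(T) &= \Delta H(T) - T\,\Delta S(T)
\end{align}
```

With a constant \Delta C_p the enthalpy is linear in temperature, as reported, while the entropy varies logarithmically; for association all three quantities change sign.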
Solution structure of a DNA quadruplex containing ALS and FTD related GGGGCC repeat stabilized by 8-bromodeoxyguanosine substitution
A prolonged expansion of the GGGGCC repeat within the non-coding region of the C9orf72 gene has been identified as the most common cause of familial amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), which are devastating neurodegenerative disorders. Formation of unusual secondary structures within the expanded GGGGCC repeat, including DNA and RNA G-quadruplexes and R-loops, was proposed to drive ALS and FTD pathogenesis. Initial NMR investigation of DNA oligonucleotides with four repeat units, the shortest model able to form a unimolecular G-quadruplex, indicated their folding into multiple G-quadruplex structures in the presence of K+ ions. A single dG to 8Br-dG substitution at position 21 in oligonucleotide d[(G4C2)3G4] and careful optimization of folding conditions enabled formation of mostly a single G-quadruplex species, which allowed determination of a high-resolution structure with NMR. The G-quadruplex structure adopted by d[(G4C2)3GGBrGG] is composed of four G-quartets, which are connected by three edgewise C-C loops. All four strands adopt antiparallel orientation to one another and have alternating syn-anti progression of glycosidic conformation of guanine residues. One of the cytosines in every loop is stacked upon the G-quartet, contributing to a very compact and stable structure.
Structural basis for selective targeting of leishmanial ribosomes: aminoglycoside derivatives as promising therapeutics
Leishmaniasis comprises an array of diseases caused by pathogenic species of Leishmania, resulting in a spectrum of mild to life-threatening pathologies. Currently available therapies for leishmaniasis include a limited selection of drugs. This, coupled with the rather fast emergence of parasite resistance, presents a dire public health concern. Paromomycin (PAR), a broad-spectrum aminoglycoside antibiotic, has been shown in recent years to be highly efficient in treating visceral leishmaniasis (VL), the life-threatening form of the disease. While much focus has been given to exploration of PAR activities in bacteria, its mechanism of action in Leishmania has received relatively little scrutiny and has yet to be fully deciphered. In the present study we present an X-ray structure of PAR bound to an rRNA model mimicking its leishmanial binding target, the ribosomal A-site. We also evaluate PAR inhibitory actions on leishmanial growth and ribosome function, as well as effects on auditory sensory cells, by comparing several structurally related natural and synthetic aminoglycoside derivatives. The results provide insights into the structural elements important for aminoglycoside inhibitory activities and selectivity for leishmanial cytosolic ribosomes, highlighting a novel synthetic derivative, compound 3, as a prospective therapeutic candidate for the treatment of VL.
LARVA: an integrative framework for large-scale analysis of recurrent variants in noncoding annotations
In cancer research, background models for mutation rates have been extensively calibrated in coding regions, leading to the identification of many driver genes, recurrently mutated more than expected. Noncoding regions are also associated with disease; however, background models for them have not been investigated in as much detail. This is partially due to limited noncoding functional annotation. Also, great mutation heterogeneity and potential correlations between neighboring sites give rise to substantial overdispersion in mutation count, resulting in problematic background rate estimation. Here, we address these issues with a new computational framework called LARVA. It integrates variants with a comprehensive set of noncoding functional elements, modeling the mutation counts of the elements with a β-binomial distribution to handle overdispersion. LARVA, moreover, uses regional genomic features such as replication timing to better estimate local mutation rates and mutational hotspots. We demonstrate LARVA's effectiveness on 760 whole-genome tumor sequences, showing that it identifies well-known noncoding drivers, such as mutations in the TERT promoter. Furthermore, LARVA highlights several novel highly mutated regulatory sites that could potentially be noncoding drivers. We make LARVA available as a software tool and release our highly mutated annotations as an online resource (larva.gersteinlab.org).
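The effect of modeling overdispersed mutation counts with a β-binomial rather than a binomial distribution can be illustrated with a small sketch. This is not LARVA's implementation: the function names, the mean rate (0.005), the concentration parameter (50) and the observed count are all hypothetical values chosen only to show how the upper-tail probability changes.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def betabinom_pmf(k, n, alpha, beta):
    """P(X = k) for X ~ BetaBinomial(n, alpha, beta)."""
    return exp(log_choose(n, k)
               + log_beta(k + alpha, n - k + beta)
               - log_beta(alpha, beta))


def betabinom_sf(k, n, alpha, beta):
    """Upper-tail probability P(X >= k) under the beta-binomial model."""
    return min(1.0, sum(betabinom_pmf(i, n, alpha, beta)
                        for i in range(k, n + 1)))


def binom_sf(k, n, p):
    """Upper-tail probability P(X >= k) under a plain binomial model."""
    return min(1.0, sum(exp(log_choose(n, i) + i * log(p)
                            + (n - i) * log(1.0 - p))
                        for i in range(k, n + 1)))


if __name__ == "__main__":
    # Hypothetical element: 1000 sample-by-site trials, 15 observed mutations.
    n, observed = 1000, 15
    mean_rate, concentration = 0.005, 50.0   # assumed background parameters
    alpha = mean_rate * concentration
    beta = (1.0 - mean_rate) * concentration
    print("binomial p-value:     ", binom_sf(observed, n, mean_rate))
    print("beta-binomial p-value:", betabinom_sf(observed, n, alpha, beta))
```

Both models share the same mean rate here, but the β-binomial lets the rate vary between elements, so the same count looks less extreme and fewer elements are called as significantly mutated.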
A key aspect of RNA secondary structure prediction is the identification of novel functional elements. This is a challenging task because these elements typically are embedded in longer transcripts where the borders between the element and flanking regions have to be defined. The flanking sequences impact the folding of the functional elements both at the level of computational analyses and when the element is extracted as a transcript for experimental analysis. Here, we analyze how different flanking region lengths impact folding into a constrained structure by computing probabilities of folding for different sizes of flanking regions. Our method, RNAcop (RNA context optimization by probability), is tested on known and de novo predicted structures. In vitro experiments support the computational analysis and suggest that for a number of structures, choosing proper lengths of flanking regions is critical. RNAcop is available as a web server and stand-alone software via http://rth.dk/resources/rnacop.
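One way to picture the probability of folding into a constrained structure is as a ratio of partition functions, which can be written in terms of ensemble free energies. The sketch below assumes those free energies (in kcal/mol, with and without the structural constraint) have already been computed by some partition-function folding program; the function names and the numbers in the example are hypothetical, and this is not RNAcop's actual code.

```python
from math import exp

# Gas constant (kcal/(mol*K)) times temperature (37 degrees C in kelvin).
RT = 0.0019872 * 310.15


def constraint_probability(g_constrained, g_unconstrained):
    """Probability that the sequence satisfies the structural constraint.

    Equals Z_constrained / Z_total = exp(-(G_c - G) / RT), where G_c and G
    are the ensemble free energies with and without the constraint.
    G_c >= G always holds, so the result lies in (0, 1].
    """
    return exp(-(g_constrained - g_unconstrained) / RT)


def best_flanks(energies):
    """Pick the flank lengths whose construct best preserves the structure.

    energies maps (left_flank, right_flank) -> (g_constrained, g_unconstrained).
    Returns the best key and the dictionary of all probabilities.
    """
    scored = {flanks: constraint_probability(gc, gu)
              for flanks, (gc, gu) in energies.items()}
    best = max(scored, key=scored.get)
    return best, scored


if __name__ == "__main__":
    # Hypothetical ensemble free energies for three flank choices.
    example = {
        (0, 0): (-18.0, -20.0),
        (10, 10): (-24.5, -25.0),
        (20, 20): (-27.0, -31.0),
    }
    best, scored = best_flanks(example)
    for flanks, p in sorted(scored.items()):
        print(flanks, round(p, 3))
    print("best flanks:", best)
```

Longer flanks lower both free energies, so the comparison is made on the difference between them, not on the absolute stability of each construct.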
Local sequence assembly reveals a high-resolution profile of somatic structural variations in 97 cancer genomes
Genomic structural variations (SVs) are pervasive in many types of cancers. Characterizing their underlying mechanisms and potential molecular consequences is crucial for understanding the basic biology of tumorigenesis. Here, we engineered a local assembly-based algorithm (laSV) that detects SVs with high accuracy from paired-end high-throughput genomic sequencing data and pinpoints their breakpoints at single base-pair resolution. By applying laSV to 97 tumor-normal paired genomic sequencing datasets across six cancer types produced by The Cancer Genome Atlas Research Network, we discovered that non-allelic homologous recombination is the primary mechanism for generating somatic SVs in acute myeloid leukemia. This finding contrasts with results for the other five cancer types, all solid tumors, in which non-homologous end joining and microhomology end joining are the predominant mechanisms. We also found that the genes recurrently mutated by single nucleotide alterations differed from the genes recurrently mutated by SVs, suggesting that these two types of genetic alterations play different roles during cancer progression. We further characterized how the gene structures of the oncogene JAK1 and the tumor suppressors KDM6A and RB1 are affected by somatic SVs and discussed the potential functional implications of intergenic SVs.
Cis-acting signals modulate the efficiency of programmed DNA elimination in Paramecium tetraurelia
In Paramecium, the regeneration of a functional somatic genome at each sexual event relies on the elimination of thousands of germline DNA sequences, known as Internal Eliminated Sequences (IESs), from the zygotic nuclear DNA. Here, we provide evidence that IESs’ length and sub-terminal bases jointly modulate IES excision by affecting DNA conformation in P. tetraurelia. Our study reveals an excess of complementary base pairing between IESs’ sub-terminal and contiguous sites, suggesting that IESs may form DNA loops prior to cleavage. The degree of complementary base pairing between IESs’ sub-terminal sites (termed Cin-score) is positively associated with IES length and is shaped by natural selection. Moreover, it escalates abruptly when IES length exceeds 45 nucleotides (nt), indicating that only sufficiently large IESs may form loops. Finally, we find that IESs smaller than 46 nt are favored targets of the cellular surveillance systems, presumably because of their relatively inefficient excision. Our findings extend the repertoire of cis-acting determinants for IES recognition/excision and provide unprecedented insights into the distinct selective pressures that operate on IESs and somatic DNA regions. This information potentially moves current models of IES evolution and of mechanisms of IES recognition/excision forward.
Cross-talk between competitive endogenous RNAs (ceRNAs) through shared miRNAs represents a novel layer of gene regulation that plays important roles in the physiology and development of cancers. However, a global view of their system-level properties across various types of cancers is still lacking. Here, we constructed the mRNA-related ceRNA–ceRNA interaction landscape across 20 cancer types by systematically analyzing molecular profiles of 5203 tumors and miRNA regulations. Our study highlights conserved features shared across cancers and higher similarity among cancers with a similar cell type of origin. Moreover, a core ceRNA network was identified. Functional analysis identified a common theme of cancer hallmarks; however, the hallmarks exhibit phenotype-specific connectivity patterns. In addition, we found a marked rewiring of the ceRNA program between various cancers, and further revealed conserved and rewired ceRNA network hubs in each cancer, which engage in intensely competitive interactions that constitute conserved and cancer-specific modules. By providing mechanistic linkage between known cancer miRNAs, their mediated ceRNA–ceRNA interactions, and the associations with known cancer hallmarks, the inferred cancer ceRNA–ceRNA interaction landscape will serve as a powerful public resource for further biological discoveries of tumorigenesis.
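The abstract does not state how candidate ceRNA pairs were scored. As an illustration only, one commonly used approach is a hypergeometric test asking whether two transcripts share more regulating miRNAs than expected by chance; the sketch below assumes that approach, and the function name, miRNA identifiers and universe size are hypothetical.

```python
from math import comb


def shared_mirna_pvalue(mirnas_a, mirnas_b, universe_size):
    """Hypergeometric upper-tail p-value for miRNA sharing between two RNAs.

    mirnas_a, mirnas_b: iterables of miRNA identifiers targeting each RNA.
    universe_size: total number of miRNAs considered (assumed known).
    Returns P(overlap >= observed overlap) under random sampling.
    """
    a, b = set(mirnas_a), set(mirnas_b)
    observed = len(a & b)
    size_a, size_b = len(a), len(b)
    total = comb(universe_size, size_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(comb(size_a, i) * comb(universe_size - size_a, size_b - i)
               for i in range(observed, min(size_a, size_b) + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical miRNA sets for two transcripts, out of 500 miRNAs in total.
    rna1 = {"miR-a", "miR-b", "miR-c", "miR-d", "miR-e"}
    rna2 = {"miR-c", "miR-d", "miR-e", "miR-f"}
    print(shared_mirna_pvalue(rna1, rna2, universe_size=500))
```

In practice such a sharing test is usually combined with an expression correlation filter across tumor samples, since shared miRNAs alone do not show that two transcripts actually compete.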
The transcription factors SOX9 and SOX5/SOX6 cooperate genome-wide through super-enhancers to drive chondrogenesis
SOX9 is a transcriptional activator required for chondrogenesis, and SOX5 and SOX6 are closely related DNA-binding proteins that critically enhance its function. Here we use genome-wide approaches to gain novel insights into the full spectrum of the target genes and modes of action of this chondrogenic trio. Using the RCS cell line as a faithful model for proliferating/early prehypertrophic growth plate chondrocytes, we find that SOX6 and SOX9 bind thousands of genomic sites, frequently and most efficiently near each other. SOX9 recognizes pairs of inverted SOX motifs, whereas SOX6 favors pairs of tandem SOX motifs. The SOX proteins primarily target enhancers. While binding to a small fraction of typical enhancers, they bind multiple sites on almost all super-enhancers (SEs) present in RCS cells. These SEs are predominantly linked to cartilage-specific genes. The SOX proteins effectively work together to activate these SEs and are required for in vivo expression of their associated genes. These genes encode key regulatory factors, including the SOX trio proteins, and all essential cartilage extracellular matrix components. Chst11, Fgfr3, Runx2 and Runx3 are among many other newly identified SOX trio targets. SOX9 and SOX5/SOX6 thus cooperate genome-wide, primarily through SEs, to implement the growth plate chondrocyte differentiation program.
Despite the increasing knowledge about DNA methylation, the understanding of human epigenome evolution is in its infancy. Using whole genome bisulfite sequencing we identified hundreds of differentially methylated regions (DMRs) in humans compared to non-human primates and estimated that ~25% of these regions were detectable throughout several human tissues. Human DMRs were enriched for specific histone modifications and the majority were located distal to transcription start sites, highlighting the importance of regions outside the direct regulatory context. We also found a significant excess of endogenous retrovirus elements in human-specific hypomethylated regions.
We report for the first time a close interplay between inter-species genetic and epigenetic variation in regions of incomplete lineage sorting, transcription factor binding sites and human differentially hypermethylated regions. Specifically, we observed an excess of human-specific substitutions in transcription factor binding sites located within human DMRs, suggesting that alteration of regulatory motifs underlies some human-specific methylation patterns. We also found that the acquisition of DNA hypermethylation in the human lineage is frequently coupled with rapid evolution at the nucleotide level in the neighborhood of these CpG sites. Taken together, our results reveal new insights into the mechanistic basis of human-specific DNA methylation patterns and the interpretation of inter-species non-coding variation.
Recent studies strongly suggest that in bacterial cells the order of genes along the chromosomal origin-to-terminus axis is determinative for regulation of growth phase-dependent gene expression. The prediction from this observation is that positional displacement of pleiotropic genes will affect the genetic regulation and hence the cellular phenotype. To test this prediction we inserted the origin-proximal dusB-fis operon encoding the global regulator FIS in the vicinity of the replication terminus on both arms of the Escherichia coli chromosome. We found that the lower fis gene dosage in the strains with terminus-proximal dusB-fis operons was compensated by increased fis expression such that the intracellular concentration of FIS was homeostatically adjusted. Nevertheless, despite unchanged FIS levels, the positional displacement of dusB-fis impaired the competitive growth fitness of cells and altered the state of the overarching network regulating DNA topology, as well as the cellular response to environmental stress, hazardous substances and antibiotics. Our finding that the chromosomal repositioning of a regulatory gene can determine the cellular phenotype unveils an important yet unexplored facet of the genetic control mechanisms and paves the way for novel approaches to manipulate bacterial physiology.
ZNF555 protein binds to transcriptional activator site of 4qA allele and ANT1: potential implication in Facioscapulohumeral dystrophy
Facioscapulohumeral dystrophy (FSHD) is an epi/genetic satellite disease associated with at least two satellite sequences in 4q35: (i) D4Z4 macrosatellite and (ii) β-satellite repeats (BSR), a prevalent part of the 4qA allele. Most of the recent FSHD studies have focused on a DUX4 transcript inside D4Z4 and its tandem contraction in FSHD patients. However, the D4Z4 contraction alone is not pathological; pathology also requires the 4qA allele. Since little is known about BSR, we investigated the functional role of the 4qA BSR in the transcriptional control of the FSHD region 4q35. We have shown that an individual BSR possesses enhancer activity leading to activation of the Adenine Nucleotide Translocator 1 gene (ANT1), a major FSHD candidate gene. We have identified ZNF555, a previously uncharacterized protein, as a putative transcription factor highly expressed in human primary myoblasts that interacts with the BSR enhancer site and impacts the ANT1 promoter activity in FSHD myoblasts. The discovery of the functional role of the 4qA allele and ZNF555 in the transcriptional control of ANT1 advances our understanding of FSHD pathogenesis and provides potential therapeutic targets.
Deciphering the principles that govern mutually exclusive expression of Plasmodium falciparum clag3 genes
The product of the Plasmodium falciparum genes clag3.1 and clag3.2 plays a fundamental role in malaria parasite biology by determining solute transport into infected erythrocytes. Expression of the two clag3 genes is mutually exclusive, such that a single parasite expresses only one of the two genes at a time. Here we investigated the properties and mechanisms of clag3 mutual exclusion using transgenic parasite lines with extra copies of clag3 promoters located either in stable episomes or integrated in the parasite genome. We found that the additional clag3 promoters in these transgenic lines are silenced by default, but under strong selective pressure parasites with more than one clag3 promoter simultaneously active are observed, demonstrating that clag3 mutual exclusion is strongly favored but not strict. We show that silencing of clag3 genes is associated with the repressive histone mark H3K9me3 even in parasites with unusual clag3 expression patterns, and we provide direct evidence for heterochromatin spreading in P. falciparum. We also found that expression of a neighboring ncRNA correlates with clag3.1 expression. Altogether, our results reveal a scenario where fitness costs and non-deterministic molecular processes that favor mutual exclusion shape the expression patterns of this important gene family.