PLOS One

Publishing science


Genome-Wide Identification of SSR and SNP Markers Based on Whole-Genome Re-Sequencing of a Thailand Wild Sacred Lotus (Nelumbo nucifera)

Wed, 11/25/2015 - 17:00

by Jihong Hu, Songtao Gui, Zhixuan Zhu, Xiaolei Wang, Weidong Ke, Yi Ding

Genomic resources such as single nucleotide polymorphisms (SNPs), insertions and deletions (InDels) and simple sequence repeats (SSRs) are essential for crop improvement and better utilization in genetic breeding. However, the resources for the sacred lotus (Nelumbo nucifera Gaertn.) are still limited. In the present study, to develop large-scale genomic molecular marker resources for sacred lotus, we re-sequenced a Thailand sacred lotus cultivar ‘Chiang Mai wild lotus’ and compared it with the reported lotus genome ‘Middle lake wild lotus’. A total of 3,180,059 SNPs, 328,251 InDels and 14,191 structural variants (SVs) were found between the two genomes. The functional impact analyses of these SNPs indicated that they may be involved in metabolic processes, binding, catalytic activity, etc. Mining the genome sequences identified 191,657 SSRs, at a frequency of one SSR per 4.23 kb, and 103,656 SSR primer pairs were designed. Furthermore, 14,502 EST-SSRs were also identified using the available RNA-seq data in the NCBI. A subset of 150 SSRs (genomic and EST-SSRs) was randomly selected for validation and genetic diversity analysis. The genotypes could be easily distinguished using these SSR markers and the ‘Chiang Mai wild lotus’ was clearly differentiated from the other Chinese accessions. This study provides considerable amounts of genomic resources and markers for the quantitative trait locus (QTL) identification and molecular selection of the species, which could have a potential role in various applications in sacred lotus breeding.
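As a rough illustration of what mining a sequence for SSRs involves, the Python sketch below scans a DNA string for perfect tandem repeats with a regular expression. The function names, the 1 to 6 bp motif range, and the minimum repeat counts are illustrative assumptions; the abstract does not state which software or thresholds the authors used.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the study).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "ATAT").
            if any(
                motif_len % k == 0 and motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


def kb_per_ssr(n_ssrs, sequence_bp):
    """Average spacing between SSRs, in kb of sequence per SSR."""
    return sequence_bp / n_ssrs / 1000.0
```

For example, `find_ssrs("GG" + "AT" * 7 + "CC")` reports a single dinucleotide repeat starting at offset 2, and `kb_per_ssr` expresses a count as a density of the "one SSR per N kb" kind quoted above.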

Functional Cross-Talking between Differentially Expressed and Alternatively Spliced Genes in Human Liver Cancer Cells Treated with Berberine

Wed, 11/25/2015 - 17:00

by Zhen Sheng, Yi Sun, Ruixin Zhu, Na Jiao, Kailin Tang, Zhiwei Cao, Chao Ma

Berberine has been shown to have anti-proliferative effects on various cancer cells. Many researchers have tried to elucidate the anti-cancer mechanisms of berberine based on differentially expressed genes. However, genes that are differentially alternatively spliced in response to berberine might also contribute to its pharmacological actions, and these have not yet been reported. Moreover, the potential functional cross-talk between the two sets of genes deserves further exploration. In this study, RNA-seq technology was used to detect the differentially expressed genes and differentially alternatively spliced genes in BEL-7402 cancer cells induced by berberine. Functional enrichment analysis indicated that these genes were mainly enriched in the p53 and cell cycle signalling pathways. In addition, statistical tests showed that the two sets of genes were locally co-enriched along chromosomes, closely connected to each other based on protein-protein interaction, and functionally similar on the Gene Ontology tree. These results suggest that the two sets of genes regulated by berberine might cross-talk functionally and jointly contribute to its cell cycle arresting effect. They provide new clues for further research on the pharmacological mechanisms of berberine as well as other botanical drugs.

Transcriptome Analysis Revealed Highly Expressed Genes Encoding Secondary Metabolite Pathways and Small Cysteine-Rich Proteins in the Sclerotium of Lignosus rhinocerotis

Wed, 11/25/2015 - 17:00

by Hui-Yeng Y. Yap, Yit-Heng Chooi, Shin-Yee Fung, Szu-Ting Ng, Chon-Seng Tan, Nget-Hong Tan

Lignosus rhinocerotis (Cooke) Ryvarden (tiger milk mushroom) has long been known for its nutritional and medicinal benefits among the local communities in Southeast Asia. However, the molecular and genetic basis of its medicinal and nutraceutical properties at the transcriptional level has not been investigated. In this study, the transcriptome of the L. rhinocerotis sclerotium, the part with medicinal value, was analyzed using the high-throughput Illumina HiSeq™ platform, with good sequencing quality and alignment results. A total of 3,673 alternative splicing events, 117 novel transcripts, and 59,649 SNP variations were found, enriching its current genome database. A large number of transcripts were expressed and involved in the processing of gene information and carbohydrate metabolism. A few highly expressed genes encoding the cysteine-rich cerato-platanin, hydrophobins, and sugar-binding lectins were identified and their possible roles in L. rhinocerotis were discussed. Genes encoding enzymes involved in the biosynthesis of glucans, six gene clusters encoding four terpene synthases and one each of non-ribosomal peptide synthetase and polyketide synthase, and 109 transcribed cytochrome P450 sequences were also identified in the transcriptome. The data from this study form a valuable foundation for future research in the exploitation of this mushroom in pharmacological and industrial applications.

Genome-Wide Identification, Phylogenetic and Expression Analyses of the Ubiquitin-Conjugating Enzyme Gene Family in Maize

Wed, 11/25/2015 - 17:00

by Dengwei Jue, Xuelian Sang, Shengqiao Lu, Chen Dong, Qiufang Zhao, Hongliang Chen, Liqiang Jia

Background

Ubiquitination is a post-translational modification in which ubiquitin is attached to a substrate. Ubiquitin-conjugating enzymes (E2s) play a major role in the ubiquitin transfer pathway, as well as a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays).

Methodology/Principal Findings

In the present study, a total of 75 putative ZmUBC genes have been identified and located in the maize genome. Phylogenetic analysis revealed that ZmUBC proteins could be divided into 15 subfamilies, which include 13 ubiquitin-conjugating enzymes (ZmE2s) and two independent ubiquitin-conjugating enzyme variant (UEV) groups. The predicted ZmUBC genes were distributed across 10 chromosomes at different densities. In addition, analysis of exon-intron junctions and sequence motifs in each candidate gene has revealed high levels of conservation within and between phylogenetic groups. Tissue expression analysis indicated that most ZmUBC genes were expressed in at least one of the tissues, indicating that these are involved in various physiological and developmental processes in maize. Moreover, expression profile analyses of ZmUBC genes under different stress treatments (4°C, 20% PEG6000, and 200 mM NaCl) and various expression patterns indicated that these may play crucial roles in the response of plants to stress.

Conclusions

Genome-wide identification, chromosome organization, gene structure, evolutionary and expression analyses of ZmUBC genes have facilitated the characterization of this gene family and indicated its potential involvement in growth, development, and stress responses. This study provides valuable information for better understanding the classification and putative functions of the UBC-encoding genes of maize.

In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae

Wed, 11/25/2015 - 17:00

by Jiří Macas, Petr Novák, Jaume Pellicer, Jana Čížková, Andrea Koblížková, Pavel Neumann, Iva Fuková, Jaroslav Doležel, Laura J. Kelly, Ilia J. Leitch

The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55–83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

Role of Structural Dynamics at the Receptor G Protein Interface for Signal Transduction

Wed, 11/25/2015 - 17:00

by Alexander S. Rose, Ulrich Zachariae, Helmut Grubmüller, Klaus Peter Hofmann, Patrick Scheerer, Peter W. Hildebrand

GPCRs catalyze GDP/GTP exchange in the α-subunit of heterotrimeric G proteins (Gαβγ) through displacement of the Gα C-terminal α5 helix, which directly connects the interface of the active receptor (R*) to the nucleotide binding pocket of G. Hydrogen–deuterium exchange mass spectrometry and kinetic analysis of R* catalysed G protein activation have suggested that displacement of α5 starts from an intermediate GDP bound complex (R*•GGDP). To elucidate the structural basis of receptor-catalysed displacement of α5, we modelled the structure of R*•GGDP. A flexible docking protocol yielded an intermediate R*•GGDP complex, with a similar overall arrangement as in the X-ray structure of the nucleotide free complex (R*•Gempty), however with the α5 C-terminus (GαCT) forming different polar contacts with R*. Starting molecular dynamics simulations of GαCT bound to R* in the intermediate position, we observe a screw-like motion, which restores the specific interactions of α5 with R* in R*•Gempty. The observed rotation of α5 by 60° is in line with experimental data. Reformation of hydrogen bonds, water expulsion and formation of hydrophobic interactions are driving forces of the α5 displacement. We conclude that the identified interactions between R* and G protein define a structural framework in which the α5 displacement promotes direct transmission of the signal from R* to the GDP binding pocket.

Genome Analysis of Staphylococcus agnetis, an Agent of Lameness in Broiler Chickens

Wed, 11/25/2015 - 17:00

by Adnan A. K. Al-Rubaye, M. Brian Couger, Sohita Ojha, Jeff F. Pummill, Joseph A. Koon, Robert F. Wideman, Douglas D. Rhoads

Lameness in broiler chickens is a significant animal welfare and financial issue. Lameness can be enhanced by rearing young broilers on wire flooring. We have identified Staphylococcus agnetis as significantly involved in bacterial chondronecrosis with osteomyelitis (BCO) in proximal tibiae and femora, leading to lameness in broiler chickens in the wire floor system. Administration of S. agnetis in water induces lameness. Although S. agnetis has previously been reported in some cases of cattle mastitis, this is the first report of this poorly described pathogen in chickens. We used long and short read next generation sequencing to assemble single finished contigs for the genome and a large plasmid from the chicken pathogen. Comparison of the S. agnetis genome to those of other pathogenic Staphylococci shows that S. agnetis contains a distinct repertoire of virulence determinants. Additionally, the S. agnetis genome has several regions that differ substantially from the genomes of other pathogenic Staphylococci. Comparison of our finished genome to a recent draft genome for a cattle mastitis isolate suggests that future investigations should focus on the evolutionary epidemiology of this emerging pathogen of domestic animals.

LEMONS – A Tool for the Identification of Splice Junctions in Transcriptomes of Organisms Lacking Reference Genomes

Wed, 11/25/2015 - 17:00

by Liron Levin, Dan Bar-Yaacov, Amos Bouskila, Michal Chorev, Liran Carmel, Dan Mishmar

RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However, DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool, LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction recognition signals to produce high throughput splice-junction predictions in the absence of a reference genome. When tested on multiple annotated vertebrate mRNA data, LEMONS accurately identified 87% (average) of the splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions in organisms that lack a reference genome.

Body Mass Index Genetic Risk Score and Endometrial Cancer Risk

Wed, 11/25/2015 - 17:00

by Jennifer Prescott, Veronica W. Setiawan, Nicolas Wentzensen, Fredrick Schumacher, Herbert Yu, Ryan Delahanty, Leslie Bernstein, Stephen J. Chanock, Chu Chen, Linda S. Cook, Christine Friedenreich, Monserrat Garcia-Closas, Christopher A. Haiman, Loic Le Marchand, Xiaolin Liang, Jolanta Lissowska, Lingeng Lu, Anthony M. Magliocco, Sara H. Olson, Harvey A. Risch, Xiao-Ou Shu, Giske Ursin, Hannah P. Yang, Peter Kraft, Immaculata De Vivo

Genome-wide association studies (GWAS) have identified common variants that predispose individuals to a higher body mass index (BMI), an independent risk factor for endometrial cancer. Composite genotype risk scores (GRS) based on the joint effect of published BMI risk loci were used to explore whether endometrial cancer shares a genetic background with obesity. Genotype and risk factor data were available on 3,376 endometrial cancer case and 3,867 control participants of European ancestry from the Epidemiology of Endometrial Cancer Consortium GWAS. A BMI GRS was calculated by summing the number of BMI risk alleles at 97 independent loci. For exploratory analyses, additional GRSs were based on subsets of risk loci within putative etiologic BMI pathways. The BMI GRS was statistically significantly associated with endometrial cancer risk (P = 0.002). For every 10 BMI risk alleles a woman had a 13% increased endometrial cancer risk (95% CI: 4%, 22%). However, after adjusting for BMI, the BMI GRS was no longer associated with risk (per 10 BMI risk alleles OR = 0.99, 95% CI: 0.91, 1.07; P = 0.78). Heterogeneity by BMI did not reach statistical significance (P = 0.06), and no effect modification was noted by age, GWAS Stage, study design or between studies (P ≥ 0.58). In exploratory analyses, the GRS defined by variants at loci containing monogenic obesity syndrome genes was associated with reduced endometrial cancer risk independent of BMI (per BMI risk allele OR = 0.92, 95% CI: 0.88, 0.96; P = 2.1 × 10^−5). Possessing a large number of BMI risk alleles does not increase endometrial cancer risk above that conferred by excess body weight among women of European descent. Thus, the GRS based on all current established BMI loci does not provide added value independent of BMI. Future studies are required to validate the unexpected observed relation between monogenic obesity syndrome genetic variants and endometrial cancer risk.
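The unweighted score described above is a plain sum of risk-allele counts, and the reported "13% per 10 alleles" can be rescaled to other allele increments if the effect is assumed to be log-linear. The Python sketch below shows both steps. The function names and the integer 0/1/2 genotype coding are illustrative assumptions, and the log-linear rescaling is a modelling assumption rather than something stated in the abstract.

```python
import math


def unweighted_grs(risk_allele_counts):
    """Sum per-locus risk-allele counts (0, 1, or 2) into an unweighted score."""
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each locus must carry 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


def rescale_odds_ratio(odds_ratio, from_alleles, to_alleles):
    """Rescale an OR between allele increments, assuming a log-linear effect."""
    return math.exp(math.log(odds_ratio) * to_alleles / from_alleles)


# A 13% increase per 10 risk alleles corresponds to OR = 1.13 per 10 alleles;
# under the log-linear assumption the per-allele OR is 1.13 ** (1 / 10).
per_allele_or = rescale_odds_ratio(1.13, from_alleles=10, to_alleles=1)
```

Under that assumption the per-allele odds ratio is only slightly above 1, which is why scores of this kind are usually reported per block of alleles rather than per single allele.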

Impact of Pre-Analytical Variables on Cancer Targeted Gene Sequencing Efficiency

Wed, 11/25/2015 - 17:00

by Luiz H. Araujo, Cynthia Timmers, Konstantin Shilo, Weiqiang Zhao, Jianying Zhang, Lianbo Yu, Thanemozhi G. Natarajan, Clinton J. Miller, Ayse Selen Yilmaz, Tom Liu, Joseph Amann, José Roberto Lapa e Silva, Carlos Gil Ferreira, David P. Carbone

Tumor specimens are often preserved as formalin-fixed paraffin-embedded (FFPE) tissue blocks, the most common clinical source for DNA sequencing. Herein, we evaluated the effect of pre-sequencing parameters to guide proper sample selection for targeted gene sequencing. Data from 113 FFPE lung tumor specimens were collected, and targeted gene sequencing was performed. Libraries were constructed using custom probes and were paired-end sequenced on a next generation sequencing platform. A PCR-based quality control (QC) assay was utilized to determine DNA quality, and a ratio was generated in comparison to control DNA. We observed that FFPE storage time, PCR/QC ratio, and DNA input in the library preparation were significantly correlated to most parameters of sequencing efficiency including depth of coverage, alignment rate, insert size, and read quality. A combined score using the three parameters was generated and proved highly accurate to predict sequencing metrics. We also showed wide read count variability within the genome, with worse coverage in regions of low GC content like in KRAS. Sample quality and GC content had independent effects on sequencing depth, and the worst results were observed in regions of low GC content in samples with poor quality. Our data confirm that FFPE samples are a reliable source for targeted gene sequencing in cancer, provided adequate sample quality controls are exercised. Tissue quality should be routinely assessed for pre-analytical factors, and sequencing depth may be limited in genomic regions of low GC content if suboptimal samples are utilized.
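Since GC content is one of the variables tied to coverage above, a minimal Python sketch of how GC fraction might be computed over fixed windows of a target region follows. The function names, window size, and step are illustrative assumptions; the abstract does not describe how the authors binned their regions.

```python
def gc_content(seq):
    """Fraction of G and C among the called bases (A, C, G, T) in seq."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no called bases")
    return sum(base in "GC" for base in called) / len(called)


def windowed_gc(seq, window, step):
    """Return (start, gc_fraction) for each full window along seq."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Per-window values like these could then be compared against per-window read depth to look for the kind of low-GC coverage dips the abstract reports.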

Sequencing and Analysis of the Pseudomonas fluorescens GcM5-1A Genome: A Pathogen Living in the Surface Coat of Bursaphelenchus xylophilus

Fri, 10/30/2015 - 16:00

by Kai Feng, Ronggui Li, Yingnan Chen, Boguang Zhao, Tongming Yin

It is known that several bacteria are adherent to the surface coat of pine wood nematode (Bursaphelenchus xylophilus), but their function and role in the pathogenesis of pine wilt disease remain debatable. Pseudomonas fluorescens GcM5-1A is a bacterium isolated from the surface coat of pine wood nematodes. Previous studies linked GcM5-1A to the pathogenicity of pine wilt disease. In this study, we report the de novo sequencing of the GcM5-1A genome. A 600-Mb collection of high-quality reads was obtained and assembled into sequence contigs spanning a 6.01-Mb length. Sequence annotation predicted 5,413 open reading frames, of which 2,988 were homologous to genes in the other four sequenced P. fluorescens isolates (SBW25, WH6, Pf0-1 and Pf-5) and 1,137 were unique to GcM5-1A. Phylogenetic studies and genome comparison revealed that GcM5-1A is more closely related to SBW25 and WH6 isolates than to Pf0-1 and Pf-5 isolates. To study pathogenesis, we identified 79 candidate virulence factors in the genome of GcM5-1A, including the Alg, Fl, Waa gene families, and genes encoding the major pathogenic protein fliC. In addition, genes for a complete T3SS system were identified in the genome of GcM5-1A. Such systems have proved to play a critical role in subverting and colonizing the host organisms of many gram-negative pathogenic bacteria. Although the functions of the candidate virulence factors have yet to be deciphered experimentally, the availability of this genome provides a basic platform to obtain informative clues to be addressed in future studies by the pine wilt disease research community.

A Scale-Corrected Comparison of Linkage Disequilibrium Levels between Genic and Non-Genic Regions

Fri, 10/30/2015 - 16:00

by Swetlana Berger, Martin Schlather, Gustavo de los Campos, Steffen Weigend, Rudolf Preisinger, Malena Erbe, Henner Simianer

The understanding of non-random association between loci, termed linkage disequilibrium (LD), plays a central role in genomic research. Since causal mutations are generally not included in genomic marker data, LD between those and available markers is essential for capturing the effects of causal loci on localizing genes responsible for traits. Thus, the interpretation of association studies requires a detailed knowledge of LD patterns. It is well known that most LD measures depend on minor allele frequencies (MAF) of the considered loci and the magnitude of LD is influenced by the physical distances between loci. In the present study, a procedure to compare the LD structure between genomic regions comprising several markers each is suggested. The approach accounts for different scaling factors, namely the distribution of MAF, the distribution of pair-wise differences in MAF, and the physical extent of compared regions, reflected by the distribution of pair-wise physical distances. In the first step, genomic regions are matched based on similarity in these scaling factors. In the second step, chromosome- and genome-wide significance tests for differences in medians of LD measures in each pair are performed. The proposed framework was applied to test the hypothesis that the average LD is different in genic and non-genic regions. This was tested with a genome-wide approach with data sets for humans (Homo sapiens), a highly selected chicken line (Gallus gallus domesticus) and the model plant Arabidopsis thaliana. In all three data sets we found a significantly higher level of LD in genic regions compared to non-genic regions. About 31% more LD was detected genome-wide in genic compared to non-genic regions in Arabidopsis thaliana, followed by 13.6% in human and 6% in chicken. Chromosome-wide comparison found significant differences on all 5 chromosomes in Arabidopsis thaliana and on one third of the human and of the chicken chromosomes.
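To make the dependence of LD measures on allele frequencies concrete, the Python sketch below computes the common r² statistic for two biallelic loci from phased haplotypes, using D = p_AB − p_A·p_B and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The function name and the 0/1 haplotype coding are illustrative assumptions. The abstract does not say which LD measure the authors used, and this sketch does not reproduce their region-matching procedure.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes.

    haplotypes is a list of (a, b) pairs, where a and b are 0 or 1 and mark
    the presence of the reference allele at locus A and locus B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

Because the denominator is built from the allele frequencies at both loci, two marker pairs with the same D can yield different r² values, which is the kind of scale effect the matching step above is designed to control.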

RNA-Seq Based Identification of Candidate Parasitism Genes of Cereal Cyst Nematode (Heterodera avenae) during Incompatible Infection to Aegilops variabilis

Fri, 10/30/2015 - 16:00

by Minghui Zheng, Hai Long, Yun Zhao, Lin Li, Delin Xu, Haili Zhang, Feng Liu, Guangbing Deng, Zhifen Pan, Maoqun Yu

One of the reasons for the progressive yield decline observed in cereal production is the rapid build-up of populations of the cereal cyst nematode (CCN, Heterodera avenae). These nematodes secrete so-called effectors into their host plant to suppress the plant defense responses, alter plant signaling pathways and then induce the formation of a syncytium after infection. However, little is known about its molecular mechanism and parasitism during incompatible infection. To gain insight into its repertoire of parasitism genes, we investigated the transcriptome of the early parasitic second-stage (30 hours, 3 days and 9 days post infection) juveniles of the CCN as well as the CCN infected tissue of the host Aegilops variabilis by Illumina sequencing. Among all assembled unigenes, 681 putative genes of parasitic nematode were found, in which 56 putative effectors were identified, including novel pioneer genes and genes corresponding to previously reported effectors. All the 681 CCN unigenes were mapped to 229 GO terms and 200 KEGG pathways, including growth, development and several stimulus-related signaling pathways. Sixteen clusters were involved in the CCN unigene expression atlas at the early stages of the infection process, three of which were significantly gene-enriched. In addition, the protein-protein interaction network analysis revealed 35 node unigenes which may play an important role in the plant-CCN interaction. Moreover, in a comparison of differentially expressed genes between the pre-parasitic juveniles and the early parasitic juveniles, we found that hydrolase activity was up-regulated in pre J2s whereas binding activity was up-regulated in infective J2s. RT-qPCR analysis on some selected genes showed detectable expression, indicating possible secretion of the proteins and putative role in infection. This study provided better insights into the incompatible interaction between H. avenae and the host plant Ae. variabilis. Moreover, RNAi targets with potential lethality were screened out and preliminarily validated, providing candidates for engineering-based control of cereal cyst nematode in crop breeding.

Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery

Fri, 10/30/2015 - 16:00

by Sean Ekins, Peter B. Madrid, Malabika Sarker, Shao-Gang Li, Nisha Mittal, Pradeep Kumar, Xin Wang, Thomas P. Stratton, Matthew Zimmerman, Carolyn Talcott, Pauline Bourbon, Mike Travers, Maneesh Yadav, Joel S. Freundlich

Integrated computational approaches for Mycobacterium tuberculosis (Mtb) are useful to identify new molecules that could lead to future tuberculosis (TB) drugs. Our approach uses information derived from the TBCyc pathway and genome database, the Collaborative Drug Discovery TB database combined with 3D pharmacophores and dual event Bayesian models of whole-cell activity and lack of cytotoxicity. We have prioritized a large number of molecules that may act as mimics of substrates and metabolites in the TB metabolome. We computationally searched over 200,000 commercial molecules using 66 pharmacophores based on substrates and metabolites from Mtb and further filtering with Bayesian models. We ultimately tested 110 compounds in vitro that resulted in two compounds of interest, BAS 04912643 and BAS 00623753 (MIC of 2.5 and 5 μg/mL, respectively). These molecules were used as a starting point for hit-to-lead optimization. The most promising class proved to be the quinoxaline di-N-oxides, evidenced by transcriptional profiling to induce mRNA level perturbations most closely resembling known protonophores. One of these, SRI58, exhibited an MIC = 1.25 μg/mL versus Mtb and a CC50 in Vero cells of >40 μg/mL, while featuring fair Caco-2 A-B permeability (2.3 × 10^−6 cm/s), kinetic solubility (125 μM at pH 7.4 in PBS) and mouse metabolic stability (63.6% remaining after 1 h incubation with mouse liver microsomes). Despite demonstration of how a combined bioinformatics/cheminformatics approach afforded a small molecule with promising in vitro profiles, we found that SRI58 did not exhibit quantifiable blood levels in mice.

A BIL Population Derived from G. hirsutum and G. barbadense Provides a Resource for Cotton Genetics and Breeding

Fri, 10/30/2015 - 16:00

by Xinhui Nie, Jianli Tu, Bin Wang, Xiaofeng Zhou, Zhongxu Lin

To provide a resource for cotton genetics and breeding, an interspecific hybridization between Gossypium hirsutum cv. Emian22 and G. barbadense acc. 3–79 was made. A population of 54 BILs (backcross inbred lines, BC1F8) was developed with the aim of transferring G. barbadense genes into G. hirsutum in order to genetically analyze these genes’ function in a G. hirsutum background and create new germplasms for breeding. Preliminary investigation of the morphological traits showed that the BILs had diverse variations in plant architecture, seed size, and fuzz color; the related traits of yield and fiber quality evaluated in 4 environments also showed abundant phenotypic variation. In order to explore the molecular diversity of the BIL population, 446 SSR markers selected at an average genetic distance of 10 cM from our interspecific linkage map were used to genotype the BIL population. A total of 393 polymorphic loci accounting for 84.4% MAF (major allele frequency) > 0.05 and 922 allele loci were detected, and the Shannon diversity index (I) was 0.417 per locus. The average introgression segment length was 16.24 cM, and an average of 29.53 segments were introgressed in each BIL line with an average background recovery of 79.8%. QTL mapping revealed 58 QTL associated with fiber quality and yield traits, and 47 favored alleles derived from the donor parent were discovered. This study demonstrated that the interspecific BIL population was enriched with much phenotypic and molecular variation which could be a resource for cotton genetics and breeding.
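The Shannon diversity index quoted above is computed per locus from allele frequencies as I = −Σ p_i ln p_i and then averaged over loci. The Python sketch below shows that calculation. The function names are illustrative, and the use of the natural logarithm is an assumption, since the abstract does not state the base.

```python
import math


def shannon_index(allele_freqs):
    """Shannon diversity I = -sum(p * ln p) for one locus.

    Accepts frequencies or raw counts; values are normalized to sum to 1.
    """
    total = sum(allele_freqs)
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    probs = [f / total for f in allele_freqs if f > 0]
    return -sum(p * math.log(p) for p in probs)


def mean_shannon(loci):
    """Average the per-locus Shannon index over a list of loci."""
    return sum(shannon_index(freqs) for freqs in loci) / len(loci)
```

A locus with two equally frequent alleles gives ln 2 (about 0.693) and a monomorphic locus gives 0, so a mean of 0.417 per locus sits between those two extremes.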

Somatic Variation of T-Cell Receptor Genes Strongly Associate with HLA Class Restriction

Fri, 10/30/2015 - 16:00

by Paul L. Klarenbeek, Marieke E. Doorenspleet, Rebecca E. E. Esveldt, Barbera D. C. van Schaik, Neubury Lardy, Antoine H. C. van Kampen, Paul P. Tak, Robert M. Plenge, Frank Baas, Paul I. W. de Bakker, Niek de Vries

Every person carries a vast repertoire of CD4+ T-helper cells and CD8+ cytotoxic T cells for a healthy immune system. Somatic VDJ recombination at genomic loci that encode the T-cell receptor (TCR) is a key step during T-cell development, but how a single T cell commits to become either CD4+ or CD8+ is poorly understood. To evaluate the influence of TCR sequence variation on CD4+/CD8+ lineage commitment, we sequenced rearranged TCRs for both α and β chains in naïve T cells isolated from healthy donors and investigated gene segment usage and recombination patterns in CD4+ and CD8+ T-cell subsets. Our data demonstrate that most V and J gene segments are strongly biased in the naïve CD4+ and CD8+ subsets with some segments increasing the odds of being CD4+ (or CD8+) up to five-fold. These V and J gene associations are highly reproducible across individuals and independent of classical HLA genotype, explaining ~11% of the observed variance in the CD4+ vs. CD8+ propensity. In addition, we identified a strong independent association of the electrostatic charge of the complementarity determining region 3 (CDR3) in both α and β chains, where a positively charged CDR3 is associated with CD4+ lineage and a negatively charged CDR3 with CD8+ lineage. Our findings suggest that somatic variation in different parts of the TCR influences T-cell lineage commitment in a predominantly additive fashion. This notion can help delineate how certain structural features of the TCR-peptide-HLA complex influence thymic selection.
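As a simple illustration of what a CDR3 "electrostatic charge" can mean in practice, the Python sketch below estimates the net charge of an amino-acid sequence by counting basic residues (K, R) as +1 and acidic residues (D, E) as −1. This counting rule, the neutral treatment of histidine, the function names, and the example sequence are all assumptions for illustration; the abstract does not specify how the authors computed charge.

```python
# Assumed residue charges near neutral pH; histidine is treated as neutral.
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(peptide):
    """Approximate net charge of a peptide from its charged side chains."""
    return sum(RESIDUE_CHARGE.get(residue, 0) for residue in peptide.upper())


def charge_class(peptide):
    """Label a peptide as 'positive', 'negative', or 'neutral'."""
    charge = net_charge(peptide)
    if charge > 0:
        return "positive"
    if charge < 0:
        return "negative"
    return "neutral"


# Hypothetical CDR3-like sequence, used only to exercise the functions.
example_cdr3 = "CASSLGQAYEQYF"
example_class = charge_class(example_cdr3)
```

Under this rule the example sequence carries one acidic residue and no basic ones, so it is labelled negative; a real analysis would likely need a pH-aware charge model.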

Drug-Gene Interactions of Antihypertensive Medications and Risk of Incident Cardiovascular Disease: A Pharmacogenomics Study from the CHARGE Consortium

Fri, 10/30/2015 - 16:00

by Joshua C. Bis, Colleen Sitlani, Ryan Irvin, Christy L. Avery, Albert Vernon Smith, Fangui Sun, Daniel S. Evans, Solomon K. Musani, Xiaohui Li, Stella Trompet, Bouwe P. Krijthe, Tamara B. Harris, P. Miguel Quibrera, Jennifer A. Brody, Serkalem Demissie, Barry R. Davis, Kerri L. Wiggins, Gregory J. Tranah, Leslie A. Lange, Nona Sotoodehnia, David J. Stott, Oscar H. Franco, Lenore J. Launer, Til Stürmer, Kent D. Taylor, L. Adrienne Cupples, John H. Eckfeldt, Nicholas L. Smith, Yongmei Liu, James G. Wilson, Susan R. Heckbert, Brendan M. Buckley, M. Arfan Ikram, Eric Boerwinkle, Yii-Der Ida Chen, Anton J. M. de Craen, Andre G. Uitterlinden, Jerome I. Rotter, Ian Ford, Albert Hofman, Naveed Sattar, P. Eline Slagboom, Rudi G. J. Westendorp, Vilmundur Gudnason, Ramachandran S. Vasan, Thomas Lumley, Steven R. Cummings, Herman A. Taylor, Wendy Post, J. Wouter Jukema, Bruno H. Stricker, Eric A. Whitsel, Bruce M. Psaty, Donna Arnett

Background

Hypertension is a major risk factor for a spectrum of cardiovascular diseases (CVD), including myocardial infarction, sudden death, and stroke. In the US, over 65 million people have high blood pressure and a large proportion of these individuals are prescribed antihypertensive medications. Although large long-term clinical trials conducted in the last several decades have identified a number of effective antihypertensive treatments that reduce the risk of future clinical complications, responses to therapy and protection from cardiovascular events vary among individuals.

Methods

Using a genome-wide association study among 21,267 participants with pharmaceutically treated hypertension, we explored the hypothesis that genetic variants might influence or modify the effectiveness of common antihypertensive therapies on the risk of major cardiovascular outcomes. The classes of drug treatments included angiotensin-converting enzyme inhibitors, beta-blockers, calcium channel blockers, and diuretics. In the setting of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium, each study performed array-based genome-wide genotyping, imputed to HapMap Phase II reference panels, and used additive genetic models in proportional hazards or logistic regression models to evaluate drug-gene interactions for each of four therapeutic drug classes. We used meta-analysis to combine study-specific interaction estimates for approximately 2 million single nucleotide polymorphisms (SNPs) in a discovery analysis among 15,375 European Ancestry participants (3,527 CVD cases) with targeted follow-up in a case-only study of 1,751 European Ancestry GenHAT participants as well as among 4,141 African-Americans (1,267 CVD cases).
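As a rough illustration of how study-specific interaction estimates for one SNP can be combined, the sketch below uses fixed-effect inverse-variance weighting. The abstract does not state which meta-analysis model was used, so the choice of a fixed-effect model is an assumption, and the function name and input values are hypothetical.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, ses):
    """Combine per-study effect estimates by inverse-variance weighting.

    betas: per-study interaction coefficients (e.g. log hazard ratios)
    ses:   their standard errors
    Returns (pooled beta, pooled standard error, z statistic, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    z = beta / se
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value under a normal approximation
    return beta, se, z, p


# Invented numbers for two studies; not taken from the paper.
beta, se, z, p = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
genome_wide_significant = p < 5.0e-8
```

Each SNP's pooled p-value would then be compared with the genome-wide threshold of 5.0×10⁻⁸ mentioned in the results.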

Results

Although drug-SNP interactions were biologically plausible, exposures and outcomes were well measured, and power was sufficient to detect modest interactions, we did not identify any statistically significant interactions from the four antihypertensive therapy meta-analyses (P for interaction > 5.0×10⁻⁸). Similarly, findings were null for meta-analyses restricted to 66 SNPs with significant main effects on coronary artery disease or blood pressure from large published genome-wide association studies (P for interaction ≥ 0.01). Our results suggest that there are no major pharmacogenetic influences of common SNPs on the relationship between blood pressure medications and the risk of incident CVD.

Computational Design of Hypothetical New Peptides Based on a Cyclotide Scaffold as HIV gp120 Inhibitor

Fri, 10/30/2015 - 16:00

by Apiwat Sangphukieo, Wanapinun Nawae, Teeraphan Laomettachit, Umaporn Supasitthimethee, Marasri Ruengjitchatchawalya

Cyclotides are a family of triple disulfide cyclic peptides with exceptional resistance to thermal/chemical denaturation and enzymatic degradation. Several cyclotides have been shown to possess anti-HIV activity, including kalata B1 (KB1). However, the use of cyclotides as anti-HIV therapies remains limited due to their high toxicity in normal cells. Therefore, grafting anti-HIV epitopes onto a cyclotide might be a promising approach for reducing toxicity and simultaneously improving anti-HIV activity. Viral envelope glycoprotein gp120 is required for entry of HIV into CD4+ T cells. However, due to a high degree of variability and physical shielding, the design of drugs targeting gp120 remains challenging. We created a computational protocol in which molecular modeling techniques were combined with a genetic algorithm (GA) to automate the design of new cyclotides with improved binding to HIV gp120. We found that the group of modified cyclotides has better binding scores (23.1%) compared to KB1. By using molecular dynamics (MD) simulation as a post filter for the final candidates, we identified two novel cyclotides, GA763 and GA190, which exhibited better interaction energies (36.6% and 22.8%, respectively) when binding to gp120 compared to KB1. This computational design represents an alternative tool for modifying peptides, including cyclotides and other stable peptides, as therapeutic agents before the synthesis process.
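The general shape of a genetic algorithm over peptide sequences can be sketched as follows. This is a toy illustration only: the fitness function is a placeholder that counts matches to a made-up target string, whereas the protocol described above scores candidates with molecular modeling. All names, parameters, and the target sequence are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq, target):
    """Placeholder fitness: number of positions that match a made-up target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rate, rng):
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq)


def crossover(a, b, rng):
    """Single-point crossover between two equal-length sequences."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(target, pop_size=50, generations=50, rate=0.05, seed=0):
    """Evolve sequences toward higher toy_score, keeping the top fifth each round."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(AMINO_ACIDS) for _ in target) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: toy_score(s, target), reverse=True)
        elite = pop[: pop_size // 5]  # elitism: the best candidates survive unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rate, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=lambda s: toy_score(s, target))


TARGET = "GLPVCGETC"  # arbitrary string used only to drive the toy fitness
start = evolve(TARGET, generations=0)
best = evolve(TARGET, generations=50)
```

Because the elite are carried over unchanged, the best score in the population never decreases between generations. A real design run would replace `toy_score` with a docking or binding-energy estimate and add constraints that preserve the cyclotide scaffold.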

Probing Difference in Binding Modes of Inhibitors to MDMX by Molecular Dynamics Simulations and Different Free Energy Methods

Thu, 10/29/2015 - 16:00

by Shuhua Shi, Shaolong Zhang, Qinggang Zhang

The p53-MDMX interaction has attracted extensive attention in anti-cancer drug development in recent years. This work used molecular dynamics (MD) simulations and cross-correlation analysis to investigate the conformational changes of MDMX caused by inhibitor binding. The results indicate that the binding cleft of MDMX undergoes a large conformational change and that the dynamic behavior of residues changes markedly in the presence of structurally different inhibitors. Two different binding free energy prediction methods were employed to gain comparative insight into the binding mechanisms of four inhibitors, PMI, pDI, WK23 and WW8, to MDMX. The data show that van der Waals interactions are the main factor controlling inhibitor binding to MDMX. The binding free energies were further decomposed into per-residue contributions, which indicate that hydrophobic interactions, such as CH-CH, CH-π and π-π interactions, are responsible for the association of the inhibitors with MDMX.
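Cross-correlation analysis of an MD trajectory is commonly expressed as a normalized covariance of atomic displacements, C_ij = ⟨Δr_i · Δr_j⟩ / sqrt(⟨Δr_i²⟩⟨Δr_j²⟩). The sketch below assumes that standard definition, which the abstract does not spell out; the function name, the trajectory layout, and the toy coordinates are hypothetical.

```python
from math import sqrt


def cross_correlation(traj):
    """Normalized displacement cross-correlation matrix.

    traj[f][i] is the (x, y, z) position of atom i (e.g. a C-alpha) in frame f.
    Returns an N x N list of lists with values in [-1, 1].
    Every atom must move in at least one frame, otherwise the normalization
    divides by zero.
    """
    n_frames, n_atoms = len(traj), len(traj[0])
    mean = [
        [sum(traj[f][i][k] for f in range(n_frames)) / n_frames for k in range(3)]
        for i in range(n_atoms)
    ]
    # Displacement of each atom from its time-averaged position, per frame.
    disp = [
        [[traj[f][i][k] - mean[i][k] for k in range(3)] for i in range(n_atoms)]
        for f in range(n_frames)
    ]

    def cov(i, j):
        return sum(
            sum(disp[f][i][k] * disp[f][j][k] for k in range(3))
            for f in range(n_frames)
        ) / n_frames

    return [
        [cov(i, j) / sqrt(cov(i, i) * cov(j, j)) for j in range(n_atoms)]
        for i in range(n_atoms)
    ]


# Toy three-frame trajectory: atoms 0 and 1 move together, atom 2 moves oppositely.
toy_traj = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (7.0, 0.0, 0.0), (8.0, 0.0, 0.0)],
]
matrix = cross_correlation(toy_traj)
```

Values near +1 indicate residues moving in the same direction and values near -1 indicate anti-correlated motion. In practice the frames would first be aligned to a reference structure to remove overall rotation and translation.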

Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalycinus, from a Comparative Genomics Perspective

Thu, 10/29/2015 - 16:00

by Gurusamy Raman, SeonJoo Park

Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, the chloroplast (cp) genome of D. superbus was compared with those of closely related members of the Caryophyllaceae, such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC region. The IRs underwent both expansion and contraction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genomes of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Moreover, the cp genomes of Dianthus and Spinacia have two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, whereas Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of the individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.