MLBio+Laboratory Machine Learning in Biomedical Informatics

BMC Genomics

The latest research articles published by BMC Genomics
Updated: 49 weeks 5 days ago

Transcriptomics of differential vector competence: West Nile virus infection in two populations of Culex pipiens quinquefasciatus linked to ovary development

Sat, 06/21/2014 - 20:00
Background: Understanding mechanisms that contribute to viral dissemination in mosquito vectors will contribute to our ability to interfere with the transmission of viral pathogens that impact public health. The expression of genes in two Culex pipiens quinquefasciatus populations from Florida with known differences in vector competence to West Nile virus (WNV) was compared using high-throughput sequencing. Results: A total of 15,176 transcripts were combined for comparison of expression differences between the two populations, and 118 transcripts were differentially expressed (p < 0.05). The fold change in expression of the differentially expressed genes ranged from -7.5 to 6.13. The more competent population for WNV (Gainesville) overexpressed 77 genes and down-regulated 44 genes compared with the less competent population for WNV (Vero Beach). Also, splicing analysis identified 3 transcripts with significantly different splice forms between the two populations. The functional analysis showed that, apart from the unknown group, the largest proportion of transcripts fell into the catalytic activity and transporter activity groups. Interestingly, the up-regulated gene set contained most of the transcripts with catalytic activity function, and the down-regulated gene set had a notable proportion of transcripts with transporter activity function. The immune response category appeared only in the down-regulated gene set, although it represents a relatively small portion of the functions. Several different vitellogenin genes were expressed differentially. Based on the RNA-seq data analysis, ovary development was compared across the populations and following WNV infection. There were significant differences among the compared groups. Conclusions: This study suggests that ovary development is correlated with vector competence in two Culex populations in Florida. Both populations control energy allocations to reproduction as a response to WNV. This result provides novel insight into the defense mechanism used by Culex spp. mosquitoes against WNV.

Supplemental carnitine affects the microRNA expression profile in skeletal muscle of obese Zucker rats

Fri, 06/20/2014 - 20:00
Background: Numerous studies have revealed that supplementation with carnitine has multiple effects on performance characteristics and gene expression in livestock and model animals. The molecular mechanisms underlying these observations are still largely unknown. Increasing evidence suggests that microRNAs (miRNAs), a class of small non-coding RNA molecules, play an important role in post-transcriptional regulation of gene expression, thereby influencing several physiological and pathological processes. Based on these findings, the aim of the present study was to investigate the influence of carnitine supplementation on the miRNA expression profile in skeletal muscle of obese Zucker rats using miRNA microarray analysis. Results: Obese Zucker rats supplemented with carnitine had higher concentrations of total carnitine in plasma and muscle than obese control rats (P < 0.05). miRNA expression profiling in skeletal muscle revealed that 152 of the 259 miRNAs analysed were differentially regulated (adjusted P-value < 0.05) by carnitine supplementation. Compared to the obese control group, 111 miRNAs were up-regulated and 41 down-regulated by carnitine supplementation (adjusted P-value < 0.05). Of these, 14 miRNAs showed a log2 ratio >= 0.5 and 7 miRNAs showed a log2 ratio <= -0.5 (adjusted P-value < 0.05). After confirmation by qRT-PCR, 11 miRNAs were found to be up-regulated and 6 miRNAs down-regulated by carnitine supplementation (P < 0.05). Furthermore, a total of 1,446 target genes of the validated miRNAs were predicted using three combined bioinformatic algorithms. Analysis of Gene Ontology (GO) categories and KEGG pathways of the predicted targets revealed that carnitine supplementation regulates miRNAs that target a large set of genes involved in protein localization and transport, regulation of transcription and RNA metabolic processes, as well as genes involved in several signal transduction pathways, such as ubiquitin-mediated proteolysis and long-term depression. Conclusion: The present study shows for the first time that supplementation of carnitine affects a large set of miRNAs in skeletal muscle of obese Zucker rats, suggesting a novel mechanism through which carnitine exerts its previously observed multiple effects on gene expression.
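As a rough illustration of the thresholds quoted in this abstract, the sketch below labels a miRNA by its log2 ratio and adjusted P-value. The function name `classify_mirna` and the example values are hypothetical and not taken from the study; only the cutoffs (log2 ratio >= 0.5 or <= -0.5, adjusted P-value < 0.05) come from the text.

```python
def classify_mirna(log2_ratio, adj_p, ratio_cutoff=0.5, p_cutoff=0.05):
    """Label one miRNA using the cutoffs quoted in the abstract.

    Hypothetical helper: the study's actual pipeline is not described here.
    """
    if adj_p >= p_cutoff:
        return "not significant"
    if log2_ratio >= ratio_cutoff:
        return "up"
    if log2_ratio <= -ratio_cutoff:
        return "down"
    # Significant, but the change is smaller than the log2 ratio cutoff.
    return "significant, below ratio cutoff"


def log2_ratio_to_fold_change(log2_ratio):
    """Convert a log2 ratio to a linear fold change (2 ** log2_ratio)."""
    return 2.0 ** log2_ratio


# Made-up example values, for illustration only.
print(classify_mirna(0.8, 0.01))    # up
print(classify_mirna(-0.6, 0.03))   # down
print(classify_mirna(0.2, 0.01))    # significant, below ratio cutoff
```

A log2 ratio of 0.5 corresponds to a fold change of about 1.41, so the cutoff is fairly permissive compared with the two-fold threshold often used in expression studies.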

Exercise induction of gut microbiota modifications in obese, non-obese and hypertensive rats

Fri, 06/20/2014 - 20:00
Background: Obesity is a multifactorial disease associated with cardiovascular disorders such as hypertension. Recently, gut microbiota was linked to obesity pathogenesis and shown to influence the host metabolism. Moreover, several factors such as host genotype and lifestyle have been shown to modulate gut microbiota composition. Exercise is a well-known agent used for the treatment of numerous pathologies, such as obesity and hypertension; it has recently been demonstrated to shape gut microbiota consortia. Since exercise-altered microbiota could possibly improve the treatment of diseases related to dysfunctional microbiota, this study aimed to examine the effect of controlled exercise training on gut microbial composition in Obese rats (n = 3), non-obese Wistar rats (n = 3) and Spontaneously Hypertensive rats (n = 3). Pyrosequencing of 16S rRNA genes from fecal samples collected before and after exercise training was used for this purpose. Results: Exercise altered the composition and diversity of gut bacteria at genus level in all rat lineages. Allobaculum (Hypertensive rats), Pseudomonas and Lactobacillus (Obese rats) were shown to be enriched after exercise, while Streptococcus (Wistar rats), Aggregatibacter and Sutterella (Hypertensive rats) were more abundant before exercise. A significant correlation with blood lactate accumulation was seen for the Clostridiaceae and Bacteroidaceae families and the Oscillospira and Ruminococcus genera. Moreover, Wistar and Hypertensive rats were shown to share a similar microbiota composition, as opposed to Obese rats. Finally, Streptococcus alactolyticus, Bifidobacterium animalis, Ruminococcus gnavus, Aggregatibacter pneumotropica and Bifidobacterium pseudolongum were enriched in Obese rats. Conclusions: These data indicate that non-obese and hypertensive rats harbor a different gut microbiota from obese rats and that exercise training alters gut microbiota from an obese and hypertensive genotype background.

RNA-Seq analysis reveals that multiple phytohormone biosynthesis and signal transduction pathways are reprogrammed in curled-cotyledons mutant of soybean [Glycine max (L.) Merr.]

Fri, 06/20/2014 - 20:00
Background: Soybean is one of the most economically important crops in the world. The cotyledon is the nutrient storage area in seeds, and it is critical for seed quality and yield. Cotyledon mutants are important for the genetic dissection of embryo patterning and seed development. However, the molecular mechanisms underlying soybean cotyledon development are largely unexplored. Results: In this study, we characterised a soybean curled-cotyledon (cco) mutant. Compared with wild-type (WT), anatomical analysis revealed that the cco cotyledons at the torpedo stage became more slender and grew outward. The entire embryos of the cco mutant resembled the "tail of a swallow". In addition, cco seeds displayed a reduced germination rate and gibberellic acid (GA3) level, whereas the abscisic acid (ABA) and auxin (IAA) levels were increased. RNA-seq identified 1,093 differentially expressed genes (DEGs) between WT and the cco mutant. The KEGG pathway analysis showed that many DEGs were mapped to the hormone biosynthesis and signal transduction pathways. Consistent with assays of hormones in seeds, the results of RNA-seq indicated that auxin and ABA biosynthesis and signal transduction in cco were more active than in WT, while an early step in GA biosynthesis was blocked, as was the conversion of inactive GAs to bioactive GAs in GA signaling. Furthermore, genes participating in other hormone biosynthesis and signalling pathways, such as cytokinin (CK), ethylene (ET), brassinosteroid (BR), and jasmonic acid (JA), were also affected in the cco mutant. Conclusions: Our data suggest that multiple phytohormone biosynthesis and signal transduction pathways are reprogrammed in cco, and changes in these pathways may partially contribute to the cco mutant phenotype, suggesting the involvement of multiple hormones in the coordination of soybean cotyledon development.

The transcriptomic profile of peripheral blood nuclear cells in dogs with heart failure

Fri, 06/20/2014 - 20:00
Background: In recent years, advances have been made in the methods used to investigate the molecular background of canine heart disease. Studies have been conducted to identify specific genes which, when pathologically expressed, could lead to the dysfunction of the canine heart or are correlated with heart failure. For this purpose, genome-wide microarray experiments on tissues from failing hearts have been performed. In the presented study, a whole-genome microarray analysis was used for the first time to describe the transcription profile of peripheral blood nuclear cells in dogs with heart failure. Dogs with recognized heart disease were classified according to the ISACHC (International Small Animal Cardiac Health Council) classification scheme as class 1 (asymptomatic; 13 dogs), class 2 (mild to moderate heart failure; 13 dogs) and class 3 (severe heart failure; 12 dogs). The control group consisted of 14 healthy dogs. The clinical picture of the animals included animal history, clinical examination, echocardiographic examination and, where applicable, electrocardiographic and radiographic examinations. Results: In the present study we identified four sets of differentially expressed genes, namely heart-failure-specific genes, ISACHC1-specific genes, ISACHC2-specific genes and ISACHC3-specific genes. The most important set consisted of genes differentially expressed in all dogs with heart failure, regardless of ISACHC stage. We identified 71 heart-failure-specific genes which were involved in two statistically significant receptor signalling pathways, namely angiotensinR -> CREB/ELK-SRF/TP53 signalling and ephrinR -> actin signalling. The numbers of stage-specific genes were 83 for ISACHC1, 1247 for ISACHC2 and 200 for ISACHC3. Conclusions: The transcriptomic profile of peripheral blood nuclear cells in dogs with heart failure seems to reflect the presence of clinical signs of the disease in patients, based on the observation that the largest number of differentially expressed genes was identified in the ISACHC 2 group of patients. This group consists of dogs just starting to show clinical signs of heart failure. A set of genes was also found to have changed expression in all dogs with heart failure, regardless of the stage of the disease.

Identification of host-microbe interaction factors in the genomes of soft rot-associated pathogens Dickeya dadantii 3937 and Pectobacterium carotovorum WPP14 with supervised machine learning

Fri, 06/20/2014 - 20:00
Background: A wealth of genome sequences has provided thousands of genes of unknown function, but identification of functions for the large numbers of hypothetical genes in phytopathogens remains a challenge that impacts all research on plant-microbe interactions. Decades of research on the molecular basis of pathogenesis focused on a limited number of factors associated with long-known host-microbe interaction systems, providing limited direction for addressing this challenge. Computational approaches to identify virulence genes often rely on two strategies: searching for sequence similarity to known host-microbe interaction factors from other organisms, and identifying islands of genes that discriminate between pathogens of one type and closely related non-pathogens or pathogens of a different type. The former is limited to known genes, excluding vast collections of genes of unknown function found in every genome. The latter lacks specificity, since many genes in genomic islands have little to do with host interaction. Result: In this study, we developed a supervised machine learning approach that was designed to recognize patterns from large and disparate data types, in order to identify candidate host-microbe interaction factors. The soft rot Enterobacteriaceae strains Dickeya dadantii 3937 and Pectobacterium carotovorum WPP14 were used for development of this tool, because these pathogens are important on multiple high-value crops in agriculture worldwide and more genomic and functional data is available for the Enterobacteriaceae than for any other microbial family. Our approach achieved greater than 90% precision and a recall rate over 80% in 10-fold cross-validation tests. Application of the learning scheme to the complete genomes of these two organisms generated a list of roughly 200 candidates, many of which were previously not implicated in plant-microbe interaction and many of which are of completely unknown function. Conclusion: These lists provide new targets for experimental validation and further characterization, and our approach presents a promising pattern-learning scheme that can be generalized to create a resource to study host-microbe interactions in other bacterial phytopathogens.
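For readers unfamiliar with the evaluation terms in this abstract, the following is a minimal sketch of how precision, recall, and a 10-fold split are typically computed. It does not reproduce the authors' classifier or data; the function names and the toy labels are hypothetical, and the interleaved fold assignment is just one simple way to partition items.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def k_fold_indices(n_items, k=10):
    """Yield (train_idx, test_idx) pairs so that each item is tested exactly once.

    Items are assigned to folds in an interleaved fashion (item i goes to
    fold i % k). Real studies usually shuffle or stratify first.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Toy labels (not from the paper): 3 true positives in 5 genes.
truth = [1, 1, 1, 0, 0]
guess = [1, 1, 0, 1, 0]
print(precision_recall(truth, guess))
```

Precision answers "of the genes predicted to be interaction factors, how many really are?", while recall answers "of the known interaction factors, how many were recovered?". In cross-validation these are computed on the held-out fold each time and then aggregated.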

Genomic insights into the serine protease gene family and expression profile analysis in the planthopper, Nilaparvata lugens

Fri, 06/20/2014 - 20:00
Background: The brown planthopper (Nilaparvata lugens) is one of the most destructive rice plant pests in Asia. N. lugens causes extensive damage to rice by sucking rice phloem sap, which results in hopper burn (complete death of the rice plants). Despite its importance, little is known about the digestion, development and defense mechanisms of this hemimetabolous insect pest. In this study, we aim to identify the serine protease (SP) and serine protease homolog (SPH) genes, which form a large family in eukaryotes, because of their potential for multiple physiological roles. Having a fully sequenced genome for N. lugens allows us to perform in-depth analysis of the gene structures, reveal the evolutionary relationships and predict the physiological functions of SP genes. Results: The genome- and transcriptome-wide analysis identified 90 putative SP (65) and SPH (25) genes in N. lugens. Detailed gene information regarding the exon-intron organization, size, distribution and transcription orientation in the genome revealed that many SP/SPH loci are closely situated on the same scaffold, indicating the frequent occurrence of gene duplications in this large gene family. The gene expression profiles revealed new findings with regard to how SPs/SPHs respond to bacterial infections as well as their tissue-, development- and sex-specific expression. Conclusions: Our findings provide comprehensive gene sequence resources and expression profiles of the N. lugens SP and SPH genes, which give insights into the potential functional roles of these genes in biological processes including development, digestion, reproduction and immunity.

ShrimpGPAT: a gene and protein annotation tool for knowledge sharing and gene discovery in shrimp

Fri, 06/20/2014 - 20:00
Background: Although captured and cultivated marine shrimp constitute highly important seafood in terms of both economic value and production quantity, biologists have little knowledge of the shrimp genome, and this partly hinders their ability to improve shrimp aquaculture. To help improve this situation, the Shrimp Gene and Protein Annotation Tool (ShrimpGPAT) was conceived as a community-based annotation platform for the acquisition and updating of full-length complementary DNAs (cDNAs), Expressed Sequence Tags (ESTs), transcript contigs and protein sequences of penaeid shrimp and their decapod relatives, and for in-silico functional annotation and sequence analysis. Description: ShrimpGPAT currently holds quality-filtered molecular sequences of 14 decapod species (~500,000 records for six penaeid shrimp and eight other decapods). The database predominantly comprises transcript sequences derived both by traditional EST Sanger sequencing and, more recently, by massively parallel sequencing technologies. The analysis pipeline provides putative functions in terms of sequence homologs, gene ontologies and protein-protein interactions. Data retrieval can be conducted easily either by a keyword text search or by a sequence query via BLAST, and users can save records of interest for later investigation using tools such as multiple sequence alignment and BLAST searches against pre-defined databases. In addition, ShrimpGPAT provides space for community insights by allowing functional annotation with tags and comments on sequences. Community-contributed information will allow for continuous database enrichment, for improvement of functions and for other aspects of sequence analysis. Conclusions: ShrimpGPAT is a new, free and easily accessed service for the shrimp research community that provides a comprehensive and up-to-date database of quality-filtered decapod gene and protein sequences together with putative functional prediction and sequence analysis tools. An important feature is its community-based functional annotation capability, which allows the research community to contribute knowledge and insights about the properties of molecular sequences for better, shared functional characterization of shrimp genes. Regularly updated and expanded with data on more decapods, ShrimpGPAT is publicly available at

Genome sequences characterizing five mutations in RNA polymerase and major capsid of phages φA318 and φAs51 of Vibrio alginolyticus with different burst efficiencies

Fri, 06/20/2014 - 20:00
Background: The burst size of a phage is an important consideration prior to phage therapy and probiotic usage. The efficiency with which a phage bursts its host bacterium can result from molecular domino effects of the phage gene expression that takes control of the host machinery after infection. We found that two Podoviridae phages, φA318 and φAs51, burst a common host, V. alginolyticus, with different efficiencies of 72 and 10 PFU/bacterium, respectively. Presumably, the genome sequences can be compared to explain their differences in burst sizes. Results: Among genes in 42.5 kb genomes with a GC content of 43.5%, 16 out of 47 open reading frames (ORFs) were annotated with known functions, including RNA polymerase (RNAP) and phage structure proteins. Eleven strong phage promoters and three terminators were found. The consensus sequence for the new vibriophage promoters is AATAAAGTTGCCCTATA, where the AGTTG bases at positions -8 through -12 are important for the vibriophage specificity, especially a consensus T at position -9, which prevents the RNAPs of K1E, T7 and SP6 phages from transcribing the genes. The φA318 and φAs51 RNAPs shared their own specific promoters. In comparing the φAs51 and φA318 genomes, only two nucleotides were deleted in the RNAP gene and three mutated nucleotides were found in the major capsid genes. Conclusion: Detailed analyses of the residue alterations uncovered the effects of five nucleotide mutations on the functions of the RNAP and capsid proteins, which account for the host-bursting efficiency. The deletion of two nucleotides in the RNAP gene truncates the primary translation product because of an early stop codon, while a second translated peptide starting from GTG just at the deletion point can restore the polymerase activity. Of the three nucleotide mutations in the major capsid gene, the H53N mutation weakens the subunit assembly between capsomeres for the phage head; E313K reduces the fold binding between the beta-sheet and the Spine Helix inside the peptide.
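The frameshift mechanism mentioned in this abstract (a two-nucleotide deletion leading to an early stop codon) can be illustrated with a toy sequence. The sketch below is not based on the phage genomes; the sequence, the deletion position, and the helper names `first_stop` and `delete_bases` are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq, start=0):
    """Return the index of the first in-frame stop codon, or None if absent.

    Reads codons of three bases beginning at `start`.
    """
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def delete_bases(seq, pos, count):
    """Return `seq` with `count` bases removed starting at index `pos`."""
    return seq[:pos] + seq[pos + count:]


# Toy open reading frame: ATG GCT GAC CGT AAA TAA
original = "ATGGCTGACCGTAAATAA"
# Removing two bases (not a multiple of three) shifts the reading frame.
mutant = delete_bases(original, 3, 2)   # ATG TGA CCG TAA ATA A

print(first_stop(original))  # 15
print(first_stop(mutant))    # 3
```

Because the deletion length is not a multiple of three, every downstream codon is read in a new frame, and in this toy case a stop codon appears immediately after the start codon, truncating the product.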

Identification, characterization, and utilization of single copy genes in 29 angiosperm genomes

Fri, 06/20/2014 - 20:00
Background: Single copy genes are common across angiosperm genomes. With sufficiently high-quality sequenced genomes, the large-scale identification of single copy genes among multiple species is possible. Although some characteristics have been reported, our study provides novel insights into single copy genes. Results: We identified single copy genes across 29 angiosperm genomes. A significant negative correlation was found between the number of duplicate blocks and the number of single copy genes. We found that a considerable number of single copy genes are located in organelles, showing a preference for binding and catalytic activity. The analysis of effective number of codons (Nc) illustrates that single copy genes have a stronger codon bias than non-single copy genes in eudicots. The relatively high expression level of single copy genes was partially confirmed by the RNA-seq data, rather than by the Codon Adaptation Index (CAI). Unlike in most other species, a strongly negative correlation occurs between Nc and GC3 among single copy genes in grass genomes. When compared to all non-single copy genes, single copy genes show more conservation (as indicated by Ka and Ks values). However, our alternative splicing (AS) results reveal that selective constraints are weaker in single copy genes than in low copy family genes (1-10 in-paralogs) and stronger than in high copy family genes (>10 in-paralogs). Using concatenated shared single copy genes, we obtained a well-resolved phylogenetic tree. With the addition of intron sequences, the branch support is improved, but striking incongruences are also evident. Therefore, it is noteworthy that inclusion of intron sequences seems more appropriate for phylogenetic reconstruction at lower taxonomic levels. Conclusions: Our analysis provides insight into the evolutionary characteristics of single copy genes across 29 angiosperm genomes. The results suggest that there are key differences in evolutionary constraints between single copy genes and non-single copy genes. To some extent, these evolutionary constraints also show species-specific differences, especially between eudicots and monocots. Our preliminary evidence also suggests that the concatenated shared single copy genes are well suited for use in resolving phylogenetic relationships.
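GC3, one of the measures named in this abstract, is the fraction of codons whose third position is G or C. The following is a minimal sketch under simplifying assumptions: it counts every complete codon in the supplied coding sequence, whereas some published definitions exclude codons such as ATG, TGG, and stop codons. The function name `gc3` and the example sequences are hypothetical.

```python
def gc3(cds):
    """Return the fraction of codons with G or C at the third position.

    Simplified version: all complete codons are counted, and any trailing
    one or two bases that do not form a full codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy sequences, for illustration only.
print(gc3("ATGGCCAAG"))  # codons ATG GCC AAG, third bases G, C, G -> 1.0
print(gc3("ATAGCTAAC"))  # codons ATA GCT AAC, third bases A, T, C
```

A correlation between Nc and GC3, as reported for grass genomes, would be computed by evaluating both quantities per gene and then correlating the two lists across genes.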
