Technical reports

  • Item
    MCB100147, Genome Informatics for Animals and Plants
    (2017-07) Gilbert, Don
    Renewal of this XSEDE Genome Informatics for Animals and Plants project will facilitate the accurate discovery and reconstruction of animal and plant genes, in current and future genomics collaborations, including those by this author and those independently undertaken. Precision genomics is essential in medicine, environmental health, sustainable agriculture, and research in biological sciences. Yet the popular genome informatics methods lag behind the high levels of accuracy and completeness in gene construction that are attainable with today's accurate RNA-seq data. EvidentialGene is a genome informatics pipeline for gene construction that has a measurably high accuracy and completeness rate across a range of animals and plants. The pipeline's algorithm is simple and robust compared to gene modeling pipelines, and it often outperforms their gene reconstructions.
  • Item
    MCB100147, Genome Informatics for Animals and Plants
    (2016-04) Gilbert, Don
    Renewal of this XSEDE Genome Informatics for Animals and Plants project will facilitate the accurate discovery and reconstruction of animal and plant genes, in current and future genomics collaborations, including those by this author and those independently undertaken. Precision genomics is essential in medicine, environmental health, sustainable agriculture, and biological research. Yet popular genome informatics methods lag behind the high levels of accuracy and completeness in gene construction that are attainable with current RNA-seq data. EvidentialGene is a genome informatics pipeline for gene construction that has a measurably high accuracy and completeness rate, for insects, ticks and crustaceans to crop plants and trees, to fishes and other vertebrates. It uses big data from gene sequencers, generating bigger gene sets than alternate methods, then efficiently reduces those into accurate species gene sets using biological criteria of protein codes and orthology.
  • Item
    MCB10014, Genome Informatics for animals and plants related NSF Award: 0640462, Shared genome database informatics and cyberinfrastructure
    (2014-10) Gilbert, Don
    The research focus is developing and using shared cyberinfrastructure for assembly, annotation and comparative analysis of new eukaryote genomes. This sixth year has produced substantial results with EvidentialGene: 1. A complete gene set and genome annotation for the environmental, population genomic killifish Fundulus heteroclitus; 2. A complete, finished gene set for the environmental genomic model water flea Daphnia magna; 3. Assistance to the loblolly pine genome project [1] with transcript assembly methods and computations; 4. Draft gene assemblies for honey bee (Apis mel.) and deer tick (Ixodes scap.) of significance for agricultural and health improvements. In conjunction with gene set production, new algorithms for merging methods of transcriptome and genome construction have been conceived, developed and implemented in the EvidentialGene code set for these projects.
  • Item
    MCB10014, Genome Informatics for animals and plants related NSF Award: 0640462, Shared genome database informatics and cyberinfrastructure
    (2013-07) Gilbert, Don
    The primary research focus is developing and using shared cyberinfrastructure for assembly, annotation and comparative analysis of new organism genomes. This fifth year effort has produced substantial results with EvidentialGene, introduced in prior reports. In conjunction with this software engineering, it has been used to produce well-annotated gene sets for several animal and plant species. An engineering discovery has been made and substantiated: that gene construction from mRNA-seq data is now surpassing genome-based gene predictions in biological accuracy. This has ramifications for many areas of biosciences and related health and agriculture fields that rely on accurate gene information from animals and plants. The software developed in this project is now being used by others to advance their gene discovery for a range of organisms.
  • Item
    MCB10014, Genome Informatics for animals and plants related NSF Award: 0640462, Shared genome database informatics and cyberinfrastructure
    (2012-06) Gilbert, Don
    The primary research focus is developing and using shared cyberinfrastructure for assembly, annotation and comparative analysis of new organism genomes. This fourth year effort has refined the genome/gene annotation software pipeline EvidentialGene, introduced in the prior annual report. In conjunction with this software engineering, it has been used to produce well-annotated gene sets for Daphnia magna waterflea, pea aphid, Nasonia jewel wasp, and Theobroma cacao chocolate tree. This work uses the NSF-funded shared cyberinfrastructure of TeraGrid/XSEDE and the National Center for Genome Analysis Support.