README FILE FOR EvidentialGene Software

Created by:
Donald G. Gilbert
1001 E. 3rd St.
Bloomington, IN, 47405
gilbertd@indiana.edu or gilbert.bionet@gmail.com

---------------------------------------
LOCATION

http://hdl.handle.net/2022/22691

---------------------------------------
DOI

https://doi.org/10.5967/wd0g-rb50

---------------------------------------
FOLDERS AND FILES LIST

evigene13may30.tar    2013-May, 4.4M

---------------------------------------
FILE INFORMATION

The primary code language is Perl, with additional Unix shell scripts, documents, and other files.

These tar files ('tar' = tape archive) contain the program scripts and documents of the EvidentialGene project. Extract the tar archive with the Unix 'tar' program, as

    tar -xf evigene.tar

into the current folder, preserving run permissions.

Run the Perl ".pl" scripts from the extracted evigene folder, as they work together as a package. E.g., for the Unix bash shell:

    export evigene=`pwd`/evigene
    $evigene/scripts/prot/tr2aacds.pl ..
    $evigene/scripts/evgmrna2tsa.pl ..

For Unix csh/tcsh, use "set evigene=`pwd`/evigene".

Most of the shell ".sh" scripts require editing for your cluster; consider them examples. The scripts have brief -help output, but most of their documentation is Perl POD inside the scripts; read the scripts. This is a complex package that includes my working scripts for several genome projects; some are now obsolete.

---------------------------------------
ABSTRACT

EvidentialGene is a genome informatics project for "Evidence Directed Gene Construction for Eukaryotes", for constructing high quality, accurate gene sets for animals and plants (any eukaryotes), developed by Don Gilbert at Indiana University, gilbertd at indiana edu. Construction refers to the combination of classical gene prediction and more recent gene assembly (de-novo and genome-assisted) methods.
The basic Evigene methods use the best available gene prediction and assembly software, and combine all evidence for genes (expressed sequences, genome assembly sequences, related-species protein sequences, and any other source) to annotate and score gene constructions. Over-produced constructions are then classified by gene evidence to find the best qualities per locus, using genome-aligned and gene-transcript-aligned (genome-free) locus identification.

---------------------------------------
RESEARCH QUESTION(S)

If you have questions, please contact gilbert.bionet@gmail.com

---------------------------------------
COPYRIGHT & LICENSING INFORMATION

This software is licensed for reuse under a Creative Commons Attribution 3.0 license.
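
---------------------------------------
EXAMPLE SETUP (SKETCH)

The extraction and setup steps described under FILE INFORMATION can be sketched as a single bash script. This is a minimal illustration, not part of the package: it assumes the archive named in the file list sits in the current folder and unpacks to an "evigene" subfolder, as the FILE INFORMATION text implies. The presence check at the end is an added convenience, not something the package provides.

```shell
#!/usr/bin/env bash
# Sketch only: unpack the EvidentialGene archive and set $evigene.
# Assumption: the archive is in the current folder and unpacks to ./evigene
archive="evigene13may30.tar"   # file name taken from the files list

if [ -f "$archive" ]; then
    # tar -xf extracts into the current folder, keeping run permissions
    tar -xf "$archive"
else
    echo "archive not found here: $archive"
fi

# The scripts are run as a package from the extracted folder
export evigene="$(pwd)/evigene"

# Check that the two scripts named in this README are present and executable
for script in scripts/prot/tr2aacds.pl scripts/evgmrna2tsa.pl; do
    if [ -x "$evigene/$script" ]; then
        echo "found: $evigene/$script"
    else
        echo "missing or not executable: $evigene/$script"
    fi
done
```

Once both scripts are reported as found, each can be run with its brief -help option to list its arguments before reading the POD in the script itself.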