Presentations
Permanent link for this collection: https://hdl.handle.net/2022/14534
Browsing Presentations by Author "Doak, Thomas G."
Harvesting Field Station Data: Automating Data Flow from Raspberry Pi Sensors to Collaborative Websites (Annual Meeting of the Organization of Biological Field Stations, 2018-09-22)
Sanders, Sheri; Guido, Emmanuel; Anderson, Jazzly; Slayton, Thomas; Doak, Thomas G.
Field stations increasingly leverage remote sensors for large-scale environmental data collection. Here we demonstrate a proof-of-concept workflow that runs from data collection on remote sensors to presentation of summary results on a remote, and therefore fast and stable, cloud server. Environmental data is collected via Raspberry Pis in several locations and streamed through low-bandwidth messaging to a server on XSEDE's Jetstream, housed in part at Indiana University. The Jetstream cloud server does all the heavy lifting: exporting the data into a database, running automatically updating summary scripts to produce graphs, and hosting a Drupal-based website to present the data to collaborators or the public. While we use compact data in our demo, larger databases can be backed up on XSEDE's Wrangler, a large-scale storage server also housed in part at Indiana University. The end product is automatic aggregation and backup of sensor data onto a stable website that does not require an in-house server or large on-site bandwidth. This workflow is packaged into a ready-to-use and publicly available Jetstream image, meaning researchers could use their own sensors and R code for custom graphs with very little setup. Alternatively, the image can be used to house and display larger-scale databases from other data types, such as audio recordings or photography.
Future work will develop the ability to "pick up" data via drone fly-over and to aggregate citizen science data from multiple sites.

The National Center for Genome Analysis Support (Poster) (2012-09)
Doak, Thomas G.; LeDuc, Richard; Wu, Le-Shin; Stewart, Craig A.; Henschel, Robert; Barnett, William K.

National Center for Genome Analysis Support (Poster) (2012)
Doak, Thomas G.; Wu, Le-Shin; Stewart, Craig A.; Henschel, Robert; Barnett, William K.

Navigating the Sequence Read Archive to identify crAssphage, an ubiquitous inhabitant of the human microbiome (Jim Holland Summer Science Research Program Poster Session, 2019-07-14)
Cai, Jasmine X.; Weathers, Jania G.; Leffler, Haley; Ganapaneni, Sruthi; Papudeshi, Bhavya; Sanders, Sheri; Doak, Thomas G.
The declining costs of genome sequencing and the growing amounts of genetic data are driving the field of genomics to become more integrated with computational analysis. High-performance computing (HPC) clusters are necessary to process the large amounts of data in genomic projects. However, many biologists lack background experience in working with HPC systems, which limits their ability to best address their research questions. The National Center for Genome Analysis Support (NCGAS) is an NSF-funded center that focuses on filling this gap by supporting researchers with training workshops, bioinformatics support on projects, and access to compute resources. As a byproduct of helping on research projects, we develop open-source workflows and make them available to the community. Here we present a workflow that will assist researchers in mining the Sequence Read Archive (SRA) to identify other environments or datasets that potentially contain a genome of interest, and to identify closely related genomes. As a proof of concept, we used two genomes to test the developed workflow.
We selected these two different genomes to ensure that the workflow is flexible enough to generate results in formats that aid further downstream analysis, depending on the research question. The pipeline will be made available to the research community, with documentation, through Jetstream, an NSF cloud computing platform.

Providing National Cyberinfrastructure to Biologists, esp. genomicists (2015-03-23)
Barnett, William K.; Doak, Thomas G.
The presentation outlines the science and research addressed by NCGAS, the resources NCGAS provides, and the near- to mid-term future of bioinformatics research.

A workflow to identify genomes in the Sequence Read Archive for phylogenomic analysis (American Society for Microbiology 2019, 2019-06-23)
Leffler, Haley; Ganapaneni, Sruthi; Papudeshi, Bhavya; Ganote, Carrie; Sanders, Sheri; Doak, Thomas G.
The declining costs of genome sequencing and the growing amounts of genetic data are driving the field of genomics to become more integrated with computational analysis. High-performance computing (HPC) clusters are necessary to process the large amounts of data in genomic projects. However, many biologists lack background experience in working with HPC systems, which limits their ability to best address their research questions. The National Center for Genome Analysis Support (NCGAS) is an NSF-funded center that focuses on filling this gap by supporting researchers with training workshops, bioinformatics support on projects, and access to compute resources. As a byproduct of helping on research projects, we develop open-source workflows and make them available to the community. Here we present a workflow that will assist researchers in mining the Sequence Read Archive (SRA) to identify other environments or datasets that potentially contain a genome of interest, and to identify closely related genomes. As a proof of concept, we used two genomes to test the developed workflow.
We selected these two different genomes to ensure that the workflow is flexible enough to generate results in formats that aid further downstream analysis, depending on the research question. The pipeline will be made available to the research community, with documentation, through Jetstream, an NSF cloud computing platform.