Abstract:
Managing the profusion and accumulated volume of life-science data is cumbersome; moving the data can require anything from shipping a hard drive to paying a graduate student to babysit transfers. Indiana University’s Data Capacitor solves this problem by exporting a high-performance Lustre file system across wide area networks to multiple locations. A mounted file system lets researchers run simple, familiar commands without having to contend with special-purpose data transfer tools. Moreover, multiple mounts let researchers compute against their data from anywhere. To meet the insatiable bandwidth demands of life scientists, network infrastructure providers are increasingly offering 100 Gigabit circuits. IU recently used Lustre across a 100 Gigabit network spanning 2,300 miles to demonstrate application performance across a great distance. This presentation will describe the Data Capacitor cyberinfrastructure and associated work, explore future use cases applicable to bioinformatics, and explain how the National Center for Genome Analysis Support (NCGAS) at Indiana University intends to integrate the Data Capacitor into its workflows.