It’s not a data deluge – it’s worse than that
Date
2010-06-22
Abstract
IU was among the many organizations that developed the phrase “data deluge” to describe the prodigious capabilities of digital instruments to produce data. A deluge calls to mind an extremely heavy rain, or maybe being drenched by a large wave. Unfortunately, the situation we have is worse than that. The new capabilities of next-generation sequencing machines and digital video, together with the ability of scientists to put high-output devices in remote locations, make the data issue far more challenging than it has ever been. This talk focuses on two general areas of handling data issues: wide area filesystems and the movement of data across long distances; and the challenges of data management when data production rates simply exceed the capabilities of the network connecting the source to analysis facilities. Examples will be drawn from use of the IU Data Capacitor, now the most widely used globally accessible file system in the history of the TeraGrid, and from field studies with data sources ranging from the Antarctic ice cap to African villages to telescopes on remote mountains. Some successes and many emerging challenges will be discussed.
Description
Keynote presentation at the Third International Workshop on Data Intensive Distributed Computing (DIDC'10) held in conjunction with HPDC'10, Chicago IL.
Keywords
data management, high throughput, data storage
Type
Presentation