It’s not a data deluge – it’s worse than that

Date

2010-06-22

Abstract

IU was among the many organizations that developed the phrase “data deluge” to describe the prodigious capabilities of digital instruments to produce data. A deluge calls to mind an extremely heavy rain, or maybe being drenched by a large wave. Unfortunately, the situation we face is worse than that. The new capabilities of next-generation sequencing machines and digital video, together with scientists' ability to place high-output devices in remote locations, make the data problem far more challenging than it has ever been. This talk focuses on two general areas of data handling: first, wide area filesystems and the movement of data across long distances; second, the challenges of data management when data production rates simply exceed the capacity of the network connecting the source to analysis facilities. Examples will be drawn from use of the IU Data Capacitor, now the most widely used globally accessible file system in the history of the TeraGrid, and from field studies with data sources ranging from the Antarctic ice cap to African villages to telescopes on remote mountains. Some successes and many emerging challenges will be discussed.

Description

Keynote presentation at the Third International Workshop on Data Intensive Distributed Computing (DIDC'10), held in conjunction with HPDC'10, Chicago, IL.

Keywords

data management, high throughput, data storage

Type

Presentation
