Analysis of Memory Constrained Live Provenance

dc.altmetrics.display: false (en)
dc.contributor.author: Peng, Chen
dc.contributor.author: Tom, Evans
dc.contributor.author: Beth, Plale
dc.date.accessioned: 2016-04-26T02:59:32Z
dc.date.available: 2016-04-26T02:59:32Z
dc.description.abstract: We conjecture that meaningful analysis of large-scale provenance can be preserved by analyzing provenance data in limited memory while the data is still in motion; that is, the provenance need not be fully resident before analysis can occur. As a proof of concept, this paper defines a stream model for reasoning about Big Data provenance in motion. We propose a novel streaming algorithm for the backward provenance query and apply it to live provenance captured from agent-based simulations. The performance test demonstrates high throughput, low latency, and good scalability in a distributed stream processing framework built on Apache Kafka and Spark Streaming. (en)
dc.description.sponsorship: the National Science Foundation under award number 1360463 (en)
dc.identifier.uri: https://hdl.handle.net/2022/20809
dc.language.iso: en_US (en)
dc.subject: live data provenance (en)
dc.subject: stream processing (en)
dc.subject: agent-based model (en)
dc.title: Analysis of Memory Constrained Live Provenance (en)
dc.type: Preprint (en)

Files

Original bundle

Name: streamProv.pdf
Size: 899.16 KB
Format: Adobe Portable Document Format