Analysis of Memory Constrained Live Provenance
dc.altmetrics.display | false | en |
dc.contributor.author | Chen, Peng | |
dc.contributor.author | Evans, Tom | |
dc.contributor.author | Plale, Beth | |
dc.date.accessioned | 2016-04-26T02:59:32Z | |
dc.date.available | 2016-04-26T02:59:32Z | |
dc.description.abstract | We conjecture that meaningful analysis of large-scale provenance can be preserved by analyzing provenance data in limited memory while the data is still in motion; that the provenance need not be fully resident before analysis can occur. As a proof of concept, this paper defines a stream model for reasoning about provenance data in motion for Big Data provenance. We propose a novel streaming algorithm for the backward provenance query, and apply it to the live provenance captured from agent-based simulations. The performance test demonstrates high throughput, low latency and good scalability in a distributed stream processing framework built on Apache Kafka and Spark Streaming. | en |
dc.description.sponsorship | the National Science Foundation under award number 1360463 | en |
dc.identifier.uri | https://hdl.handle.net/2022/20809 | |
dc.language.iso | en_US | en |
dc.subject | live data provenance | en |
dc.subject | stream processing | en |
dc.subject | agent-based model | en |
dc.title | Analysis of Memory Constrained Live Provenance | en |
dc.type | Preprint | en |