Analysis of Memory Constrained Live Provenance
| dc.altmetrics.display | false | |
| dc.contributor.author | Peng, Chen | |
| dc.contributor.author | Tom, Evans | |
| dc.contributor.author | Beth, Plale | |
| dc.date.accessioned | 2016-04-26T02:59:32Z | |
| dc.date.available | 2016-04-26T02:59:32Z | |
| dc.description.abstract | We conjecture that meaningful analysis of large-scale provenance can be preserved by analyzing provenance data in limited memory while the data is still in motion; that is, the provenance need not be fully resident before analysis can occur. As a proof of concept, this paper defines a stream model for reasoning about provenance data in motion for Big Data provenance. We propose a novel streaming algorithm for the backward provenance query and apply it to the live provenance captured from agent-based simulations. The performance test demonstrates high throughput, low latency, and good scalability in a distributed stream processing framework built on Apache Kafka and Spark Streaming. | |
| dc.description.sponsorship | the National Science Foundation under award number 1360463 | |
| dc.identifier.uri | https://hdl.handle.net/2022/20809 | |
| dc.language.iso | en_US | |
| dc.subject | live data provenance | |
| dc.subject | stream processing | |
| dc.subject | agent-based model | |
| dc.title | Analysis of Memory Constrained Live Provenance | |
| dc.type | Preprint |