Analysis of Memory Constrained Live Provenance
Loading...
Can’t use the file because of accessibility barriers? Contact us with the title of the item, permanent link, and specifics of your accommodation need.
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Permanent Link
Abstract
We conjecture that meaningful analysis of large-scale provenance can be preserved by analyzing provenance data in limited memory while the data is still in motion; that the provenance needs not be fully resident before analysis can occur. As a proof of concept, this paper defi nes a stream model for reasoning about provenance data in motion for Big Data provenance. We propose a novel streaming algorithm for the backward provenance query, and apply it to the live provenance captured from agent-based simulations. The performance test demonstrates high throughput, low latency and good scalability, in a distributed stream processing framework built on Apache Kafka and Spark Streaming.
Description
Keywords
live data provenance, stream processing, agent-based model
Citation
Journal
DOI
Link(s) to data and video for this item
Relation
Rights
Type
Preprint