Provenance capture and use in a satellite data processing pipeline. Jensen, S., Plale, B., Aktas, M., Luo, Y., Chen, P., & Conover, H. IEEE Transactions on Geoscience and Remote Sensing, 2013.
With the interdependencies that exist between data in a scientific processing pipeline, the ability to track the provenance of the scientific process through multiple stages is necessary to determining the usability of the resulting data product. In this paper, we study the capture of provenance from an existing NASA instrument ingest pipeline. Since instrumenting the scientific code for a production system is not feasible, we show how provenance events can be scavenged from log files to generate detailed provenance graphs. Through extensions to the Karma provenance system, which have been implemented on a test instance of the AMSR-E production data pipeline, we determine that when the volume of provenance information is high, provenance graph visualizations provide a good tool for monitoring the ingest pipeline and identifying processing differences in ways not seen before. Two novel uses of provenance that we present in this paper are comparisons between processing runs and forward provenance for viewing downstream dependencies.
@article{Jensen2013,
 title = {Provenance capture and use in a satellite data processing pipeline},
 type = {article},
 year = {2013},
 volume = {51},
 id = {059f02bc-7bda-3180-893c-1a2a28f14a82},
 created = {2019-10-01T17:20:45.021Z},
 file_attached = {false},
 profile_id = {42d295c0-0737-38d6-8b43-508cab6ea85d},
 last_modified = {2019-10-01T17:23:18.332Z},
 read = {false},
 starred = {false},
 authored = {true},
 confirmed = {true},
 hidden = {false},
 citation_key = {Jensen2013},
 folder_uuids = {73f994b4-a3be-4035-a6dd-3802077ce863},
 private_publication = {false},
 abstract = {With the interdependencies that exist between data in a scientific processing pipeline, the ability to track the provenance of the scientific process through multiple stages is necessary to determining the usability of the resulting data product. In this paper, we study the capture of provenance from an existing NASA instrument ingest pipeline. Since instrumenting the scientific code for a production system is not feasible, we show how provenance events can be scavenged from log files to generate detailed provenance graphs. Through extensions to the Karma provenance system, which have been implemented on a test instance of the AMSR-E production data pipeline, we determine that when the volume of provenance information is high, provenance graph visualizations provide a good tool for monitoring the ingest pipeline and identifying processing differences in ways not seen before. Two novel uses of provenance that we present in this paper are comparisons between processing runs and forward provenance for viewing downstream dependencies.},
 bibtype = {article},
 author = {Jensen, S. and Plale, B. and Aktas, M.S. and Luo, Y. and Chen, P. and Conover, H.},
 doi = {10.1109/TGRS.2013.2266929},
 journal = {IEEE Transactions on Geoscience and Remote Sensing},
 number = {11}
}
