📜 Execution metadata_provo.xml: Execution Provenance Metadata

Among the metadata automatically generated during a CCP method execution, the metadata_provo.xml file plays a central role in ensuring traceability, reproducibility, and adherence to the FAIR principles (Findable, Accessible, Interoperable, Reusable).

This file is created at runtime and captures a structured record of the execution process using the W3C PROV-O ontology, extended with CCP-specific terms (d4s: namespace).

Purpose and FAIR Alignment

The metadata_provo.xml document provides a detailed and machine-readable description of:

  • What was executed: method name, version, and parameters

  • Which infrastructure and runtime (Docker image) were used

  • Who triggered the execution (user and agent)

  • Where the execution took place (VRE context)

  • When it started and ended

  • What was produced (standard output, error, results)

This aligns with the FAIR principles by:

  • Ensuring findability of methods and outputs through persistent identifiers

  • Promoting accessibility via downloadable and interoperable XML

  • Supporting interoperability through PROV-O semantics

  • Enabling reuse through complete and contextual provenance information

File Availability

  • Generated automatically with each execution

  • Included in the output.zip if archiving is enabled

  • Stored and referenced in archived executions in the CCP interface
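
As a minimal sketch of how a downstream script might pick the provenance file out of a downloaded archive, the Python snippet below searches a ZIP file for a member named metadata_provo.xml. The function name read_provenance and the folder layout used in the demo archive are assumptions made for illustration; the actual position of the file inside output.zip is not specified here, which is why the lookup matches on the base name only.

```python
import io
import zipfile


def read_provenance(zip_source, filename="metadata_provo.xml"):
    """Return the raw bytes of the provenance file, or None if it is absent.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            # Match on the base name, since the folder layout is an assumption.
            if member.rsplit("/", 1)[-1] == filename:
                return archive.read(member)
    return None


# Demo with an in-memory archive; the "auxiliary/" folder is hypothetical.
demo_archive = io.BytesIO()
with zipfile.ZipFile(demo_archive, "w") as archive:
    archive.writestr("auxiliary/metadata_provo.xml", b"<placeholder/>")
    archive.writestr("results/output.txt", b"hello")

provenance_bytes = read_provenance(demo_archive)
print(provenance_bytes)
```

Returning bytes rather than a decoded string leaves encoding detection to the XML parser, which reads it from the XML declaration when one is present.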

Structure Overview

The XML file contains:

  • <prov:activity>: The execution session, with start/end time, operator name, version, status, and the triggering agent

  • <prov:entity>: Each input, output, parameter, and resource is represented as an entity, with detailed types and descriptions

  • <prov:person> and <prov:softwareAgent>: The user and system initiating the process

Each <prov:entity> includes:

  • A prov:value indicating its actual content or reference

  • A link to the activity that used or generated it

  • Type annotations such as d4s:IMPORTED, d4s:COMPUTED, d4s:text/plain, etc.
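
The entity structure described above can be read with Python's standard xml.etree.ElementTree module. The sketch below is illustrative only: the sample document, the entity identifiers (d4s:example_input, d4s:example_output), the file names, and the d4s namespace URI are placeholders invented for the example, not real CCP output. Only the prov namespace URI is the standard W3C one.

```python
import xml.etree.ElementTree as ET

PROV = "http://www.w3.org/ns/prov#"
NS = {"prov": PROV}

# Illustrative sample; identifiers, values, and the d4s URI are hypothetical.
ENTITY_SAMPLE = """<prov:document
    xmlns:prov="http://www.w3.org/ns/prov#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:d4s="http://example.org/d4s#">
  <prov:entity prov:id="d4s:example_input">
    <prov:type xsi:type="xsd:QName">d4s:IMPORTED</prov:type>
    <prov:value>input.tif</prov:value>
  </prov:entity>
  <prov:entity prov:id="d4s:example_output">
    <prov:type xsi:type="xsd:QName">d4s:COMPUTED</prov:type>
    <prov:type xsi:type="xsd:QName">d4s:text/plain</prov:type>
    <prov:value>stdout.txt</prov:value>
  </prov:entity>
</prov:document>"""


def list_entities(xml_text):
    """Collect id, value, and type annotations for every prov:entity."""
    root = ET.fromstring(xml_text)
    entities = []
    # iter() also reaches entities nested inside other elements.
    for entity in root.iter(f"{{{PROV}}}entity"):
        value = entity.findtext("prov:value", namespaces=NS)
        types = [(t.text or "").strip() for t in entity.findall("prov:type", NS)]
        entities.append({
            "id": entity.get(f"{{{PROV}}}id"),
            "value": value.strip() if value else None,
            "types": types,
        })
    return entities


entities = list_entities(ENTITY_SAMPLE)
for item in entities:
    print(item["id"], item["value"], item["types"])
```

Type annotations are kept as plain strings because their prefixes are resolved by the document's own namespace declarations, which this sketch does not interpret.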

🧪 Example Snippet

<prov:activity id="activity-...">
  <prov:startTime>2025-03-25T09:40:54.797Z</prov:startTime>
  <prov:endTime>2025-03-25T09:50:30.136Z</prov:endTime>
  <prov:type xsi:type="xsd:QName">d4s:computation</prov:type>
  <prov:softwareAgent prov:id="d4s:itineris.d4science.org"/>
  <prov:person prov:id="d4s:alfredo.oliviero"/>
  <prov:entity prov:id="d4s:operator_name">
    <prov:value>GDAL - Raster TIFF Details - Workspace</prov:value>
  </prov:entity>
</prov:activity>
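
To show how a snippet like the one above could be consumed programmatically, the Python sketch below extracts the start and end times, the triggering user, and the operator name, then computes the run duration. It makes several assumptions: the activity is wrapped in a prov:document root that declares the namespaces, the d4s namespace URI is a placeholder, and the activity identifier d4s:example-activity is hypothetical (the real identifier is elided in the snippet). The function name summarize_activity is also illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

PROV = "http://www.w3.org/ns/prov#"
NS = {"prov": PROV}
PROV_ID = f"{{{PROV}}}id"

# Wrapper element, d4s URI, and activity id are assumptions for this sketch.
ACTIVITY_SAMPLE = """<prov:document
    xmlns:prov="http://www.w3.org/ns/prov#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:d4s="http://example.org/d4s#">
  <prov:activity prov:id="d4s:example-activity">
    <prov:startTime>2025-03-25T09:40:54.797Z</prov:startTime>
    <prov:endTime>2025-03-25T09:50:30.136Z</prov:endTime>
    <prov:type xsi:type="xsd:QName">d4s:computation</prov:type>
    <prov:softwareAgent prov:id="d4s:itineris.d4science.org"/>
    <prov:person prov:id="d4s:alfredo.oliviero"/>
    <prov:entity prov:id="d4s:operator_name">
      <prov:value>GDAL - Raster TIFF Details - Workspace</prov:value>
    </prov:entity>
  </prov:activity>
</prov:document>"""


def parse_timestamp(text):
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so it is rewritten as an explicit UTC offset.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def summarize_activity(xml_text):
    """Return timing, user, and operator name for the first prov:activity."""
    root = ET.fromstring(xml_text)
    activity = root.find(".//prov:activity", NS)
    if activity is None:
        raise ValueError("no prov:activity element found")
    start = parse_timestamp(activity.findtext("prov:startTime", namespaces=NS))
    end = parse_timestamp(activity.findtext("prov:endTime", namespaces=NS))
    person = activity.find("prov:person", NS)
    operator = None
    for entity in activity.iter(f"{{{PROV}}}entity"):
        if entity.get(PROV_ID) == "d4s:operator_name":
            operator = entity.findtext("prov:value", namespaces=NS)
    return {
        "user": person.get(PROV_ID) if person is not None else None,
        "operator": operator,
        "duration_seconds": (end - start).total_seconds(),
    }


summary = summarize_activity(ACTIVITY_SAMPLE)
print(summary)
```

The sketch assumes both timestamps are present; a production script would need to handle executions that are still running or that failed before an end time was recorded.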

💡 Notes

  • The file is not meant for manual editing

  • It is especially useful in automated workflows, scientific validation, and long-term preservation

  • Fields such as d4s:status, d4s:VRE, and d4s:operator_description offer additional context

✅ CCP-generated provenance allows external tools, auditors, or collaborators to reconstruct not only what was done, but also how, where, and by whom, making it a key enabler of transparent and reproducible science.