Data Processing Bundle

This page is used to discuss recommendations about data processing steps (managing, curation, extraction, mining, deriving, etc.), which will increasingly be carried out by automatic mechanisms (workflows, procedures) and which need to be kept separate from data publication and data archiving (if archiving means creating copies that are not actively used). We distinguish four types of statements:

  1. Suggestions which come from RDA WG outputs (RDA WG)
  2. Suggestions which emerge from RDA discussions (RDA)
  3. Suggestions from other initiatives
  4. Suggestions for RDA Recommendations (RDA REC) which will be the result of RDA interactions

If further statements from other initiatives relate to this bundle, we should add them.

PRC1. RDA-PP: A trustworthy repository must specify auditable practical policies for its various tasks, turn them into executable procedures and workflows, and systematically apply them in all cases to document provenance of all its digital objects.

PRC2. RDA: Workflows and procedures that create new digital objects (DOs) need to include software components that read the PID record and metadata of the existing object, obtain a new PID, create new metadata including provenance, associate both with the new DO, and upload it into a trustworthy repository.
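
The steps named in PRC2 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: `DigitalObject`, `InMemoryRepository`, `mint_pid` and `derive` are hypothetical names, the repository is a plain in-memory dictionary, and the PID is a random placeholder rather than one obtained from a real PID service.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DigitalObject:
    pid: str
    data: bytes
    metadata: dict


@dataclass
class InMemoryRepository:
    """Hypothetical stand-in for a trustworthy repository."""
    objects: dict = field(default_factory=dict)

    def resolve(self, pid):
        # Look up the stored object (data and metadata) for a PID.
        return self.objects[pid]

    def deposit(self, obj):
        self.objects[obj.pid] = obj


def mint_pid():
    # Placeholder only: a real workflow would request a PID from a PID service.
    return "pid:example/" + uuid.uuid4().hex


def derive(repo, source_pid, transform, step_name):
    """Create a new DO from an existing one and record its provenance."""
    source = repo.resolve(source_pid)          # read existing PID record and metadata
    new_data = transform(source.data)          # the actual processing step
    new_pid = mint_pid()                       # associate a new PID
    metadata = dict(source.metadata)           # copy, so the source stays untouched
    metadata["provenance"] = {
        "derived_from": source_pid,
        "process": step_name,
        "created": datetime.now(timezone.utc).isoformat(),
    }
    new_obj = DigitalObject(new_pid, new_data, metadata)
    repo.deposit(new_obj)                      # upload into the repository
    return new_obj
```

A workflow step would then call, for example, `derive(repo, "pid:example/src", bytes.upper, "uppercase")`; the returned object carries its own PID and a provenance entry pointing back to the source PID.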

PRC3. RDA: Annotations need to be created in a stand-off manner, and where suitable the Open Annotation format should be applied.
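
To make "stand-off" concrete, the following Python sketch builds an annotation that lives outside the annotated text and points into it by character offsets. It assumes the JSON-LD layout of the W3C Web Annotation Data Model, which grew out of the Open Annotation work; whether that serialization is the one intended here is an assumption. The function name `make_standoff_annotation`, the identifier `pid:example/doc1` and the sample text are hypothetical.

```python
import json


def make_standoff_annotation(target_pid, start, end, comment):
    """Build an annotation that references the target instead of modifying it."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment},
        "target": {
            "source": target_pid,  # hypothetical identifier of the annotated object
            "selector": {
                "type": "TextPositionSelector",
                "start": start,    # character offsets into the target text
                "end": end,
            },
        },
    }


# The annotated text itself is never changed; the annotation is stored separately.
text = "Temperature was 21.5 C at noon."
anno = make_standoff_annotation("pid:example/doc1", 16, 22, "Unit is degrees Celsius")

selector = anno["target"]["selector"]
annotated_span = text[selector["start"]:selector["end"]]
serialized = json.dumps(anno, indent=2)
```

Because the annotation only holds a reference and offsets, several independent annotation layers can be attached to the same digital object without creating modified copies of it.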



From these statements we can derive a number of recommendations that appear to be widely agreed:

REC1: (to come; will result from finding convergence in an open discussion)