Re: [rda-datafabric-ig][rda-collection-wg] Re: [rda-datafabric-ig][rda-dft][rda-collection-wg] Re: [rda-datafabric-ig][rda-collection-wg] Some thoughts on "Data Aggregations" terminology & concepts

19 Apr 2016

Dear Jeremy, Jeff, Gary,
You are right that the purpose of the collection definition in the
collection WG is to set a minimum bar: to be as specific as necessary
so that we can eventually outline an API that is able to handle
specific queries on collections.
The question of whether DOs might be identified by a formal ID, by a
query, or by some other method is not yet resolved in this context, and
the idea of constructing a collection by some function is rather new in
the collection WG.
The decision to define the collection as multiple sets (one of
identifiers, one of links, and one of metadata), rather than simply as
a set of digital objects, came precisely from these kinds of gaps in
the current definitions of identifiers and DOs. To circumvent this
ambiguity, at least for our purposes, it seems better to be more
explicit here and describe the different sets used.
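
To make the multi-set view concrete, here is a minimal Python sketch.
It is only an illustration under my own assumptions: the class name
`Collection`, the field names `member_ids`, `links` and `metadata`, the
`hasMember` relation and the example identifiers are all hypothetical
and not taken from any WG document.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical sketch: a collection described by three explicit sets."""

    pid: str                                        # identifier of the collection itself
    member_ids: set = field(default_factory=set)    # identifiers of the members
    links: set = field(default_factory=set)         # (source, relation, target) triples
    metadata: dict = field(default_factory=dict)    # descriptive key/value metadata

    def add_member(self, member_id, relation="hasMember"):
        """Record the member's identifier and the link that ties it to the collection."""
        self.member_ids.add(member_id)
        self.links.add((self.pid, relation, member_id))


# Example identifiers are placeholders, not real PIDs.
col = Collection(pid="example/collection-1")
col.add_member("example/object-a")
col.add_member("example/object-b")
col.metadata["title"] = "Example collection"
```

Keeping identifiers, links and metadata apart means the sketch does not
have to say what a member identifier resolves to, which is the
ambiguity described above.
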
My favourite definition of a collection would actually be: A collection
is a digital object which is identified by a PID and consists of a set
of PIDs/IDs. Full stop.
This would be a really nice definition, wouldn't it? But it raises a
lot of difficult questions that we should probably avoid at this point
in time.
For instance, it would imply a definition of identifiers that allows
each identifier to carry multiple (typed) identifications. For
instance, one identification identifies the DO itself, another
identifies the metadata, and a third points to, let's say, a previous
version or a collection of dependent citations. But this would be rather
restrictive for the choice of identifier systems, since not all of them
could still be used, which is a political question.
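
The idea of an identifier with several typed identifications could be
sketched roughly as follows. This is only an illustration under my own
assumptions: `TypedIdentifier`, `targets`, `resolve`, the type names and
the example.org locations are hypothetical and do not describe any
existing identifier system.

```python
from dataclasses import dataclass, field


@dataclass
class TypedIdentifier:
    """Hypothetical sketch: one identifier carrying several typed identifications."""

    pid: str
    targets: dict = field(default_factory=dict)   # type name -> what it identifies

    def resolve(self, kind="object"):
        """Return the target registered for the requested identification type."""
        if kind not in self.targets:
            raise KeyError(f"{self.pid} has no identification of type {kind!r}")
        return self.targets[kind]


# Placeholder values; a real system would define its own types and targets.
pid = TypedIdentifier(
    pid="example/object-a",
    targets={
        "object": "https://example.org/data/a",
        "metadata": "https://example.org/meta/a",
        "previous_version": "example/object-a-v0",
    },
)
object_location = pid.resolve("object")
metadata_location = pid.resolve("metadata")
```

An identifier system that can store only a single target per identifier
could not support `resolve` with more than one type, which is the
restriction mentioned above.
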
Furthermore, it would need a clear view of how a DO constructed by a
query or some other method can be identified by a PID, which is still
an open question. We need to discuss this in the collection WG anyway.