Terms from paper entered into RDA DFT Term Tool

19 Jan 2016

Hi, all,
I've entered the terms/definitions from our paper (
http://dx.doi.org/10.5281/zenodo.34542) into the RDA Data Foundations &
Terminology TeD-T terms tool (
http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page). This can be listed as
part of the implementation & dissemination plan of action.
Claire, I wasn't sure which terms cross-linked to which definitions in the
CASRAI research data domain glossary, but if you can let me know, I will
add those as references.
Terms entered in the RDA DFT TeD-T tool:
1. Data publishing
2. Data publishing workflows
3. Data journal (nb: format issues for internal links)
4. Data article
5. Data review
6. Data repository entry (nb: format issues for internal links)
Amy Nurnberger, Research Data Manager
Center for Digital Research and Scholarship
Columbia University / 212.851.2827
E-mail: ***@***.*** <***@***.***>
ORCID: 0000-0002-5931-072X
Twitter: @DataAtCU

  • Author: Leonardo Candela

    Date: 20 Jan, 2016

    One of the aspects puzzling me is the fact that all these definitions are not connected to any existing piece of work. I'm confident that there are other names that are potential "synonyms" of the selected ones, e.g. "data paper" seems to me more frequent than "data article" or "data descriptor". Evidence of this is in Candela, L., Castelli, D., Manghi, P. and Tani, A. (2015), Data journals: A survey. Journal of the Association for Information Science and Technology, 66: 1747–1762. doi: 10.1002/asi.23358

    By citing / referring to existing pieces of work the given definitions might be less "out of the blue" than they appear now.

    In addition to that, I do not know whether the authors believe in Wikipedia or not. BTW it might be a good idea to attack this data source to disseminate the terminology (although it is not shared yet). 

    Some pages are already there, e.g.



  • Author: Amy Nurnberger

    Date: 20 Jan, 2016

    Hi, Leonardo,
    Thank you for your thoughts on this. Given the MediaWiki platform of the
    RDA DFT TeD-T term tool, I encourage you to edit, enhance, open for
    discussion, and add references to the entries for these terms. Most
    certainly, please do add the related terms for "data paper" that are
    detailed in your article.
    Regarding your idea for editing (& adding) Wikipedia entries, that's great!
    It would be especially nice to add details/images regarding the data
    publishing workflow to the Data Publishing entry.
    Are there any Wikipedians that would like to volunteer for this?

  • Author: Gary Berg-Cross

    Date: 20 Jan, 2016

    > One of the aspects puzzling me is the fact that all these definitions are
    > not connected to any existing piece of work. I'm confident that there are
    > other names that are potential "synonyms" of the selected ones,
    One may put alternative definitions in the DFT term tool and discuss them.
    In earlier work on the core DFT definitions we had a synthesis document
    that discussed some of the variant definitions for things like digital
    objects. Below is that section of the Synthesis report showing what this
    consideration of alternatives feels like. You may want to use the tool to
    carry on such a discussion,
    1 Digital Object (DO)
    *A. Definition*
    *A digital object (DO) is represented by a bitstream, is referenced and

    by a persistent identifier and has properties being characterized by
    metadata. *
    *Note: As indicated we only talk about registered DOs in the context of
    this document. *
    *Note: Properties included in metadata include discovery, contextual,
    schema, rights, curation and provenance information. *
    *Note: A DO is said to be dynamic when the information content represented
    in a DO is changing for some period of time or even for indefinite
    *B. Elaboration*
    There are many alternative views and definitions out there; we just want to
    mention 7 of them:
    *Variant 1*
    Digital objects (or digital materials) refer to any item that is available
    digitally. (Wikipedia)
    *Variant 2*
    A digital object is composed of structured sequence of bits/bytes. As an
    object it is named. The bit sequence realizing the object can be identified
    & accessed by a unique and persistent identifier or by use of referencing
    attributes describing its properties. (in DFT Term Tool and from the
    Practical Policy WG).
    *Variant 3*
    Digital Object is also called a Digital Entity defined as
    “machine-independent data structure consisting of one or more elements in
    digital form that can be parsed by different information systems; the
    structure helps to enable interoperability among diverse information
    systems in the Internet.” (in DFT Term Tool)
    *Variant 4*
    The Fedora Commons architecture defines a generic digital object model that
    can be used to persist and deliver the essential characteristics for many
    kinds of digital content including documents, images, electronic books,
    multi-media learning objects, datasets, metadata and many others. This
    digital object model is a fundamental building block of the Content Model
    Architecture and all other Fedora-provided functionality. A Fedora object
    contains a persistent identifier. (Fedora Commons)
    *Variant 5*
    Digital objects are marked by a limited set of variable yet generic
    attributes such as editability, interactivity, openness and
    distributedness. As digital objects diffuse throughout the institutional
    fabric, these attributes and the information–based operations and
    procedures out of which they are sustained install themselves at the heart
    of social practice. (Kallinikos et al.)
    *Variant 6*
    Digital objects consist of multiple elements, each of which consists of a
    type-value pair. Each of the types is represented by identifier and can
    thereby be interrogated individually. Identifying the data structure
    itself, instead of a specific file or folder that may contain it, or
    perhaps the machine on which it was first made available, enables
    persistent information access that is decoupled from most aspects of the
    underlying technology. (Robert E. Kahn: http://hdl.handle.net/4263537/5044)
    *Variant 7*
    A Digital Object is an entity consisting of a sequence of bits, or a set of
    sequences of bits, having an associated unique and persistent identifier. A
    DO may be static or dynamic, or some combination thereof. An entity is then
    defined as: An entity is anything that has a separate and distinct
    existence that can be uniquely identified. (Kahn et al.)
    It is important to note that not all communities insist on a PID and
    registration as definitional of a DO. But (a) in this document we are only
    making statements on the sphere of registered data and (b) many do and the
    value of that has been a theme in some of this work and that of other RDA
    Fedora Commons, as cited in variant 4, offers an implementation of a
    specific DO model which is central to its architecture and allows users to
    bundle a number of content streams, to give it an identifier and to
    associate metadata descriptions with the bundle and its components which
    are themselves typed streams. So it fits with the definition we have
    The first variant is a highly condensed view of DOs and may lack enough
    detail to support automating some aspects of DO management. The second
    variant specifies some additional metadata for a DO and also takes a more
    flexible approach to IDs recognizing the use of local IDs. The third
    variant is a process view in part, since it focuses on a DO’s construction
    principle and process characteristics, and does not tell us what about
    the structure is needed to enable interoperability. More information may be
    needed to help automate interoperability. The construction principle is
    reflected in the definition in an abstract way and the attribute
    descriptions within ID or metadata records will enable processing. The
    second variant makes use of the words “is being composed” instead of “is
    a”. Since we find “is a” simpler and more direct, we opt for these words.
    Having a name, as is suggested in variant 2, is one of its properties that
    can be found in the ID and/or metadata records; therefore it does not need to be
    specified in the definition. Variant 2 does not require an ID but also
    allows using “referencing attributes” as identification basis. Here,
    however, we would clearly like to speak of externally registered persistent
    and unique identifiers, since this will be the only way to register DOs as
    an explicit step, as is required in the domain of registered
    digital data. Variant 5 adds abstract and useful requirements which are
    essential for accessibility, but is neutral as to the role of an ID. It
    does not tell us how to document a DO to do this. Variant 6 refers to the
    internal structure of a DO and the importance of types that describe the
    elements of the DO independent of its creation contexts. In this respect it comes
    close to the Fedora object model in Variant 4. Variant 7 includes the
    possibility of changing content, introduces the term "entity" and
    emphasizes the importance of being uniquely identified.
    *C. Conclusion*
    *We can conclude that the above definition is in agreement with the 7
    variants except for the explicit registration which will be essential in
    our growing domain of DOs. *

    Some repositories include passport-like information with the PID which goes
    beyond pure referencing.

    This can be seen by analogy with the Internet. There are many nodes out there
    that do not have an IP address, but they cannot participate in the Internet
    Gary Berg-Cross, Ph.D.
    Member, Ontolog Board of Trustees
    Independent Consultant
    Potomac, MD
