Data Foundation and Terminology: Data Models
I hope that summer is going well and look forward to seeing you all at the 2nd RDA Plenary. The DFT WG is progressing and we will very shortly have a Models paper, led by Peter Wittenburg, out for review.
At the Gothenburg meeting the DFT WG discussed soliciting input on terms from other WGs, and at this point this input would be very useful. Our next step will involve analysis of models and terms to bring some things into harmony. We would therefore like to ask if you have some candidate terms/concepts from your preliminary work that you would like us to include in our work.
Thoughts and input can be sent to me and Peter W and, of course, posted on the RDA site as you choose. Discussion is likely to take place there over the next month or so in preparation for the Plenary.
Best wishes and thanks in anticipation
Gary Berg-Cross, Ph.D.
The file referenced by Gary is now online in the filedepot; you can get it with the link below.
Author: Reagan Moore
Date: 23 Jul, 2013
I read through the data model descriptions, which primarily focused on identification and access methods for digital objects.
I am interested in a generalization of the data model that is applied within the iRODS data grid. The challenge is that the environment that is used to manage the digital object is equally important. We had to consider name spaces for both the environment in which the digital object is managed as well as a name space for the digital object. The trivial case is the need for a name space for users, which is required if access controls are going to be enforced.
The generalized model is to consider:
This made it possible to develop name spaces for digital objects, collections, users, storage systems, and policies.
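As a rough illustration of the point about name spaces (this is not the iRODS implementation or API), the sketch below keeps one independent name space per kind of entity and shows why an access control needs a user name space to refer to. The class `Namespaces`, its methods, and the example names are all hypothetical.

```python
from dataclasses import dataclass, field

# Entity kinds taken from the post; everything else here is hypothetical.
KINDS = ("digital_object", "collection", "user", "storage_system", "policy")


@dataclass
class Namespaces:
    """One independent name space per kind of entity."""
    names: dict = field(default_factory=lambda: {kind: set() for kind in KINDS})
    # Maps a digital object name to the set of user names allowed to read it.
    readers: dict = field(default_factory=dict)

    def register(self, kind: str, name: str) -> None:
        if kind not in self.names:
            raise ValueError(f"unknown name space: {kind}")
        if name in self.names[kind]:
            raise ValueError(f"{name!r} already exists in {kind}")
        self.names[kind].add(name)

    def grant_read(self, user: str, obj: str) -> None:
        # An access control can only refer to names that exist in both
        # the user name space and the digital object name space.
        if user not in self.names["user"]:
            raise KeyError(f"unknown user: {user}")
        if obj not in self.names["digital_object"]:
            raise KeyError(f"unknown digital object: {obj}")
        self.readers.setdefault(obj, set()).add(user)

    def can_read(self, user: str, obj: str) -> bool:
        return user in self.readers.get(obj, set())


ns = Namespaces()
ns.register("user", "alice")
ns.register("digital_object", "survey-2013.csv")
ns.grant_read("alice", "survey-2013.csv")
```

Granting access to a user who was never registered raises an error here, which is the trivial case described above: without a user name space there is nothing for an access control to point at.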
We could then impose:
Author: Chris Morris
Date: 26 Jul, 2013
When a paper is retracted, it would be great if some appropriate markup were supplied to future lookups of the papers that cited it.
This simple use case is a reminder that the life cycle of research data is a little richer than that of data in general. Some extra information is needed to support the management of good research practice, and the appropriate steps when misconduct is suspected:
- links from results to researchers' declaration of interests
- a link to details of the funding source (statistics show that a study of the efficacy of a drug has different significance if publicly funded than if funded by the manufacturer)
- if there were human subjects, links to the consent given and the report of the ethical review
- a link to any retraction that applies
BioMedBridges is discussing these questions, but we are some way from a proposed ontology.
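Until an ontology exists, the retraction use case and the links listed above can be sketched as a plain record. This is only an illustration: the `Paper` class, its field names, and the identifiers are hypothetical and are not taken from BioMedBridges or any metadata standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Paper:
    pid: str                                   # hypothetical identifier
    cites: list = field(default_factory=list)  # identifiers of cited papers
    declaration_of_interests: Optional[str] = None
    funding_source: Optional[str] = None
    consent_record: Optional[str] = None       # only if there were human subjects
    ethical_review: Optional[str] = None
    retraction: Optional[str] = None           # link to a retraction notice, if any


def retracted_citations(paper, papers):
    """Return identifiers of directly cited papers that carry a retraction link."""
    return [pid for pid in paper.cites
            if pid in papers and papers[pid].retraction is not None]


papers = {
    "A": Paper(pid="A", retraction="retraction-notice-for-A"),
    "B": Paper(pid="B", cites=["A"]),
    "C": Paper(pid="C", cites=["B"]),
}
print(retracted_citations(papers["B"], papers))  # ['A']
print(retracted_citations(papers["C"], papers))  # []
```

The lookup only follows direct citations, so paper C is not flagged even though it cites a paper that relied on retracted work. Whether markup should propagate further is a policy question the sketch leaves open.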
Author: Gary Berg-Cross
Date: 29 Jul, 2013
Thank you for this comment, which does extend some of the thinking on documentation and markup/annotation of research data. My first thought is that the additional "markup" would be an extension of a metadata standard such as the Dublin Core with the items you cite such as "declaration of interests."
This is a topic for that WG to consider and one of us should cross-post it to them, and we can raise it at the 2nd Plenary when cross-group discussions take place.
Within a standard template like the Dublin Core there is a very general domain category. Different domains, such as BioMed, may have a need for some specific documentation, per your example. Thus the standard template may need to be further extended in some systematic way for different research fields.
Author: Peter Wittenburg
Date: 25 Aug, 2013
During the last few weeks we worked on an analysis of the models that have been presented so far. We have not yet managed to include Reagan's model ideas; doing so will be the next step in revising DM1 and DM2.
We are working on the following documents and whenever they are published via this forum, people are welcome to respond.
Data Models 1: Overview uploaded earlier - version 1
Data Models 2: Analysis (of data models) now uploaded - version 0.2
Data Models 3: Analysis of Workflows to come
Data Models 4: Synthesis to come - will need thorough and open discussion
Data Models 5: Terminology to come
Everyone interested is invited to comment on the DM 2 document.
We are planning to discuss the documents
- at the coming DFT virtual session 27.9.2013 at 4 pm CET
- at the plenary DFT session
You can find all uploaded DFT documents at the following link:
Author: Gary Berg-Cross
Date: 27 Aug, 2013
You can find the latest DFT WG document on Model Analysis in the File Repository.
The exact link is:
Author: Gary Berg-Cross
Date: 06 Sep, 2013
We have had some interesting exchanges on email about PIDs and such. If interested you should be able to read these on the DFT email archive at:
Author: Simon Cox
Date: 17 Sep, 2013
Perhaps a review of some of the work on registry-repository models could provide some useful insight.
I suggest looking at the 'Procedures for registration' standard from ISO/TC 211
ISO 19135 http://www.iso.org/iso/catalogue_detail.htm?csnumber=32553
Sorry this is an ISO document so you or your library has to purchase it (not different from academic journals :-) ) but there is some good stuff inside so I recommend taking a look.
And also the OASIS ebXML Registry-Repository model https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=regrep This particular implementation is probably a dead technology now, but the coupled registry-repository theory is good.
While each of these is expressed in terms of a detailed object model with many attributes defined, the general principles are similar: description, lifecycle, access.
For terminology definitions, also look here: http://www.isotc211.org/Terminology.htm
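To make the coupled registry-repository idea concrete, here is a minimal sketch of the three principles named above (description, lifecycle, access). It does not follow the ISO 19135 or ebXML RegRep object models; the class names, the status values "valid" and "retired", and the `store/` location scheme are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    description: str        # description: what the item is
    status: str = "valid"   # lifecycle: hypothetical status values
    location: str = ""      # access: key of the content in the repository


class RegistryRepository:
    def __init__(self):
        self.registry = {}    # item id -> RegistryEntry (metadata only)
        self.repository = {}  # location -> stored content

    def register(self, item_id, description, content):
        location = f"store/{item_id}"
        self.repository[location] = content
        self.registry[item_id] = RegistryEntry(description, "valid", location)

    def retire(self, item_id):
        # The lifecycle change is recorded in the registry; content is kept.
        self.registry[item_id].status = "retired"

    def fetch(self, item_id):
        # Access goes through the registry, so lifecycle state is enforced.
        entry = self.registry[item_id]
        if entry.status != "valid":
            raise LookupError(f"{item_id} is {entry.status}")
        return self.repository[entry.location]


rr = RegistryRepository()
rr.register("term-1", "Definition of 'digital object'", b"draft text")
content_before = rr.fetch("term-1")
rr.retire("term-1")
```

After `retire`, the content is still held in the repository, but `fetch` refuses to return it because the registry entry is no longer valid. Keeping the description and lifecycle state apart from the stored content is the coupling the post refers to.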