The Metadata IG will concern itself with all aspects of metadata for research data. In particular, it will attempt to coordinate the efforts of the WGs concerned with metadata to produce a coherent approach to metadata, covering the metadata modalities of description, restriction, navigation, provenance and preservation, and the use of metadata for the purposes of discovery, contextualisation, validation, analytical processing, simulation, visualisation and interoperation. It will also liaise with the other WGs, especially Data Foundation and Terminology, PIDs, Standardisation of data categories and codes, and Data Citation. This IG activity relates to data management policies and plans of research organisations and researchers, and to policies and standards of research funders and of research communities, which may or may not be official standards.
The Metadata IG will organise itself through online meetings and face-to-face meetings of members of the IG present at RDA Plenary events. It is proposed that, while membership is open to any RDA registered member, key members will be the leaders of the WGs concerned with metadata.
Metadata Principles - Created and endorsed by all the RDA metadata groups
- The only difference between metadata and data is the mode of use
- Metadata is not just for data, it is also for users, software services, computing resources
- Metadata is not just for description and discovery; it is also for contextualisation (relevance, quality, restrictions (rights, costs)) and for coupling users, software and computing resources to data (to provide a VRE)
- Metadata must be machine-understandable as well as human-understandable for autonomicity (formalism)
- Management (meta)data is also relevant (research proposal, funding, project information, research outputs, outcomes, impact…)
The Metadata IG did not hold a session at RDA Plenary 20 in Goteborg, for several reasons:
- The work of the group had been somewhat disrupted by COVID;
- Plenaries during COVID had not been well attended;
- In general, it was observable that requirements had evolved;
- Technologies had evolved, with new initiatives relevant to metadata (as just one example, I-ADOPT).
At P20 the co-chairs sat in on sessions of other groups to get an informal ‘landscape picture’ of where we are with metadata. Just about every group with which we interacted has requirements for metadata, and those requirements increasingly call for more complex metadata. The other good news is that RDA participants seem to realize that ‘library catalog card’ metadata is insufficient and that metadata with complex structures is required to provide the information necessary for applications in each domain of interest.
Regarding technology, the big push is towards graph representations of information structures, particularly knowledge graphs with base entities/objects (such as a dataset or software service) as vertices/nodes and relationships between them (with rich semantics) as edges/arcs. As infrastructure, such representations may use triplestores (holding RDF triples) or relational (or object-relational) stores (holding n-tuples).
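As an illustration only, the graph view described above can be sketched in a few lines of plain Python. The identifiers (`dataset:42`, `service:7`) and the predicate names are hypothetical and do not come from any RDA vocabulary; a real deployment would more likely use a triplestore or a relational store rather than an in-memory list.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object). All identifiers and
# predicate names here are hypothetical, chosen only for illustration.
triples = [
    ("dataset:42", "type", "Dataset"),
    ("service:7", "type", "SoftwareService"),
    ("service:7", "processes", "dataset:42"),
    ("dataset:42", "derivedFrom", "dataset:17"),
]


def build_graph(triples):
    """Index triples as an adjacency map: node -> list of (edge label, node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def neighbours(graph, node, predicate):
    """Return the nodes reachable from `node` over edges labelled `predicate`."""
    return [obj for pred, obj in graph.get(node, []) if pred == predicate]


graph = build_graph(triples)
print(neighbours(graph, "service:7", "processes"))
```

The same three-part rows could equally be stored as tuples in a relational table with subject, predicate and object columns, which is the sense in which either kind of store can serve as infrastructure.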
Thus, the co-chairs propose to restart the work on the MIG metadata element set (see: https://www.rd-alliance.org/groups/metadata-ig.html). We already have some volunteers to act as editors for some of the elements. Please volunteer to join the group around each of the elements in which you have an interest and where you are willing to contribute to the discussions (and if there is currently no leader for that element, contact the co-chairs to offer to do this important coordinating job). This should be a self-organizing activity (we are all volunteers in this community). The co-chairs will try to monitor the developing activity and assist as necessary.
Metadata with formal syntax and declared semantics are now even more necessary: to ensure FAIRness, to serve the growing range of application areas that require such metadata, and especially to ensure that systems increasingly utilizing AI (not only the currently popular large language models but also AI for data management, analytics, simulation, and visualization) have a basis in formal logic.
- Put your name down against metadata elements of interest https://docs.google.com/spreadsheets/d/1Y-mhE5gRZmaFRBl-HDm5hn23cCHJg1Yf8Gq-jeI89Kc/edit?usp=sharing (you all have edit access);
- If you are willing to lead the activity on an element, contact the co-chairs.
Metadata Element Set:
The metadata groups intend to recommend the following metadata element set. Please note that each element needs 'unpacking' to get to something recognizable and actionable by a computer. The comments from reviewers are listed for each element. The folder for all elements is here.
As noted above, these are elements, not single-valued attributes. Most will have internal syntax (structure) and use terms that require declared semantics. The list is also not exhaustive; it is expected that particular subject domains will have much longer lists of elements. This list is intended to be the recommended list of elements that should be provided by all within RDA to
- permit discovery,
- support contextualisation (assessment of relevance and value) and
- facilitate action (interoperation including query and integration).
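To make the distinction between an element and a single-valued attribute concrete, here is a minimal sketch of one structured element. Everything in it is an assumption made for illustration: the `ContributorsElement` and `Contributor` names, the fields, and the small `ROLE_TERMS` vocabulary are hypothetical and are not part of the recommended element set.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary standing in for "declared semantics".
ROLE_TERMS = {"creator", "curator", "funder"}


@dataclass(frozen=True)
class Contributor:
    """One entry inside the element: internal structure, not a bare string."""
    name: str
    role: str             # must be a term from ROLE_TERMS
    identifier: str = ""  # e.g. a persistent identifier, if known

    def __post_init__(self):
        # Reject terms that are not in the declared vocabulary.
        if self.role not in ROLE_TERMS:
            raise ValueError(f"undeclared role term: {self.role!r}")


@dataclass
class ContributorsElement:
    """A hypothetical element holding any number of structured entries."""
    contributors: list = field(default_factory=list)

    def with_role(self, role):
        """Return the names of all contributors holding the given role."""
        return [c.name for c in self.contributors if c.role == role]


element = ContributorsElement([
    Contributor("A. Researcher", "creator"),
    Contributor("Example Funder", "funder"),
])
print(element.with_role("creator"))
```

Because the role is checked against a declared vocabulary, software can act on the element (for example, filtering by role) without guessing what a free-text value means.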
Use Case Analysis:
The initial use case analysis was presented in Session 9, the joint meeting of all the metadata groups, at Plenary 6 in Paris. Below are some revised slides based on the feedback from that meeting and the master use case spreadsheet showing the process.
Metadata Standards Catalog:
The RDA Metadata Standards Catalog Working Group supports an open directory of metadata standards applicable to scientific data that is both human-readable and machine-readable using an API. Additions or updates to the directory can be made here.
Metadata Standards Directory:
The RDA Metadata Standards Directory Working Group supports a collaborative, open directory of metadata standards applicable to scientific data. Additions or updates to the directory can be made here.
FAIR guiding principles published in Nature journal
The FAIR Principles address these needs by providing a precise and measurable set of qualities a good data publication should exhibit - qualities that ensure that the data is Findable, Accessible, Interoperable, and Reusable (FAIR). The FAIR Guiding Principles for scientific data management and stewardship have been published.
- 2021-11-03 - 2021-11-11 RDA Plenary 18 Virtual
- 2021-04-20 - 2021-04-23 RDA Plenary 17 Edinburgh (Virtual)
- 2020-11-09 - 2020-11-12 RDA Plenary 16 Costa Rica (Virtual)
- 2019-10-23 - 2019-10-25 RDA Plenary 14 Helsinki
- 2019-04-02 - 2019-04-04 RDA Plenary 13 Philadelphia
- 2018-11-05 - 2018-11-08 RDA Plenary 12 Gaborone
- 2018-03-19 - 2018-03-21 RDA Plenary 11 Berlin
- 2017-09-19 - 2017-09-21 RDA Plenary 10 Montreal
- 2017-04-05 - 2017-04-07 RDA Plenary 9 Barcelona
- 2016-09-15 - 2016-09-17 RDA Plenary 8 Denver
- 2016-03-01 - 2016-03-03 RDA Plenary 7 Tokyo
- 2015-09-23 - 2015-09-25 RDA Plenary 6 Paris
- 2015-03-08 - 2015-03-11 RDA Plenary 5 San Diego
- 2014-09-22 - 2014-09-24 RDA Plenary 4 Amsterdam
- 2014-03-26 - 2014-03-28 RDA Plenary 3 Dublin
- 2014-02-24 - 2014-02-25 RDA Europe Munich Meeting