An open, universal literature-data cross-linking service - RDA/WDS Publishing Data Services WG Recommendations

By Hylke Koers

  RDA/WDS Publishing Data Services WG

Recommendation Title: An open, universal literature-data cross-linking service

Impact: Improves visibility, discoverability, re-use and reproducibility by bringing existing article/data links together, normalizing them using a common schema, and exposing the full set as an open service

External services using the cross-linking service:

  • The Scholix initiative, a high-level interoperability framework for exchanging information about the links between scholarly literature and data.
  • The DLI Service, the first exemplar aggregation and query service fed by the Scholix open information ecosystem.
Recommendation package DOI:

Group Chairs:

Hylke Koers, Elsevier

Adrian Burton, Australian National Data Service


The ICSU-WDS & RDA Publishing Data Services group proposes an approach to sharing information about the links between the literature and research data. A set of hubs will collect literature-data (as well as data-data) links from their natural communities using minor extensions to existing local procedures and, in some cases, inference. The hubs agree on an interoperability framework with a common information model and open exchange methods, optimised for exchanging information among the hubs. The hubs will serve as an enabling global information infrastructure for the development of (third-party) services.
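
The hub model described above can be illustrated with a minimal sketch. It assumes each hub exports its links as simple records, which are then mapped to one shared form and merged so that the same link asserted by several hubs appears once. The field names, hub names and DOIs below are hypothetical placeholders chosen for illustration; they are not the Scholix schema or the DLI Service API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One literature-data (or data-data) link in a shared, hub-neutral form.

    Field names are illustrative assumptions, not the Scholix schema.
    """
    source_id: str  # identifier of the linking object, e.g. an article DOI
    target_id: str  # identifier of the linked object, e.g. a dataset DOI
    relation: str   # hypothetical relation label, e.g. "references"
    provider: str   # hub that contributed the link


def normalize_id(raw: str) -> str:
    """Reduce common DOI spellings to a bare lowercase DOI."""
    value = raw.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if value.startswith(prefix):
            return value[len(prefix):]
    return value


def from_hub_record(record: dict, provider: str) -> Link:
    """Map one hub-specific record onto the shared Link form."""
    return Link(
        source_id=normalize_id(record["source"]),
        target_id=normalize_id(record["target"]),
        relation=record.get("relation", "references"),
        provider=provider,
    )


def merge_links(links):
    """Group links by (source, target, relation) and collect their providers."""
    merged = {}
    for link in links:
        key = (link.source_id, link.target_id, link.relation)
        merged.setdefault(key, set()).add(link.provider)
    return merged


# Two hypothetical hubs report the same link with different DOI spellings.
publisher_hub = [
    {"source": "doi:10.1000/ARTICLE.1", "target": "https://doi.org/10.5555/dataset.9"},
]
data_centre_hub = [
    {"source": "10.1000/article.1", "target": "10.5555/DATASET.9", "relation": "references"},
]

links = [from_hub_record(r, "publisher-hub") for r in publisher_hub]
links += [from_hub_record(r, "data-centre-hub") for r in data_centre_hub]
merged = merge_links(links)

for (source, target, relation), providers in merged.items():
    print(source, relation, target, sorted(providers))
```

Keeping the contributing hubs alongside each merged link is one possible design choice: it preserves provenance, so a third-party service could still see which communities asserted a given link.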

The Recommendation is also published at the following DOI:

Also relevant to this recommendation is the release of the Scholix framework by ICSU-WDS and RDA; details can be found here:

Output Status: 
RDA Endorsed Recommendations
Primary WG Focus / Output focus: 
Domain Agnostic: 
Domain Agnostic
  • Author: Megan O'Donnell

    Date: 29 Jun, 2016

    I would like to suggest that the term "data-data" be avoided when possible. There is already a lot of confusion about, and misuse of, the terms and concepts related to data metadata, data record metadata, and data.

    Perhaps "rdata-data", with the "r" referring to "research" or "record", could function as a possible replacement?

  • Author: Andrea Perego

    Date: 29 Jun, 2016

    Thanks for sharing this work!

    I have a couple of comments, which I include below.

    1. The recommendation relies heavily on specialised services and protocols that are able to comply with a number of requirements. Given that, it is not completely clear to me what happens in scenarios where some of the resources to be linked are not in such an infrastructure. In my experience, this is quite a common scenario - e.g., a team creating a dataset from a dataset created and maintained by a team from a different institution / country / continent.

    2. Following point (1), I would suggest considering how to complement this infrastructure with solutions that also work in contexts using mainstream technologies. For instance, it is nowadays common practice to embed structured metadata in Web pages, following mechanisms such as HTML+RDFa, Microformats and Microdata. This approach is also natively supported by data catalogue platforms such as CKAN. This basically means that there is already a huge amount of "bibliographic citations" available on the Web that can be used as a basis for linking data, publications, and software, if properly exploited. And this may also be a way to address the issue in point (1).


  • Author: Vijay Kumar Mishra

    Date: 30 Jun, 2016

    I think this effort is really good. But we should avoid terms like "data-data" or "rdata-data". Such terms should either be clearly defined or used sparingly. There should also be a global linkage between data and literature.

  • Author: Greg Tananbaum

    Date: 14 Nov, 2016

    Q1. Is this output applicable to your organisation?

    This output is applicable to SPARC in that it addresses an issue our members (academic libraries) are likely to confront in the coming months and years.  Many SPARC-affiliated universities and colleges operate institutional repositories, CRIS systems, and other platforms tasked with tracking the output of their researchers.  As data becomes “a first class citizen of scholarly communication”, tracking it and tying it to other research outputs will be important to SPARC members.

    Q2. Have you adopted this output, or might you consider adopting it?

    SPARC is an umbrella organization.  As such, it would not be appropriate to directly adopt the output.  However, SPARC is increasingly interested in calling members’ attention to scholarly communication infrastructure.  We would be happy to consider webinars, dissemination of materials, and other mechanisms to put the SCHOLIX opportunity in front of our community.

    Q3. As an OA member, how do you believe this output furthers the RDA mission (or not)?

    I don’t feel sufficiently immersed in RDA activities to confidently answer this question.

  • Author: Sebastian Karcher

    Date: 23 Nov, 2016

    Q1. Is this output applicable to your organisation?

    Like most data repositories, the Qualitative Data Repository is very interested in more closely tracking data-article linkages for the data in our holdings. Absent notifications from authors/depositors, this is currently an impossible task for a small repository such as QDR. The Scholix framework, once widely adopted, would provide significant benefits to our organization.


    Q2. Have you adopted this output, or might you consider adopting it?

    One of the benefits of Scholix is that it puts very little demand on individual organizations, concentrating efforts in the “hubs.” We are making a concerted effort to include all article-data linkages we are aware of in the metadata we deposit with DataCite. We also strongly encourage all depositors to adopt proper data citation practices, which will typically mean better article-data linkage data via CrossRef.

    We currently do not use any DLI input on landing pages, given that the data are still (understandably) rather incomplete, but we are monitoring the progress closely and are likely to take advantage of DLI (or another linking service based on Scholix) in the future.


    Q3. As an OA member, how do you believe this output furthers the RDA mission (or not)?

    Scholix answers a key demand in tracking the re-use of data in literature. Helping to establish the infrastructure that easily allows for and gives access to article-data (and data-data, etc.) linkages is exactly what RDA should be about.

    Two additional comments:

    Outreach: At the Denver RDA meeting, one of the most common questions was how individual repositories could support/adopt Scholix. This turns out to be quite easy, especially where a repository is already working with DOIs/DataCite. An easy-to-find and easy-to-follow guideline for that would be very helpful (and should ideally include some clarification on the relation types recommended in the metadata) and should be released quickly rather than relying on Force11, CODATA, etc. for dissemination (though obviously those channels should be used).

    Scope: While hard to do, it would be very valuable if Scholix did not just include metadata going forward but also article-data linkages from past work (as OpenAIRE has collected to some extent). I would recommend outreach to various other organizations that have collected such data, including the InFoLiS project in Germany and the data collected by ICPSR on citations to their holdings. We would also encourage thinking beyond “articles” as a linking object for data. While we understand that RDA is dominated by disciplines where articles predominate, in large parts of the social sciences and humanities, books continue to play an important role and should be included not just in the metadata specifications (which I believe they are) but also in the language used to promote Scholix.

    Sebastian Karcher, Associate Director, for the Qualitative Data Repository
