RDA Candidate Working Group Certification of Digital Repositories

 

Case Statement
1. Working Group charter 
Long-term preservation of data in sustainable digital repositories is a sine qua non for data sharing. Data that are created and used by science and scholarship need to be managed, curated and archived, so that the substantial investments in preparing and presenting the content and tools are not lost. Researchers need to be sure that the resources the repositories offer remain meaningful and usable over time. Moreover, the repositories themselves need to have sustainable business models.
Preservation and sustainability raise challenges in many areas. The main issues related to long-term preservation and sustainability remain largely unresolved, as many organizational, technical, financial and legal questions are still open. Certification is therefore fundamental in guaranteeing the trustworthiness of digital repositories and thus in sustaining the opportunities for long-term data sharing.
The Working Group will build on previous work in the area of certification. It will deliver the global overview and the necessary recommendations and requirements that allow the effective implementation of certification of digital repositories on a national, European and even global level.
 
The deliverables of the Working Group are:
1. Report on the current state of affairs in the area of certification (month 1-6)
2. Set of recommended strategies on the further development and implementation of the certification framework (month 7-18)
With the support of these deliverables, funders can oblige researchers to deposit their data in a digital repository that meets a certain level of trustworthiness according to international standards.
To encourage and enable researchers to manage their data adequately during their research, data management should be eligible for funding. Furthermore, the development of a network of trusted digital repositories could be stimulated with some seed money to cover part of the expenses that digital repositories incur for their certification activities.
 
2. Value proposition
Since the mid-nineties there has been a demand for a way to judge whether or not a repository is doing its work properly. Funders of repositories and those who entrust their valuable digitally encoded information to them urgently need to know whether their funds and their faith are well founded. Stakeholders need to know if the repository is worthy of trust.
Certification offers repositories something to show to funders and users. In addition, the certification process offers them advice on where improvements are needed.
Furthermore, certification could provide third parties such as publishers with assurance that the technical risks associated with datasets under review have been identified by the repository at the ingest stage.
 
In short, the benefits of certification of digital repositories are:
Immediate as well as long-term usability beyond primary users
Confidence in data
Better use of public funds
Best brains can take advantage of data wherever they are
In April of this year, EC Vice-President Neelie Kroes encouraged her audience at the ALLEA General Assembly in Rome to make science open. According to Kroes, “sharing data, and having the forum to openly use and build on what is shared, are essential to science. They fuel the progress and practice of scientific discovery.” Her speech was an overture to the Recommendation on Access to and Preservation of Scientific Information, adopted by the European Commission last month.
One of the recommendations in the document specifically highlights the importance of “ensuring the quality and reliability of the infrastructure, including through the use of certification mechanisms for repositories”. This recommendation provides a policy framework that needs to be translated into concrete actions.
We have a strong conviction that trustworthy repositories are a key element in the future of research infrastructures, not only in Europe but all over the globe. 
They will enable a future in which “data are preserved in a way that optimizes scientific discovery, innovation, and societal benefit. Where appropriate, producers of data benefit from opening it to broad access and routinely deposit their data in reliable repositories. A framework of repositories work to international standards, to ensure they are trustworthy” (G8+5 Scientific Data Working Group Report, draft October 2011).
Data sharing in the long term will depend on the creation of a global network of trusted digital repositories that guarantees that research data remain usable and meaningful over time.
 
3. Engagement with existing work in the area
Within Europe a multilevel framework for certification of trusted repositories has recently been developed. This framework consists of a sequence of three levels of increasing trustworthiness: basic, extended and formal certification. The European Framework for Audit and Certification of Digital Repositories is a collaboration between the Data Seal of Approval, the Repository Audit and Certification Working Group of the CCSDS and the DIN Working Group ‘Trustworthy Archives – Certification’.
These parties all lead separate groups aiming at certifying digital repositories. They wish to put in place mechanisms to ensure that the groups can collaborate in setting up an integrated framework for auditing and certifying digital repositories. This framework clearly has global potential. The work within this framework will be the basis on which the Working Group will build its own activities. 
In addition, the Working Group will take into account the work that has been done in the area of risk assessment. An example of this category of initiatives is DRAMBORA (Digital Repository Audit Method Based on Risk Assessment), which presents a methodology for self-assessment, encouraging organisations to establish a comprehensive self-awareness of their objectives, activities and assets before identifying, assessing and managing the risks implicit within their organisation.
The Working Group will furthermore build on the work that has been done:
in other projects, for example:
the FP7 project APARSEN, including the ISO test audits;
the PREPARDE project (Peer REview for Publication & Accreditation of Research data in the Earth sciences) funded by JISC (UK). This project addresses the issue of what criteria are needed for a repository to be considered objectively trustworthy; 
and within the framework of research infrastructures, such as CLARIN, DARIAH, CESSDA.
Finally, DCC (and others) are considering submitting a proposal in the coming FP7 call on digital preservation to develop an online data management tool for EC-funded projects.
Through its individual members the Working Group is closely connected to all of the initiatives and activities mentioned above.
 
4. Work plan
a. The form and description of final deliverables of the candidate Working Group
The activities of the Working Group will concentrate on two deliverables:
1. Report on the current state of affairs in the area of certification (month 1-6)
This overview will cover, among others, the following topics:
framework of certification standards 
quantification of the full costs of certification
existing funder requirements
 
2. Set of recommended strategies on the further development and implementation of the certification framework (month 7-18)
These recommended strategies will focus on the following aspects:
benefits and added value of certification to the various actors involved (funders, repositories)
maintaining consistency of certification across boundaries (including training)
awareness raising 
o where is awareness needed (funders, repositories, research communities)?
o how should it be accomplished?
o how can effectiveness be monitored?
 
outlook on further work
The recommended strategies will be aimed at both national and international funders of research and the digital repositories themselves.
The Working Group believes that the endorsement by the RDA Council of such recommendations will give a boost to the development of a network of digital repositories around the globe and to the promotion and enforcement by funders of data archiving and data sharing.
 
b. The form and description of milestones and intermediate documents, code or other deliverables that will be developed during the course of the CWG’s work 
Milestones
January 2013 IDCC13 
March 2013 RDA Launch (first physical meeting and presentation of the WG)
Summer 2013 RDA Conference (presentation of the first draft deliverable)
November 2013 open certification workshop (second physical meeting and presentation of the first ideas on recommended strategies)
June 2014 presentation of the final deliverables
Intermediate documents
public document aimed at generating input from a broader circle of funders, researchers and repository managers (summer 2013)
draft of deliverable 1 (summer 2013)
draft of deliverable 2 (early spring 2014)
 
 
c. A description of the Working Group’s mode and frequency of operation (e.g. on-line and/or on-site, how frequently will the group meet, etc.)
During the 18 months of its existence the Working Group will have two face-to-face meetings. Ideally, these meetings will be held directly adjacent to relevant international conferences. In between, the group will have regular teleconferences (every two or three weeks). In addition, frequent e-mail contact is expected.
d. A description of how the Working Group plans to develop consensus, address conflicts, stay on track and within scope, and move forward during operation 
The Working Group consists of experts in the field, most of whom already know each other and some of whom already work together. This will give the Working Group a quick start and will also make it easier to address and openly discuss any conflicts that might arise during the process.
Furthermore, the Working Group will use a tight and detailed schedule of work and frequent teleconferences in order to stay on track and within scope and to deliver on time.
 
e. A description of the CWG’s planned approach to broader community engagement and participation
The Working Group will produce a public document that will be widely circulated in order to generate input from the broader communities of funders, researchers and repository managers.
In November 2013 the Working Group will organize a workshop that is open to everyone who is interested in the topic of certification. 
 
5. Adoption plan
With respect to the adoption and implementation of the Working Group deliverables, the Working Group is in the fortunate position that the three communities/organisations that represent the three certification standards are all part of the group. The standards representatives are influential stakeholders in their own communities and they will certainly pick up the recommendations of the Working Group and adopt them within their own environment.
The international funders, like the EC and NSF, are closely related to the RDA initiative and they will certainly take notice of the deliverables of our Working Group and the recommendations that are aimed at them as important implementation stakeholders.
The repositories that are part of the Working Group membership are already convinced of the importance of certification. They can play a crucial advocating role within their own repository communities. The wider repository community will also be involved through the workshop that is planned.
 
6. Initial membership
Confirmed members are:
David Giaretta, ISO Repository Audit and Certification Working Group (UK)
Christian Keitel, DIN Working Group "Trustworthy Archives - Certification" (GER)
Henk Harmsen, Data Seal of Approval (NL)
Ingrid Dillo, European Framework for Audit and Certification of Digital Repositories (NL)
Sayeed Choudhury, Johns Hopkins University (USA)
Paul Trilsbeek, The Language Archive, Max Planck Institute for Psycholinguistics (NL)
Dinesh Katre, Centre for Development of Advanced Computing (India)
Kevin Ashley, Digital Curation Centre (UK)
Mary Vardigan, ICPSR, University of Michigan (USA)
Jared Lyle, ICPSR, University of Michigan (USA)
Peter Granda, ICPSR, University of Michigan (USA)
 
Still to be confirmed:
a representative of PARADISEC/Australian Antarctic Data Centre (Australia)
a representative of the South African Data Archive (SADA)