
**NOTE: The following text has been revised. The authoritative Charter Statement is here.**

 

Research Data Management in Engineering

 

Introduction

Research in engineering spans a wide range of sub-disciplines, including, for example, chemical, civil, electrical, and mechanical engineering. Traditionally, engineering disciplines belong to the applied sciences, which cooperate closely with industry and seek commercial advantage from their research. As a result, open data and data sharing are rarely considered, even after the research has been completed and the economic interests have been secured.

 

The Interest Group on “Research Data Management in Engineering” (IG RDM4Eng) will identify, collect and compare industrial and institutional workflows, services and tools regarding ‘engineering research data’.

 

The Interest Group presents an opportunity to highlight emerging FAIR (Findable, Accessible, Interoperable, Re-usable) data approaches from scientific and industrial engineering disciplines and to explore how data tools can be used ‘as a service’ to break up existing community-specific ‘data silos’. The proposal builds on projects and initiatives in the engineering sciences and relates especially to the challenges posed by contract or mission-oriented research performed together with industrial stakeholders.

 

User scenario(s) or use case(s) the IG wishes to address

The engineering community is highly fragmented in terms of its RDM organization. Data and the descriptive documentation and software basis are crucial components of sustainable engineering research. Within the engineering science community at universities, initiatives such as the American Association of Engineering Societies (AAES) and CESAER (Conference of European Schools for Advanced Engineering Education and Research) with the Task Force Open Science (TFOS) have shown that the approach to RDM is often bottom-up. This is in contrast to other disciplines where RDM is strongly driven by professional associations (e.g. DARIAH in the humanities and ELIXIR in life sciences).

 

A major challenge is the heterogeneity of data generated by research groups, even when they investigate the same phenomenon. The data obtained in primary data analysis are usually kept on local or backup storage by the doctoral students and scientists themselves. Many scientists pursue their own strategy for naming files and documenting their research data. Because these individual systems differ, there are few systematic records within the discipline. Standardized metadata records are not yet widespread, which makes data difficult to retrieve and access. Data publication is far from being the norm, and the FAIR principles are far from being properly implemented.

 

Within industry, on the other hand, and particularly in Industry 4.0, data management is a strategic business approach that applies a consistent set of business processes in support of the collaborative creation, management, dissemination, and use of a product and/or a service. IT systems supporting these industrial data management processes are known under different names:

 

  • Engineering data management (EDM system)
  • Product data management (PDM system)
  • Product life cycle management (PLM system)
  • Collaborative product development

 

Thanks to these management systems, the industrial workflows of product and service engineering are usually well documented. What is missing, however, are common interfaces and protocols for managing, accessing and re-using research data from industry and academia. Each data provider has its own service offering and returns data in different (proprietary) formats with different licenses and costs. Additionally, commercial data providers are often constrained to particular business sectors in specific geographical areas and keep their data locked within isolated data sets. The combination of these factors hinders interoperability, further uptake in FAIR data platforms (such as the proposed European Open Science Cloud), and a better data value chain around corporate information.

 

During the last year, several workshops and interviews were conducted in the CESAER context with participating engineers in the Netherlands, Ireland and Germany, representing major engineering communities such as computational engineering, mechanical engineering, construction and thermodynamics.

 

Major outcomes are needs for

  • an introduction to tailored, existing methods and tools regarding RDM, which are adaptable to the needs of the specific sectors (also in combination with educational resources, e.g. persons who speak the ‘language’ and can demonstrate & adapt existing RDM tools)
  • curated standards for metadata and data modelling that are in line with the FAIR data principles
  • a harmonized software coding platform for engineers (source code presents ‘the most important’ research data within many engineering communities)
  • a basic data documentation guideline (or maybe even a standard), especially concerning contract and mission oriented research with industrial partners as well as journal guidelines
  • better, coordinated access to HPC facilities.

 

We see the introduction of an IG in RDA as an opportunity to seek solutions in a broader international context, activating engineering scientists from all over the world. We also expect that an IG will provide stronger leverage when it comes to engaging the industrial sector. Industry members are usually not present at university-based or scientific-community-based workshops, but RDA provides a framework that is now widely recognized.

 

 

 

Objectives

The proposed “Research Data Management in Engineering” Interest Group (IG RDM4Eng) seeks to bring together scientific and industrial stakeholders from all relevant sectors. The IG RDM4Eng will give its scientific and industrial members the opportunity to discuss and address the legal and technological challenges to the adoption of FAIR data and software management in engineering, to exchange knowledge, opinions and experiences, and to form new Working Groups or participate in existing ones to address these challenges.

 

This includes in particular contract research and its associated privacy and security concerns (such as the conditions included in non-disclosure agreements), and the important role of software and source code as ‘research data types’ within the engineering sector:

 

1)   Use case: coding base & project data: Analyzing the use of software source code repositories such as GitHub and GitLab by the engineering sector and identifying common needs and best practices, such as a centralized GitLab/GitHub framework with shared best practices and standards for computational engineering and software management. In addition to software management, this use case will also identify workflows and services to facilitate, standardize and harmonize the transfer of engineering project results (e.g. data.DURAARK.eu, an architectural engineering example, or 4TU.ResearchData) into a broader FAIR knowledge base. Currently, the outcomes of such projects, that is, data results and accompanying repositories, are usually listed on discovery platforms such as re3data.org. In principle, however, they should also be registered in emerging FAIR assessment platforms and tools such as FAIRsharing.org, where they can be linked to metadata and data standards.

 

2)    Use case: privacy and security in engineering data management: Given the lack of national and international legal harmonisation of scientific and industrial data protection and access, different approaches, contracts (e.g. non-disclosure agreements) and protocols will have to be assessed and compared in order to improve the FAIRness of engineering research data. One example is the use of research data as support for intellectual property and patent claims, and the role of research data in institutional technology transfer offices, where public data sharing is often seen as giving away assets and know-how. Scientists may therefore need to be pointed to sources of legal advice and to appropriate contract templates for planning and executing FAIR best practices in research data management in mission-oriented research.

 

Based on the use cases outlined above, the preliminary focus of the IG RDM4Eng will cover the following areas:

 

  • Engineering Data and Code landscape
    • defining a list of tools dealing with Engineering Data Management in academia and industry
    • defining and evaluating existing and developing engineering data platforms
    • disseminating the IG results within other relevant engineering organisations on a global, European and national scale

 

  • Privacy and Security in Engineering Data
    • sharing best practice on non-disclosure agreements with industrial stakeholders and differential privacy
    • developing models for dynamic consent that protect industrial and institutional interests while enabling data sharing ‘as open as possible, as closed as necessary’
    • providing a forum for discussing, explaining and responding to data regulation issues on a national and international level

 

As a result, the IG RDM4Eng will build a knowledge base in order to share technical practices, identify common data and service requirements, and facilitate search and analysis of existing FAIR data solutions for interoperability challenges that are shared among engineering research infrastructures, universities and companies. The IG will seek collaboration with those RDA groups that have an affinity with the objectives mentioned above, as well as with external organisations (such as AAES, CESAER, NIST), past and ongoing engineering projects (Big Data Europe, BOOST 4.0, DURAARK) and industrial stakeholders from different engineering sectors:

  • Automotive sector
  • Construction sector
  • Computational engineering sector
  • Mechanics
  • Architectural engineering
  • Chemical engineering
  • Coastal Engineering
  • and others

 

Participation

This IG will be open to all RDA members from all countries and scientific disciplines. Particularly, but not exclusively, the IG will welcome members from the following backgrounds:

  • Scientists involved in contract research, to share their experience in dealing with RDM questions and non-disclosure agreements
  • Industrial representatives from large and small companies representing engineering science and the industry sector (particularly Industry 4.0)
  • Practitioners of software engineering for the industry sector
  • Policy-makers for non-disclosure agreements & legal experts
  • Data Stewards and related research data experts
  • HPC and distributed computing experts

 

Outcomes

Major/Preliminary outcomes of the IG RDM4Eng will include the following:

  • Strengthen the connection between the industrial and academic sector
  • Bring light to the issue of contract and mission oriented engineering research from global and national points of view
  • Establish an exchange & information knowledge base for engineering data types and software products
  • Display funder guidelines and best practices
  • Address particular problems, such as those identified in the CESAER context, by spawning RDA Working Groups

 

 

Mechanism

Outputs and recommendations will be produced based on consensus of the participating RDA group members. All topics will be openly discussed via the RDA communication platform providing a CMS, document store, and Wiki.

 

At the RDA plenaries, the IG will organize group sessions and will interact with other RDA groups, e.g. by organizing joint sessions. Between plenaries, regular virtual conferences will ensure the continuity of activities and encourage the continuous exchange of information.

 

The initial co-chairs will accompany the group’s creation and establish the activities. It is intended to conduct a co-chair election every two years.

 

The proposed IG has identified overlap with regard to contents with the following RDA groups:

  • IG Chemistry Research Data
  • IG Data Fabric
  • IG Data Foundations and Terminology
  • WG Data Type Registries
  • WG Data Versioning
  • IG Disciplinary Collaboration Framework
  • IG Domain Repositories
  • IG From Observational Data to Information
  • IG Health Data
  • WG Blockchain Applications in Health
  • WG International Materials Resource Registries
  • WG Metadata Standards Catalog
  • IG RDA/CODATA Legal Interoperability
  • IG RDA/CODATA Materials Data, Infrastructure & Interoperability
  • IG RDA/NISO Privacy Implications of Research Data Sets
  • IG Reproducibility
  • WG Research Data Collections
  • IG Software Source Code
  • IG Vocabulary Services

 

While the IG RDA/CODATA Materials Data, Infrastructure & Interoperability and the IG RDA/NISO Privacy Implications of Research Data Sets in particular have conceptual similarities with the IG RDM4Eng, to our knowledge none of the above groups focuses on including both industrial and scientific stakeholders from the engineering sector and bringing them together on both a European and an international scale.

 

 

 

Timeline

We are looking forward to having the IG established at the 13th RDA Plenary Meeting in April 2019 in Philadelphia, USA. The first outcomes of this IG are planned to be presented in a timely fashion using the RDA platform and file repository structure, with a formal presentation and discussion no later than 12 months after the establishment of the IG.

 

 

List of initial members

 

| Name | Affiliation | Country | Chair |
|---|---|---|---|
| Marta Teperek | TU Delft | Netherlands | |
| Susanna-Assunta Sansone | University of Oxford, Department of Engineering Science (and RDA FAIRsharing WG) | UK | |
| Alastair Dunning | TU Delft | Netherlands | |
| Daniela Hausen | RWTH Aachen University | Germany | Chair |
| Angelina Kraft | Technische Informationsbibliothek (TIB) German National Library of Science and Technology | Germany | Co-Chair |
| Markus Stocker | Technische Informationsbibliothek (TIB) German National Library of Science and Technology | Germany | |
| Gerald Jagusch | ULB Darmstadt | Germany | |
| Nanette Rißler-Pipka | Karlsruhe Institute of Technology (KIT) | Germany | |
| David Wallom | University of Oxford | UK | |
| Kyong-Ha Lee | Korea Institute of Science and Technology Information | South Korea | |
| Jonathan Petters | Virginia Tech | USA | |
| Gretchen Greene | National Institute of Standards and Technology (NIST) | USA | |

 

Review period start:
Tuesday, 29 January, 2019 to Friday, 1 March, 2019

 

Please note that the text below the line contains the original Charter, also attached here. The original Charter was superseded in July 2020 by the revised Charter, which can be found here.


Original Charter of the GO FAIR Interest Group

 

RDA Interest Group Draft Charter

 

GO FAIR Interest Group

 

Introduction

GO FAIR stands for Global Open FAIR and aims at bringing interested stakeholders together to define a practical and concrete approach for implementing the FAIR principles. GO FAIR is organized in terms of Implementation Networks (INs) that are groups of stakeholders that decide to collaborate to define, design and/or implement elements that can contribute to a global platform for data, services and computing capacity interoperability. The GO FAIR Interest Group (GFIG) aims at serving as a bridge between the various GO FAIR INs and RDA’s Interest and Work Groups.

 

User scenario(s) or use case(s) the IG wishes to address

  • On the one hand, RDA Interest and Work Groups produce guidelines and recommended approaches for a range of data-related issues, which could and should be adopted by the GO FAIR INs in their activities. On the other hand, the GO FAIR INs often identify issues for which there are no recommended approaches yet. The proposed GO FAIR Interest Group will bring these issues to the attention of the RDA community. Some of the issues raised by the GO FAIR INs can be tackled in existing RDA Work Groups, and others can motivate the formation of new RDA Work Groups.

 

 

Objectives

  • GFIG will establish frequent communication and collaboration with RDA FAIR-related IGs and WGs;
  • GFIG will bring recommendations produced by RDA IGs and WGs into the GO FAIR INs for testing and adoption;
  • GFIG will bring the results and feedback from the testing and adoption of RDA recommendations by GO FAIR INs back to their related RDA IGs and WGs;
  • GFIG will bring data-related issues identified by GO FAIR INs, for which proper recommendations are missing, to the attention of the RDA community, aiming to raise enough interest that RDA members tackle them, possibly by spinning off new RDA IGs, WGs or task forces;
  • GFIG will create RDA WGs to tackle some of the identified issues.

 

Participation

Both GO FAIR and RDA are initiatives that congregate a diversity of communities from different knowledge domains. In the context of RDA, the commonality is the interest in evolving the current practices in research data. For GO FAIR, this commonality is the desire to contribute to an improvement in findability, accessibility, interoperability and reusability of data, services and computing capacity. Therefore, the proposed GO FAIR IG is open to individuals from all communities interested in sharing these goals with contributions ranging from technical definitions of metadata content and semantics to data policy templates.

 

Outcomes

As previously mentioned, this proposed IG aims at serving as a bridge between activities and interests in RDA and GO FAIR. Therefore, the expected outcomes are:

  • Closer collaboration between the communities from these two initiatives;
  • RDA recommendations tested and validated in GO FAIR Implementation Networks;
  • New RDA WGs focusing on issues identified by GO FAIR Implementation Networks;
  • New RDA recommendations based on issues identified by GO FAIR Implementation Networks;

 

Mechanism

The GO FAIR International Support and Coordination Office and the GO FAIR Regional Offices are in constant communication with the GO FAIR Implementation Networks. The data-related issues identified by these INs will be discussed with the RDA communities during the planned bi-monthly GO FAIR IG calls and during the RDA Plenaries. Similarly, the RDA recommendations will be brought to the attention of the GO FAIR INs during the many calls, meetings and conferences organized by GO FAIR. Moreover, many RDA members participate in GO FAIR INs. We will invite them to become members of this proposed GO FAIR IG, making the connection between RDA and GO FAIR closer and more direct.

 

Timeline

  • Establish the IG;
  • Coordinate the first face-to-face meeting (hopefully at P13);
  • Increase the number of participants of the IG;
  • Engage with related RDA IGs and WGs to start the collaboration and communication;
  • Establish the first cases of GO FAIR INs testing and adopting RDA recommendations;
  • Bring the first GO FAIR-identified issues to the RDA community;

 

Potential Group Members

 

| FIRST NAME | LAST NAME | TITLE |
|---|---|---|
| Luiz Olavo | Bonino da Silva Santos | Co-Chair |
| Erik | Schultes | Co-Chair |
| Mark D. | Wilkinson | Member |
| Michel | Dumontier | Member |
| Barend | Mons | Member |
| Yann | Le Franc | Member |
| Sarah | Jones | Member |
| Ville | Tenhunen | Member |
| Keith | Russel | Member |
| Ian | Bruno | Member |
| Michael | Conlon | Member |
| Wade | Bishop | Member |
| Susanna-Assunta | Sansone | Member |
| Richard | Hartshorn | Member |
| James | Wilson | Member |

 

 

 

 

Review period start:
Saturday, 5 January, 2019 to Tuesday, 12 February, 2019

**Please note: The content below has been deprecated. The authoritative version of the Group Charter Statement is available here.**

 

Social Science Research Data Interest Group Proposal

Introduction

We are proposing a Social Sciences & Humanities Research Data (SSHRD) Interest Group under the auspices of the Research Data Alliance (RDA), to foster diverse professional exchange on issues particular to data originating from the social sciences and humanities.

 

Social Sciences and Humanities Research Data cover many disciplines, appear in many data types, deal with multiple objects and levels, and are highly distributed, coming from various sources. They could be described as a patchwork quilt, lacking a grand design or focus. On the other hand, this diversity makes it possible to cover the whole spectrum and to be flexible in collecting data.

 

There is huge potential for reuse of SSHRD: by researchers, but also by professionals outside universities, companies, governments, and citizens.

 

As a research data community, we are entering the implementation phase of the FAIR principles. We can see first results on Findability, with various catalogues becoming available. Accessibility is already more complicated, as many social data are too sensitive to share directly: access to social sciences research data is not a dichotomy between open and closed but requires fine-tuning of how data are made accessible. For better Interoperability, we need alignment on controlled vocabularies and ontologies, as well as semantic techniques to relate data.

 

A barrier to Reuse is a lack of clarity in data policies and their implications for researchers: what are the conditions and requirements for providing access and for using the data? For Reuse, new users also want information about the quality and provenance of the data: where do they come from, how were they collected and curated, etc.

 

Social Science & Humanities Research Data Interest Group Focused Initiatives

This new SSH interest group will begin by helping to coordinate communications across the various current RDA groups of interest to our disciplines and by providing a place for our members to share solutions and concerns with others in our fields. These groups include FAIRsharing and others noted by Braukmann in the “RDA Overview for the Social Sciences” (http://doi.org/10.5281/zenodo.1401105). We will also seek input from, and coordinate with, external SSH community leaders and organizations such as CESSDA, DDI Alliance, IASSIST, ICPSR, IFDO and WDS. We will be open and inclusive, seeking to use this group to connect the various organizations working to promote SSH data sharing. While this new interest group will certainly be a coordinating group, we also aim to produce new RDA Working Groups to help provide solutions to challenges in our SSH domains. Given the complexity of SSHRD, we feel that the initial focus of our interest group should be on defining working groups of immediate need to our communities. We foresee that implementing FAIR in our disciplines requires engaging all stakeholders: funders, producers, service providers and users. As a new interest group, we want to prioritise three specific areas focused on the Social Science and Humanities research data communities:

  • quality of data
    find automated ways to investigate and provide information on quality

  • data policy
    align, and wherever possible simplify, data policies and their implications for making data available and for using data.

  • sensitive data
    estimates are that over 40% of the data in our community are too sensitive to be made openly available without any restrictions or measures.

This approach, engaging all relevant parties and focusing on a limited number of topics, follows the Minimum Viable Ecosystem (MVE) approach that is used for the EOSC and that has been used in building and expanding platforms, as Apple did (see frame).

In Apple’s case, the MP3 player represented a limited feature set (music only) but established the necessary minimum ecosystem (MVE). Next there was iTunes to add new music. But the big leap happened when they added partners who increased the feature set, moving towards the full rollout: Adding AT&T as a partner expanded the iPod into the iPhone, a more complete feature set. The rest is history.
Source:
https://smartorg.com/innovating-to-create-an-ecosystem/

 

Objectives

The initial objective of the SSHRD interest group will be to bring together major community members seeking to coordinate international social science research data sharing. As noted above, several existing groups within RDA are discussing topics of interest to the SSH disciplines, but none focuses specifically on the social sciences and humanities with data sharing as its key concern. We will bring individuals and communities together to provide this forum and define future working groups to help solve the issues identified.

 

Participation

We want to include all stakeholders and reach as broad an audience as possible. We will initially begin working with existing RDA groups and further discussions between CESSDA, DDI Alliance and IFDO to start the conversations. But we will immediately begin contacting the many other groups ranging from IASSIST to ICPSR that will be valuable in providing input to this interest group. Below is an initial matrix of the groups we feel would have interest in our efforts.

 

| | Social Sciences | All Sciences |
|---|---|---|
| Policies (Open Science) | IFDO | OECD, CODATA, ORFG |
| Infrastructures: Repositories, Trust/Certification, Archiving, Sensitive Data | CESSDA, ICPSR, ODUM, ADA; ERAN (EU), RAIRD (NO) | ESFRI, ARDC, Figshare, Mendeley, Zenodo/OpenAIRE, CoreTrustSeal, Dataverse, EGI (AAI) |
| Architecture: Digital Objects, PID, Metadata, Kernel Info | DDI | DOI / Handle, DataCite, ISO, RDA Data Type Registry |
| Data Production | ESS, SHARE, WageIndicator, GGP, EVS, WVS, ... | National Statistics, ILO, World Bank, ... |
| Researchers | Former SSRN? | YEAR (young researchers EU) |

 

Outcomes

  • Landscape analysis on sensitive data
    topics/issues, initiatives/activities/tools, best practices (and don’ts)

  • Implementation guide on data policies
    building on ORFG Blueprint, Science Europe DMP

  • Landscape analysis on tools for assessing data quality
    automated and integrated in existing data services tools (e.g. Dataverse, Figshare, Fedora)

 

Mechanism

We will begin by holding biweekly video conference calls in rotating time slots so that members in various time zones can take part. We will propose an initial IG meeting at the RDA Plenary in Philadelphia in spring 2019 to plan for the rest of the year. Initial calls will be coordinated by CESSDA, IFDO, and DDI Alliance, but as the group grows, we will add more participants and organizational representatives from the matrix above.

 

 

First Six Months Timeline

Early January 2019  - Draft of first IG meeting agenda in Philadelphia

Late January 2019 - First call and then every two weeks following

Late February 2019 - Finalize IG meeting agenda for Plenary

Early March 2019 - Contact list of organizations to invite to participate

April 2019 - RDA Plenary meeting

May 2019 - Suggested Working Group Topics

July 2019 - Final Draft of Working Group Suggestions

 

Initial and Potential Group Members

 

This group will initially be Co-Chaired by Jonathan Crabtree representing the Odum Institute, the International Federation of Data Organizations and the Global Dataverse Community Consortium, Ron Dekker representing the Consortium of European Social Science Data Archives (CESSDA) and Steve McEachern representing the DDI Alliance and the Australian Data Archive. Our initial Birds of a Feather meeting in Botswana was very well attended and we have thirty members from that meeting that are interested in joining this interest group. We will utilize our existing social science community network connections to increase this list significantly during the first six months once we begin advertising our intent to the larger RDA audience.

Review period start:
Saturday, 5 January, 2019 to Tuesday, 5 February, 2019

Name of Proposed Interest Group: Research Funders and Stakeholders on Open Research and Data Management Policies and Practices

 

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

 

Research funders have been participating in the RDA since the beginning, first through the Funders’ Forum and later by proposing Birds of a Feather sessions, in which funders from around the globe raise subjects that they wish to discuss publicly among themselves and on which they want to gather input from the RDA community. The aim of the proposed Interest Group is to provide a sustainable venue for these discussions. Funders are important RDA stakeholders, and the IG would consolidate their engagement with RDA and be fully aligned with the RDA principles of openness, harmonization and being community-driven. It would be a significant value-added contribution to the RDA community as a vehicle for open discussion of policy-driven topics.

 

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

 

There are currently few forums for research funders engaged in the open research and data management space to discuss their unique challenges, perspectives and opportunities. Creating a forum for research funders to initiate or further data-related discussions could result in lasting benefits for the research community (increased coordination, standardization and support) and the funders themselves in terms of avoiding duplication of efforts and building on existing expertise and resources.

 

Open research means making research outputs – including datasets, publications and code – widely and rapidly available. To realize the potential benefits of open research, research funders are implementing new open research policies, adopting new technologies and tools, and developing training and partnered initiatives. Although there is a lot of variation in approach, there are a number of commonalities, including a focus on data management plans, peer review support, working with institutions, and data requirements built into funding opportunities and initiatives.

 

These questions were addressed during a BoF session, “Research Funders on the Topic of Open Research and Data Management” (https://www.rd-alliance.org/research-funders-topic-open-research-and-data-management-rda-10th-plenary-bof-meeting), at P10, and during two BoF sessions held in Berlin, “Interest Group for Funders and Stakeholders on Open Research and Data Management Policies” (https://rd-alliance.org/funders-and-stakeholders-open-research-and-data-management-policies-rda-11th-plenary-bof-meeting). These sessions are the precursors of the proposed Interest Group. The second session at P12 was devoted to discussing the planning of the Interest Group. This IG proposal results from these preliminary discussions.

 

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):

 

The IG will provide a forum for research funders to update each other on their open research and data management policy development, and to foster discussion among research funders and other stakeholders on trends, challenges and opportunities related to developing, implementing and responding to open research and data management policies (such as best practices with respect to adjudicating DMPs, monitoring and compliance of policies, etc.). It can also be used for discussing policy alignments. It will initiate Working Group proposals when relevant.

 

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups):

 

The primary participants are research funders, but the IG aims to be open to any member of the RDA community interested in being involved. Attendance at the Montreal and Berlin BoFs demonstrates the community's interest in these discussions, which can guide the activities of other RDA Groups by providing a policy framework.

 

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

 

The IG will improve coordination between funders in domains relevant to the RDA. It will identify subjects of common interest as topics for discussion at each Plenary, and propose how to continue to foster the discussion among research funders at RDA (teleconferences between Plenaries, meetings at Plenaries, RDA Working Groups) or other venues.  The IG would also provide an important venue for engaging publicly with the broader data community, be they researchers, data professionals, publishers, etc.  Topics for a particular meeting would be proposed by funding organizations (with input welcomed from others).

 

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

 

The role of the chairs would be to collect proposals for discussion topics, obtain the members' approval, and organize presentations and discussions prior to each meeting.  The chairs would also work with the broader funding community to help coordinate any activities that may require the creation of a formal working group.  The chairs would report on plans and activities at the Funders Forum and interact with other IG chairs as per usual RDA practices.

 

 

Timeline (Describe draft milestones and goals for the first 12 months):

 

Topics currently being considered for a possible first meeting include:

 

  • Working with data repositories
  • Data management policy alignment
  • Data sharing across international borders
  • Data management monitoring and compliance
  • Peer/merit review committee training
  • Data management training for researchers and students
  • Creating incentives for data management and data sharing
  • Using persistent identifiers in funding systems

 

Members will vote on these topics (through a Doodle poll in early January).   The interim co-chairs will organize the first meeting.

 

 

Potential Group Members (Include proposed chairs/initial leadership and all members who have expressed interest):

 

Matthew Lucas (Social Sciences and Humanities Research Council of Canada) and Josh Greenberg (Sloan Foundation), the current co-chairs of the Funders Forum, will serve as the interim co-chairs of the proposed Interest Group pending the group’s proposed first meeting at the next RDA Plenary in Philadelphia.  At that time co-chairs representing geographic and gender diversity will be proposed.

 

The list below names the representatives of funding organizations who participated in the 12th RDA Plenary and who agreed with the proposal to create a standing Funders Interest Group.   The full list of Funders Forum participants (which varies from meeting to meeting) is considerably longer.

 

| First Name | Last Name | Organization | Country |
|---|---|---|---|
| Kevin | Ashley | DCC | United Kingdom |
| Claudia Maria | Bauzer Medeiros | FAPESP, Brazil | Brazil |
| Fran | Berman | RPI | USA |
| Juan | Bicarregui | UKRI / STFC | United Kingdom |
| Wade | Bishop | University of Tennessee - Belmont | USA |
| David | Carr | Wellcome Trust | United Kingdom |
| Ingrid | Dillo | DANS | The Netherlands |
| Lisa | Federer | National Library of Medicine | USA |
| Francoise | Genova | CNRS/Observatoire Astronomique de Strasbourg | France |
| Daniel | Goroff | Alfred P. Sloan Foundation | USA |
| Jason | Haga | AIST | Japan |
| Kazuhiro | Hayashi | National Institute of Science and Technology Policy | Japan |
| Edit | Herczog | Vision & Values | Belgium |
| Athanasios | Karalopoulos | EC-RTD | Belgium |
| Mark | Leggott | Research Data Canada | Canada |
| Matthew | Lucas | SSHRC, Canada | Canada |
| Yasuhiro | Murayama | ICSU-WDS | Japan |
| Yasuhiro | Murayama | National Institute of Information and Communications Technology | Japan |
| Kay | Raseroka | Joint Minds Consult | Botswana |
| Bob | Samors | Belmont Forum | Italy |
| Andrew | Treloar | Australian Research Data Commons | Australia |
| Ross | Wilkinson | Australian Research Data Commons | Australia |

 

 

Review period start:
Monday, 21 January, 2019 to Thursday, 21 February, 2019
Body:

Name of Proposed Interest Group: Data Properties as Economic Goods (Data Economics)

 

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

 

It is commonly recognised that data has value. However, the value of data differs from that of consumable goods. In support of this area of inquiry and development, a number of initiatives aim to create data markets and data exchange services.

 

As part of this trend, existing business models for paid or commercial data(set) services such as archives are based mainly on service subscription fees. In many cases the quality of datasets is assessed by an independent certification body or through expert peer review. This type of model is useful for specific use cases, but it does not provide a consistent way to treat data as an economic good and enable data commoditisation.  Another key factor is the growing interest in making the best use of research data, which resulted in the formulation of the FAIR (Findable – Accessible – Interoperable – Reusable) data principles, which are widely supported by industry and business.

 

However, emerging data-driven technologies and the data-driven economy are stimulating interest in treating data as a new form of economic value (data commoditisation) and, consequently, in identifying the new properties of data as economic goods. The following properties build on the FAIR data properties and are defined as STREAM for industrial and commoditised data:

 

[S] Sovereign

[T] Trusted

[R] Reusable

[E] Exchangeable

[A] Actionable

[M] Measurable

Other properties to be considered and necessary for defining workable business and operational models: nonrival nature of data, data ownership, data quality, measurable use of data, privacy, integrity, and provenance.

 

 

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

 

The proposed Interest Group is a direct and well-supported outcome of the BoF on Data as economic goods that took place at the RDA 12th Plenary on 5-8 Nov 2018 in Gaborone.  The BoF drew 29 participants from 13 countries and included four presentations. The agenda and notes are found at: https://www.rd-alliance.org/bof-data-properties-economic-goods-rda-12th-...

 

Following the presentations, a thoughtful and energetic discussion took place, and the need for continued discussion was explicit, leading to the unanimously supported recommendation to create a corresponding IG within the RDA community. The importance of addressing data economics and data market issues is supported by numerous national and international projects and initiatives in Europe (such as the MATES project, the DL4LD project, the International Data Spaces Association (IDSA), and others). Data markets are also a topic of the ICT139(a) call in the Horizon2020 work programme for 2019-2020, and they are an important component of the ongoing digitalisation of industry.

 

Creating consistent and workable models for data exchange and commoditisation is critical to facilitating research data usage and to creating new value-added, data-driven services that bring additional resources to research organisations. Consistent data pricing and data market models are equally important for government funded and sponsored research, open data and governmental data. A number of examples from industry demonstrate practical interest in facilitating and obtaining value from data exchange, interoperability, and the adoption of common architectures for data markets. These include, but are not limited to, the IDSA Architecture (https://www.internationaldataspaces.org/wp-content/uploads/2018/04/InternationalDataSpacesAssociation_ReferenzArchitecture2.0.pdf) and the recent Open Data Initiative (ODI) by Microsoft and its associates SAP and Adobe.

 

These developments, together with the mission and goals of the RDA, underscore the need for an interest group on Data Economics and the inter-related topics of data properties, the association between data and economic goods, data markets and other topics. An interest group will provide an important open forum for further exploration and discussion, and allow for important connections to be made to data economies as they intersect with other RDA IGs and WGs.

 

 

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):

 

The objective of this IG is the exchange of information about existing developments and initiatives and promotion of best practices in data markets and data marketplace operation.

  1. Provide an open forum and venue for gathering individuals interested in discussing and defining a set of properties that are related to data as economic goods
  2. Compile an initial agreed set of data properties and validate the currently denoted STREAM properties
  3. Identify key actions to further research topics and coordination actions
  4. Identify the set of documents to be produced by the IG, in particular a definition of STREAM-like properties and an overview of business models for research data and for commoditised data trading and exchange
  5. Provide input to other RDA WGs and IGs, as well as to other associations and standardisation bodies such as BDVA, IDSA, NIST, ISO and IEEE, using existing RDA channels or establishing new channels.

 

 

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups.):

 

IG members will provide an initial community to start working on the documents.

 

Interaction with other IGs and WGs is expected, and external experts and contributors will be invited. Interaction with external organisations such as NIST, IEEE and the International Data Spaces Association is also expected.

 

 

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

 

In the initial stage of 1-1.5 years, the IG plans to produce one or more information documents providing an overview of existing approaches and concepts related to the economic value of data, data pricing, and economic models for several use cases identified by the RDA community on using and managing research data.

 

Other general outcomes may include:

  • Increasing the use and sharing of data, in particular between the research community and industry
  • Establishing compatible metadata and measurable data properties
  • Bringing economic value to research based on better defined cooperation with industry

The IG activity will be driven by the community, and other expected outcomes will address community needs and requests. In particular, the IG will make the case for creating a taxonomy of the properties of data as economic goods, which could become a spin-off Working Group.

 

 

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

 

The IG will meet at the RDA Plenaries, manage discussion on the mailing list, and establish regular calls (expected monthly) between plenaries. Activity between plenaries will be sustained by work on the IG outcomes and documents.

 

Timeline (Describe draft milestones and goals for the first 12 months):

 

The suggested timeline is presented in the table below

 

| Time span or milestone | Activity | Goals and outcomes |
|---|---|---|
| 2019-2020 | Meetings at RDA Plenaries (first expected at RDA13, Spring 2019), twice a year; regular tel/Skype/video calls as required by work items (expected 3-4 between plenaries) | Build community |
| 2021 | Review IG results and outcomes and identify needs for a renewed Charter and mandate | Renewed IG Charter |
| September 2019 | Initial draft information document on Data properties as economic goods | Draft |
| RDA14, Autumn 2019 | Interacting with other RDA IGs/WGs and external standardisation bodies and organisations | IG network is built |
| Spring 2020 | Final draft on Data properties as economic goods. Call for IG and community comments. | Final draft published |
| Autumn 2020 | IG information finalisation | Document published by RDA |

 

 

 

 

 

Potential Group Members (Include proposed chairs/initial leadership and all members who have expressed interest):

 

 

| FIRST NAME | LAST NAME | EMAIL | TITLE |
|---|---|---|---|
| Yuri | Demchenko | y.demchenko@uva.nl | Senior Researcher, University of Amsterdam |
| Jane | Greenberg | jg3243@drexel.edu | Drexel University |
| Gary | Berg-Cross | gbergcross@gmail.com | Ontolog |
| Steven | Brewer | S.Brewer@soton.ac.uk | University of Southampton |
| Wouter | Los | W.Los@uva.nl | University of Amsterdam |
| Markus | Spiekermann | Markus.Spiekermann@isst.fraunhofer.de | Fraunhofer ISST |
| Sebastian | Steinbuss | @internationaldataspaces.org | International Data Spaces Association |
| Rebecca | Koskela | rkoskela@unm.edu | Univ of New Mexico |
| Keith | Jeffery | Keith.Jeffery@keithgjefferyconsultants.co.uk | Keith Jeffery Consultants |

  

Additionally, 27 attendees of the BoF on Data Economics will be added once the meeting sign-in sheet has been decoded.

Review period start:
Friday, 4 January, 2019 to Tuesday, 12 February, 2019
Body:

Case Statement

 

  1. WG charter
    1. Context

 

Technological advancements have made science more data intensive and interconnected, with researchers producing and sharing increasing volumes of research data. To maximise the value of science, research data (sets) should have four foundational characteristics; they should be:

  • 'Findable', i.e. discoverable with machine readable metadata, identifiable and locatable by means of a standard identification mechanism;
  • 'Accessible', i.e. available and obtainable;
  • 'Interoperable', i.e. both syntactically parseable and semantically understandable, allowing data exchange and reuse among scientific disciplines, researchers, institutions, organisations and countries; and
  • 'Reusable', i.e. sufficiently described and shared with the least restrictive licences, allowing the widest reuse possible across scientific disciplines and borders, and the least cumbersome integration with other data sources.

Findability, Accessibility, Interoperability and Reusability – the FAIR principles – were first introduced in 2014 and intend to define a minimal set of community-agreed guiding principles and practices that allow both machines and humans to find, access, interoperate and re-use research data. The FAIR principles define characteristics that contemporary research data resources, vocabularies and infrastructures should exhibit to assist discovery and reuse by third-parties and they can be further refined into a range of facets that have the potential to: a) improve scientific research, b) contribute to growth and accelerate innovation in a global digital economy, c) increase the reproducibility of research and d) better inform citizens and society about the results and value of research (through thorough and comprehensible description of the data sets).

    2. Problem

The aspirational nature of the FAIR data principles and their rapid adoption at international level have led to ambiguity and a wide range of interpretations of FAIRness, since the principles do not strictly define how to achieve a state of FAIRness but rather describe a continuum of features, attributes and behaviours that move a digital object closer to that goal. As a result, a number of incompatible methodologies to assess FAIRness have already been developed, and relevant work is under way in various groups.

 

Due to the lack of a common set of core assessment criteria for FAIRness, researchers and organisations cannot evaluate the readiness and implementation level of their datasets vis-à-vis the FAIR data principles in a coherent way. The majority of the available FAIR assessment frameworks: i) produce results which cannot be combined or compared and ii) do not allow a benchmark based on the comparison amongst peers. In addition, research performing organisations and data infrastructures cannot develop or follow a minimum set of shared guidelines to climb up the ladder of FAIR because of the increased heterogeneity of the offered FAIR metrics tools.

    3. Outcomes

The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will bring together stakeholders from different scientific and research disciplines, the industry and public sector, who are active and/or interested in the FAIR data principles and in particular in assessment criteria and methodologies for evaluating their real-life uptake and implementation level. The Working Group will develop as an RDA Recommendation a common set of core assessment criteria for FAIRness and a generic and expandable self-assessment model for measuring the maturity level of a dataset from the following perspectives:

  • Data findability, i.e. how well it describes the data it produces or manages with rich metadata, assigns to data/metadata a globally unique persistent identifier and registers or indexes them in a searchable resource;
  • Data accessibility, i.e. how well it allows the retrieval of its data/metadata by their identifier using a standardized communications protocol that is open, free and universally implementable;
  • Data interoperability, i.e. how well it ensures that the precise format and meaning of exchanged and shared data/metadata is preserved and understood;
  • Data reusability, i.e. how well it releases data/metadata with a clear and accessible data usage license and detailed provenance, and follows practices that promote the reuse and sharing of data, unless certain privacy or confidentiality restrictions apply.

In addition, the Working Group will design:

  • A self-assessment toolset that enables researchers and organisations to evaluate and improve the readiness and implementation level of their datasets vis-à-vis the FAIR data principles.
  • A lightweight version of the FAIR Data Maturity Model (aka FAIR data checklist), aiming to raise awareness of the main aspects related to the FAIR principles.

It will be possible to apply the outcomes of the Working Group not only to data in the conventional sense but also to data-related algorithms, tools, workflows, protocols and other data-related services produced or managed by the assessed entity.

  2. Value proposition

Given that the outcomes of the Working Group will be in the form of generic and reusable building blocks, researchers and organisations will be in a position to easily apply and extend them in order to address FAIR-related assessment needs specific to their own thematic disciplines and/or countries. That will increase the coherence and interoperability of existing or emerging FAIR assessment frameworks and it will ensure the combination and compatibility of their results in a meaningful way.

 

The outcomes of the Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will benefit:

  • Researchers, data stewards and other data professionals who are involved in the production and management of research data and have to follow good data management and data stewardship practices (which include the notions of data collection, annotation, archival and long-term care, either alone or in combination with newly generated data).
  • Data services owners (data infrastructures, data repositories, owners of commercial and open-source tools), who are responsible for setting up and maintaining data-related services and tools.
  • Organisations that capture, generate, manage, share, protect and preserve research data.
  • Policymakers who are responsible for defining data policies at international, European and national level.

The Working Group will provide to the aforementioned user categories an instrument with a three-fold nature:

  1. It will be descriptive, i.e. it will describe the as-is FAIR-related maturity level of a dataset,
  2. It will be prescriptive, i.e. it will provide guidance to researchers and organisations to improve the implementation of the FAIR data principles (aka 'FAIRness') through recommendations, and
  3. It will be comparative, i.e. it will allow a benchmark based comparison amongst peers.
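
The three roles listed above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the three-level scale, the one-score-per-principle structure and the function names are hypothetical, since the number of maturity stages and dimensions is left for the Working Group to decide during the design phase.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical maturity scale; the actual number of stages is not yet defined.
LEVELS = {0: "not implemented", 1: "partially implemented", 2: "fully implemented"}
PRINCIPLES = ("findable", "accessible", "interoperable", "reusable")


@dataclass
class Assessment:
    """Self-assessment of one dataset: one level per FAIR principle (assumed structure)."""
    dataset: str
    scores: dict  # principle name -> level key from LEVELS


def describe(a):
    """Descriptive: report the as-is maturity level per principle."""
    return {p: LEVELS[a.scores[p]] for p in PRINCIPLES}


def prescribe(a):
    """Prescriptive: list the principles that have not reached the top level."""
    top = max(LEVELS)
    return [p for p in PRINCIPLES if a.scores[p] < top]


def compare(a, peers):
    """Comparative: difference between a dataset's level and the peer average."""
    return {p: a.scores[p] - mean(peer.scores[p] for peer in peers) for p in PRINCIPLES}


# Usage with made-up scores
mine = Assessment("dataset-A", {"findable": 2, "accessible": 2, "interoperable": 1, "reusable": 0})
peers = [
    Assessment("dataset-B", {"findable": 1, "accessible": 1, "interoperable": 1, "reusable": 1}),
    Assessment("dataset-C", {"findable": 2, "accessible": 1, "interoperable": 1, "reusable": 2}),
]
print(describe(mine))
print(prescribe(mine))
print(compare(mine, peers))
```

A real model would likely assess several criteria per principle and attach concrete recommendations to each gap; the sketch only shows how a single set of scores can serve all three purposes.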

In addition, the outcomes of the Working Group are expected to:

  • Contribute to growth and accelerate innovation in a global digital economy: since data is becoming increasingly important for all aspects of the international economy, a common set of core assessment criteria and the FAIR Data Maturity Model will improve the readiness and capability of organisations to open up their data in a way that creates potential benefits for their investment plans (a specific example in Europe of the economic impact of opening up data is the Copernicus earth observation system).
  • Provide savings in money: the outcomes of the Working Group will ensure money savings to researchers and organisations as it will deliver a reusable solution for measuring the FAIRness of their data. Also, it will contribute to the improvement of their readiness and implementation level of the FAIR principles, which will lead to money savings from the reuse of high-quality data, the combination of data sets across borders or disciplines and the avoidance of duplication.
  • Provide savings in time for researchers and organisations aiming to implement the FAIR principles.
  • Increase transparency: better and faster implementation of the FAIR data principles will help to increase the reproducibility of research, which currently can be as low as 10-30% in key areas, such as cancer research. This can have a positive impact on the scientific principles of credibility, replication and further research, given that the scientific community has repeatedly experienced instances of misconduct and erroneous analyses, which may endanger whole scientific fields.
  3. Engagement with existing work in the area

The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" must build upon existing relevant efforts at international, European and sectorial level and will complement emerging activities (e.g. funded by the H2020 Work Programme 2018-20) that support the FAIR data uptake and compliance across borders/disciplines.

 

Research Data Alliance offers an ideal environment for an engagement of this kind because RDA can bring deep knowledge from the promotion of research data interoperability at disciplinary levels together with hands-on experience in leveraging such knowledge in order to improve interoperability amongst scientific disciplines too.

 

Two RDA groups with this twofold nature are the Disciplinary Collaboration Framework Interest Group and the Domain Repositories Interest Group, which both enhance communication with other RDA IGs and WGs and represent the interests of specific disciplines in those groups. The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will work closely with the aforementioned Interest Groups aiming to: a) capture interoperability needs between disciplines and in view of specific scientific challenges, b) gather relevant input from different disciplines and c) develop and apply a structured methodology for prioritising, harmonising and efficiently articulating inter-disciplinary needs.

 

In parallel, the Working Group will investigate opportunities for collaboration with existing or emerging RDA groups that address any aspect relevant to the implementation of the FAIR data principles, such as the:

  • Data Description Registry Interoperability (DDRI) WG
  • DMP Common Standards WG
  • Exposing Data Management Plans WG
  • RDA/FORCE11 FAIR Sharing Working Group
  • Metadata Standards Catalog WG
  • WDS/RDA Assessment of Data Fitness for Use WG
  • Research Data Repository Interoperability WG
  • RDA/CODATA Legal Interoperability IG
  • Data policy standardisation and implementation IG
  • Education and Training on handling of research data IG
  • Metadata IG
  • PID IG

In addition, the Working Group will engage with pertinent international and European actors and activities such as:

  • FAIRmetrics.org: a group collaborating with a broad set of stakeholders to design a framework for evaluating "FAIRness" that enables both qualitative and quantitative assessment of the degree to which online resources comply with the FAIR Data principles.
  • Horizontal or discipline-specific initiatives to measure the implementation of the FAIR data principles such as:
    • DANS FAIR data assessment tool: an online tool prototype which guides the user through a set of questions to assess a specific dataset.
    • ARDC FAIR Data self-assessment: a self-assessment tool designed predominantly for data librarians and IT staff to assess the 'FAIRness' of a dataset and determine how to enhance its FAIRness (where applicable).
    • CSIRO 5-star data rating tool: a tool that allows users to carry out a self-assessment based on 5 qualities of data – Findable, Accessible, Interoperable, Reusable and Trusted. For each quality, a number of specific questions have been curated to allow users to rate their data according to its current state.
  • GO FAIR: a community-led initiative to contribute to and coordinate the coherent development of the Internet of FAIR Data & Services. GO FAIR is analysing the possibility to organise a FAIR certification mechanism of services, tools, organisations, and people (including data stewards) aiming to help research funders and other stakeholders to promote open science, for instance by enabling researchers to incorporate a certified service in their data stewardship plans.
  • FORCE11 FAIR Data Management Plans Working Group: FORCE11, the international community/platform that hosted the open consultation for the definition of the 15 FAIR guiding principles in 2016, has established the "FAIR DMPs" Working Group aiming to provide a simple set of principles, along with examples of domain-specific implementations and recommendations for best practices, that emphasize good data management, stewardship and machine-readability for making data FAIR.
  • RDA/FORCE11 "FAIR Sharing: connecting data policies, standards & databases" Working Group (former FAIR Sharing WG): a use-case-driven joint effort between RDA and Force11 to develop: a) a set of recommendations to guide users and producers of databases and content standards to select and describe them, or recommend them in data policies, and b) a curated registry, which enacts the recommendations and assists a variety of end users, providing well described, interlinked, and cross-searchable records on content standards, databases and data policies.
  • CODATA: the Committee on Data of the International Council for Science (ICSU) that promotes global collaboration to improve the availability and usability of data for all areas of research.
  • Science Europe: an association of European Research Funding Organisations (RFO) and Research Performing Organisations (RPO), active in the field of alignment of the research data management policies and templates.
  • European Commission Expert Group on "Turning FAIR data into reality": established by the Commission, this expert group is working together with European and global initiatives towards a proposal for a FAIR Data Action Plan for consideration by the Commission, Member States and stakeholders in the research and data communities. The draft proposal presented by the Expert Group at the 2nd EOSC Summit on 11 June 2018 suggests the design of an agreed set of basic core FAIR metrics, which will be "standardised" and extendible in order to cover the needs and practices of different communities.
  • EU-funded projects (e.g. EOSC pilot, EOSC hub, FREYA, Open AIRE Advanced etc.) supporting the first phase in the development of the European Open Science Cloud (EOSC).
  • European Commission: DG RTD, DG CNECT, DG DIGIT and the Publications Office.
  4. Work Plan

The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will build on and combine the most salient characteristics of existing efforts for measuring the readiness and implementation level of a dataset vis-à-vis the FAIR data principles.

 

The outcomes of the Working Group (a common set of core assessment criteria for FAIRness in the form of a Recommendation and a generic and expandable self-assessment model for measuring the FAIR maturity level of a dataset) will be generic – and not specific to a certain discipline or country – and apply to any type of data in the conventional sense as well as to data-related algorithms, tools, workflows, protocols and other data-related services. They will be based on a core set of mutually exclusive and collectively exhaustive assessment criteria and be populated in a way that allows their extension in order to meet specific FAIR-related assessment needs, at national and/or discipline level (for example, for providing additional layers of detail for a number of discrete areas). Furthermore, the design method will allow the future provision of estimates of the costs and benefits for organisations, both in economic and non-economic terms, of moving their datasets to a higher FAIR maturity level.

 

The outcomes will be developed following a progressive approach via a number of iterations. In each iteration, the current structure of the FAIR assessment criteria and the maturity model will be examined and validated in order to evolve to a revised version. The development process will be open, ensuring an active and continuous engagement of user communities and stakeholders in all development phases (including scoping, construction and testing). For that purpose, well-defined working and decision-making mechanisms will be defined and agreed from the beginning in order to facilitate the operation of the Working Group.

 

The main phases and deliverables will be the following:

  1. A: Initiation: during the first phase, the exact scope of the work will be defined including the objectives, the usage and the purpose of the assessment criteria and the model. Similar assessment criteria and models will be systematically analysed in order to identify components that could be reused either as they are or after applying some improvements, aiming to avoid the duplication of efforts.

Main outputs:

  • Scope definition
  • Literature review: an overview of existing approaches (generic or specific-purpose)

Timeline: M1 – M2

  2. B: Stakeholder identification: the initiation phase will be followed by the identification of the main actors who will be related to the outcomes of the Working Group from three perspectives: development process, execution and interest in the results.

Main outputs:

  • Stakeholder matrix

Timeline: M3

  3. C: Design methodology: the Working Group will define and agree on a systematic, effective and efficient design methodology that will lead to results that are rigorous and both theoretically founded and empirically validated. The design methodology will follow an iterative approach, leveraging the most appropriate techniques for the population of the expected results. A special role should be foreseen for RDA Working Groups and Interest Groups with pertinent objectives.

Main outputs:

  • Design methodology

Timeline: M4 – M6

  1. D: Design: this phase will define all aspects with regard to the structure and the body of the FAIR assessment criteria and the model. The design phase will answer questions such as:
  • How many different maturity stages will be foreseen?
  • How many dimensions or layers will the model assess?
  • Will there be any documented maturation paths?
  • How many questions will be included in the model?
  • What will be the type of dependencies in the implementation of the foreseen model’s capabilities or attributes (implicit / explicit)?
  • Which techniques will be used for the population of the model (e.g. literature review, case study interviews, focus groups etc.)?
  • Will the measurement of maturity be quantitative and/or qualitative?

Main outputs:

  • Core assessment criteria for FAIR
  • FAIR data maturity model
  • FAIR data checklist

Timeline: M7 – M16

  1. E: Testing: the assessment criteria and the model will be verified and validated following a well-defined evaluation methodology.

Main outputs:

  • Testing results

Timeline: M13 – M16

  1. F: Delivery: once the main building blocks of the outcomes have been constructed, various characteristics regarding their distribution will be decided, such as what kinds of materials will be publicly available and in what format. The exact set of outputs of this phase will be progressively decided by the members of the Working Group.

Timeline: M17 – M18

  1. Adoption Plan

The members of the Working Group will incorporate the outcomes of their work in their policies and practices that promote the implementation of the FAIR data principles at national, discipline and international level. The set of core assessment criteria for FAIRness will be used as the basis to examine the compatibility and alignment of their instruments (such as recommendations, frameworks, templates, toolsets etc.) and any corrective activities will be planned and implemented in a coherent way. In addition, all members will systematically promote the outcomes to their specific communities, aiming to raise awareness and support their real-life adoption.

 

Furthermore, the Working Group will create and publish a number of guidelines for the extension of the core assessment criteria and the FAIR maturity model by user communities and organisations with specific needs in the evaluation of the implementation of the FAIR data principles. That will allow and facilitate a wider adoption of the outcomes of the Working Group by existing and emerging initiatives.

  1. Initial Membership

This is an initial membership, which gathers together representatives from organisations with experience in the area of FAIR data uptake, real-life implementation and compliance across borders/disciplines. The Working Group will progressively get in touch with discipline-specific initiatives to get their input too.

 

Co-chairs:

 

Review period start:
Sunday, 23 September, 2018 to Tuesday, 23 October, 2018
Body:
Review period start:
Sunday, 19 August, 2018
Body:

To mitigate food insecurity, how can TVET institutions be engaged in practice to transmit technical know-how to local people and help transform their lives? This is a case in Taita Taveta County, Kenya.

 

Review period start:
Wednesday, 4 July, 2018
Body:

Note:  The following text is deprecated in favor of the revised Case Statement attached.

 

 

RDA Working Group: Software Source Code Identification Case Statement

(this is a joint effort, coordinated with FORCE11)

 

Charter

(A concise articulation of what issues the WG will address within an 18-month time frame and what its “deliverables” or outcomes will be.)

 

Software, and in particular source code, plays an important role in science: it is used in all research fields to produce, transform and analyse research data, and is sometimes itself an object of research and/or an output of research.

 

Unlike research data and scientific articles, though, software source code has only very recently been recognised as important subject matter in a few initiatives related to scholarly publication and archiving. These initiatives are now working on a variety of plans for handling the identification of software artifacts.

 

At the same time, unlike research data and scientific articles, the overwhelming majority of software source code is developed and used outside the academic world, in industry and in developer communities where software is routinely referenced, in practice, through methods that are totally different from the ones used in scholarly publications.

 

The objective of this working group is to bring together a broad panel of stakeholders directly involved in software identification. The planned output will be concrete recommendations for the academic community to ensure that the solutions that will be adopted by the academic players are compatible with each other and especially with the software development practice of tens of millions of developers worldwide.

The output of this working group is highly relevant for the broader RDA community, because most research datasets are created and/or transformed using software, so a common standard for software identification will enable better traceability and reproducibility of research data.

Value Proposition

(A specific description of who will benefit from the adoption or implementation of the WG outcomes and what tangible impacts should result)

 

The planned outcomes of the working group are recommendations and guidelines for software artifact identification (in particular in its source code form), targeted specifically at scholarly stakeholders that are willing to integrate software artifacts into their workflows: scientific publishers, institutional repositories, and archives.

 

We believe that bringing together a broad panel of stakeholders is the best approach to avoid fragmentation in the emerging scholarly software identification landscape.

 

We also believe that connecting scholarly players with the daily practice of software development in industry will ease the adoption by these emerging scholarly initiatives of standards that are compatible with the well established practice of software development worldwide.

 

To this end, we plan to engage in a dialogue with software industry bodies and software foundations that are working on standard approaches for identification of software components, like the Linux Foundation. An endorsement from such organizations would have a significant positive impact, as a shared standard will allow one to refer to both research and industry software in exactly the same way.

 

Engagement with existing work in the area

(A brief review of related work and plan for engagement with any other activities in the area)

 

The initial participants of the working group are members of, or have direct connections with, the following related initiatives:

  • FORCE11 Software Citation Implementation WG

    This group builds on the previous FORCE11 Software Citation Working Group, which developed and published an initial set of software citation principles (https://doi.org/10.7717/peerj-cs.86). The activities of the Software Citation Implementation Working Group will be conducted with relevant stakeholders (publishers, librarians, archivists, funders, repository developers, other community forums with related working groups, etc.) to: endorse the principles; develop sets of guidelines for implementing the principles; help implement the principles; and test specific implementations of the principles.
  • Software Heritage

    The Software Heritage archive provides unique, intrinsic, persistent identifiers for over 7 billion software source code artifacts worldwide, and is tightly connected with industry players working on source code qualification (Intel, Microsoft, Google, GitHub, Nokia Bell Labs, etc.)

  • swMath

    swMath is a project that has indexed and referenced over 20,000 research software projects in mathematics.

  • DataCite

    DataCite, working with about 100 members and 1,500 repositories, is providing persistent identifiers in the form of DOIs to scholarly outputs, including software.

  • FREYA

    The European Commission-funded FREYA project provides persistent identifier infrastructure for the European Open Science Cloud, and is working on increasing the adoption of persistent identifiers, including for software.

  • OpenAIRE

    OpenAIRE is the European infrastructure in support of Open Science. It fosters and monitors the adoption of Open Science across Europe and beyond, at the country level for legal issues, and across borders to address requirements specific to research communities. In particular, it is building a portal indexing all open access articles, and will soon expand its scope to cover scientific software.
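
To illustrate what an intrinsic identifier, such as those mentioned for the Software Heritage archive above, can look like, here is a minimal Python sketch. It assumes the Git-style content hash (SHA-1 over a `blob <length>\0` header followed by the file bytes) and the `swh:1:cnt:` prefix from Software Heritage's published identifier scheme. The function names are hypothetical, and this is not code from any of the initiatives listed.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the Git-style SHA-1 of a file's bytes (assumed hashing scheme)."""
    # Git prefixes the content with "blob <size in bytes>\0" before hashing.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def content_identifier(content: bytes) -> str:
    """Build an identifier string for one file (hypothetical helper)."""
    # The "swh:1:cnt:" prefix is assumed here for illustration.
    return "swh:1:cnt:" + git_blob_sha1(content)


if __name__ == "__main__":
    source = b"print('hello')\n"
    print(content_identifier(source))
```

Because the identifier is derived from the bytes themselves, anyone holding the same file can recompute and check it without consulting a registry, which is what distinguishes an intrinsic identifier from an assigned one such as a DOI.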

Related RDA Groups

 

We have identified the following initial list of RDA groups whose activity and scope are related to this working group:

  • PID IG

  • Reproducibility IG

  • Data versioning WG

  • Research Data Provenance IG

  • Research Data Repository Interoperability WG

  • Repository Platforms for Research Data IG

Work Plan

(A specific and detailed description of how the WG will operate including)

 

The target outcome of the working group consists of the following documents, separated into two categories: medium-term goals and long-term goals.

 

Medium-term goals (M12)

  • An initial collection of software identification use cases and software identifier schemas.
  • An overview of the different contexts in which software artifact identification is relevant, including
    • Scientific reproducibility
    • Fine-grained reference to specific code fragments from scientific articles or documentation
    • Description of dependency information
    • Citation of software projects for proper credit attribution

 

Long-term goals (M18)

  • Call on other RDA groups, in particular those working on citation and versioning issues, for consultation on the draft guidelines

  • A set of guidelines for persistent software artifact identification, in each of the above contexts

 

Mode of operation

  • Open a GitHub repository where issues are used to discuss topics and meetings are documented.
  • Schedule a monthly online conference call or group email reporting the progress made during the month and opening issues for discussion.
  • Schedule meetings during the 12th, 13th, 14th and 15th plenaries (18M).

Timeline

Nov 18: [12th plenary] first meeting start discussion on medium-term goals

Dec 18 - Mar 19: medium-term goals

Apr 19: [13th plenary] progress report

May 19 - Aug 19: medium-term goals and long-term goals

Sep 19: [14th plenary] medium-term goals report and draft Long-term deliverable

Oct 19 - Feb 20: long-term goals

Mar 20: [15th plenary] outputs publication

 

Adoption Plan

(A specific plan for adoption or implementation of the WG outcomes within the organizations and institutions represented by WG members, as well as plans for adoption more broadly within the community. Such adoption or implementation should start within the 18 month timeframe before the WG is complete.)

 

Adoption by organizations and institutions represented by WG members

 

The first key step to broad adoption is to get the guidelines endorsed and adopted by all the initiatives that are represented in this working group: they are significant catalysts for adoption in the academic community.

 

Adoption by the academic community

 

The software identification guidelines are a stepping stone for software citation, where an identifier is needed to specify the exact software referenced. Their recommendations will therefore be the first output formalizing the way software source code should be referenced in the academic community. Adoption of the guidelines could provide a consensual solution for identifying software when citing it. They will be the first document produced by the academic community for software identification, at a time when software is starting to be considered a legitimate product of research, and their adoption will ensure a standardized approach to identifying software in scholarly workflows that is compatible with the well-established practice of software development.

 

Initial Membership

(A specific list of initial members of the WG and a description of initial leadership of the WG.)

 

| First Name | Last Name | Email | Institution | Role |
| --- | --- | --- | --- | --- |
| Roberto | Di Cosmo | roberto[at]dicosmo.org | Inria/Software Heritage | co-Chair |
| Neil | Chue Hong | N.ChueHong[at]software.ac.uk | SSI | |
| Martin | Fenner | martin.fenner[at]datacite.org | DataCite / FREYA | |
| Daniel S. | Katz | d.katz[at]ieee.org | University of Illinois | |
| Andrea | Dell’Amico | | OpenAIRE (ISTI-CNR, Italy) | |
| Peter | Doorn | | DANS | |
| Suenje | Dallmeier-Tiessen | | CERN | |
| Wolfram | Sperber | | swMATH | |
| Brian | Matthews | | STFC | |
| Morane | Gruenpeter | | Software Heritage / Crossminer | |

 

Review period start:
Friday, 15 June, 2018 to Sunday, 15 July, 2018
Body:

Name of Proposed Interest Group:

Interest Group on an Open Questionnaire for Research Data Sharing Survey

 

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

 

The open data landscape is changing rapidly and we are only beginning to understand the impact of policies and changes in researchers’ practice. As open data policies are implemented and data sharing practices evolve, comprehensive benchmarking and tracking of open data practices can serve to illuminate advances in data sharing (where and by whom) and help to understand the reasons for different data sharing practices. Various survey reports have examined the gap between policy and daily research practices and how it might be bridged, but the granularity and coverage of surveys need addressing, as open data practices vary widely between scientific disciplines and regions. In addition, existing surveys differ widely in their questions and respondent groups. The RDA community, inclusive of researchers, practitioners, and decision-makers, would benefit from a coordinated, common open survey approach that could be adopted and implemented to track changes in practice and policy over time.

 

Following a successful BoF session at the RDA 10th Plenary Meeting, we would like to initiate an Interest Group to: 1. develop community-designed, modular and interoperable open survey questionnaire(s); 2. determine how such open survey(s) can be implemented; and 3. explore how the open survey results could be analyzed globally.

 

 

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

 

Several stakeholders would be interested in using a freely available survey questionnaire to investigate the progress, evolution and developments in data sharing among their constituents. Such stakeholders include but are not limited to policy makers, researchers and their research institutions, funding bodies and companies. The potential use cases for such stakeholders are:

Use case 1: The G7 Science Ministers have organized a working party for Open Science since 2016. The latest meeting communiqué, from 2017 (Turin, Italy www.g7italy.it/sites/default/files/documents/G7%20Science%20Communiqu%C3%A9.pdf), stated the importance of research metrics and indicators for Open Science. This IG and the survey(s) that it will generate would contribute directly and indirectly to this call for action.

 

Use case 2: As a result of the publication of the report ‘Open Data: a researcher perspective’, report partners CWTS and Elsevier were approached by the European Commission to inquire whether the questionnaire used for the survey in their report could be used to run a survey among Horizon 2020 participants. Running the survey could have informed the European Commission, as both a policy maker and a funder, about the practices of researchers funded through Horizon 2020.

In 2017, the European Commission launched a call for tender for the development of the next generation of its Open Science Monitor. The call specification included the requirement to run a survey on data sharing. The tender was won by a consortium including both CWTS and Elsevier, which will lead in 2018 to a revised version of the survey used for the report mentioned above. This use case highlights the potential impact an open survey on data sharing could have for players such as the European Commission.

Use case 3: The National Institute of Science and Technology Policy (NISTEP) in Japan conducted “A Survey on Open Research Data and Open Access” to investigate Japan’s current status and challenges in Open Science. The authors surveyed Japanese researchers in NISTEP’s Science and Technology Experts Network in Nov-Dec 2016. Respondents were asked about their experience of sharing and using articles and data, their awareness of open research data, the sufficiency of resources, and the support researchers need. The response rate was very high (70.5%) and the results have already been used in a discussion on Open Science in the government’s Cabinet Office.

Use case 4: Publishers and related companies are conducting similar surveys. For example, Digital Science published their report “The State of Open Data” with the results of a global survey of 2,000 researchers in 2016. This survey assessed the global landscape around open data and sharing practices. It highlighted the extent of awareness around open data, the incentives around its use, and the perspectives researchers have about making their own research data open. Such reports allow publishers and related companies to evaluate trends in data sharing in the research community and to develop and offer well-suited solutions. They also help researchers understand the potential of data sharing and enhance their practices.

Use case 5: Science Granting Councils Initiative (SGCI). The national research funding organizations involved in this peer-learning network (https://sgciafrica.org/en-za), which seeks to strengthen research planning and management capabilities, are interested in the potential of open data to support their research communities. The survey instrument proposed here would create an opportunity for this network to contribute to and benefit from its application. As research councils in this region formulate their strategies, comparable survey data would provide useful guidance to orient their policies and practices. Work in this direction would also provide an opportunity for SGCI members to contribute to a wider Global Research Council call for action on open data ‘to compare and learn from their emerging practices, and collaborate on training and outreach activities.’

 

 

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):

 

Through this Interest Group, our objectives are three-fold:

  1. Develop the User Community:
  • Participation and perspectives: to support application of proposed activities, initial efforts will focus on involving and understanding the perspectives of users who have commissioned and are interested in commissioning open data surveys. What did they seek to understand through their survey? What data proved useful and for what purpose?
  • Promote dialogue among questionnaire designers and survey users
  2. Develop a community-designed modular and interoperable open survey(s):
  • Horizon scanning: identify and analyze existing surveys to compare similarities and differences in topics addressed and indicators used; identify relevant stakeholders.
  • Engagement & recruitment: engage with stakeholders (those who have commissioned and developed such surveys and other relevant ones identified in the previous step) to raise awareness of this IG, assess their interest in collaborating, and identify the skills and expertise available.
  • Develop survey(s) and/or survey modules taking advantage of existing ones, available and willing expertise, focusing on communities with high interest and willingness to participate. Survey(s) could, for example, be developed in a step-wise approach developing specific pilots for specific communities, geographies, etc. (e.g. initial interest from funding bodies community could lead to a first Working Group).
  • Address language and cultural differences to identify common grounds that can be applied globally.
  3. Determine how such open survey(s) could be implemented and results analyzed globally:
  • Assess available existing technical tools to choose the solution(s) best fitted for our purpose.
  • Identify multiplier networks, such as societies or associations, that can help the survey(s) reach a wide audience.
  • Run survey(s) using above tool(s) and networks.
  • Investigate opportunities to perform survey analysis in a coordinated fashion, reducing costs and avoiding duplication of effort.
  • This coordinated approach also seeks to promote reliability in the analysis of surveys and comparability of findings, which should allow for better benchmarking.

 

 

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups.):

 

We anticipate this IG will attract a broad-based audience. During the BoF session at the 10th RDA Plenary Meeting, we engaged scholars, research administrators, research funders, government representatives, publishers and research data service providers. Given the interest in the topic and the potential benefits flowing from comparable and harmonized survey data, we are confident this IG will be able to engage representatives from these sectors on an ongoing basis.

 

In order to be successful, this IG will need to count on skills and knowledge from experts who have experience in developing survey questionnaires as well as in analysing and interpreting the results of these surveys. Accordingly, we will aim to engage and recruit the authors of existing surveys into the IG. Experts in research policies in local settings and specific disciplines will also be essential for the IG. This is why we plan to involve experts from various scientific communities.

We will engage stakeholders through our personal networks and by direct interaction when names are available (e.g. authors of existing survey reports; the list of 70+ participants in the BoF session we organized at the 10th Plenary in Montreal). The IG will also run an engagement exercise through RDA plenaries and side events, or additional relevant events yet to be identified (e.g. submitting an abstract to SciDataCon-IDW 2018).

 

Coordination with other RDA groups will take two forms:

  1. Some IGs and WGs have activities that could have common interests with our IG, as they deal with policy, rewards, metrics, etc., which are key to the development of data practices and policies. The two relevant IGs are Data policy standardization and implementation IG and Sharing rewards and credit IG. There might additionally be overlapping interests with the following groups: Data usage metrics WG, Exposing data management plans WG and Mapping the landscape IG.
  2. There are many RDA IGs representing specific scientific communities that we will want to engage, as they may be interested in exploring data sharing practices through a survey for their own community. Examples of scientific community IGs: Agricultural Data IG, Biodiversity Data Integration IG, Chemistry Research Data IG, Digital Practices in History and Ethnography IG, Linguistics Data IG, Health Data IG, etc.

Similarly there are non-scientific communities also represented as IGs that we will want to engage such as Early Career and Engagement IG.

We will contact and communicate with co-chairs of those IGs in the first months of the IG and meet those interested at the two RDA Plenaries in year 1.

 

 

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

 

The first outcome of the IG will be the development of community-designed, modular and interoperable open survey questionnaire(s). These questionnaires will be made open and freely available. We anticipate this contribution will promote the use of surveys by organizations that would like to better understand research data sharing practices and/or policy effects. Being open and freely available, the questionnaires will lower barriers for organizations that have the interest but would otherwise be unable to perform such an undertaking for lack of expertise and/or resources.

 

The second outcome of the IG will be a consequence of the first, as the survey(s) will provide results to track changes in practice and policy over time. These results will help articulate better policies, identify existing gaps, prioritize research funding, and initiate projects and initiatives such as research infrastructure.

 

The third outcome depends on the success of the third objective described above (analyzing survey(s) globally), which, if fruitful, could yield an aggregated analysis of survey results. This last outcome will help improve reliability in the analysis of surveys and comparability of findings. It would also allow for better benchmarking, leading to global or regional views on research data sharing but also global approaches to data sharing in specific communities (e.g. universities).

 

While each of these outcomes could be driven through dedicated Working Groups, it is more likely that WGs emerge for specific time-limited tasks. The first outcome might, for example, lead to pilots for the development of a survey module targeting a specific stakeholder community (e.g. funding bodies). The second outcome could lead to a WG looking into the policy dimensions of data sharing. And the third outcome might in itself become a WG to explore a centralized survey(s) analysis scheme.

Finally, this survey might become a good example of an RDA-based survey in the context of Open Science and data sharing: a community-based development involving various stakeholders that establishes a de facto standard for conducting global and comprehensive surveys.

 

 

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

 

An IG ‘Executive Committee’, composed of a limited group (max. 12-15), will meet virtually by video conference and/or phone on a monthly basis during the first year. The Executive Committee will focus its efforts on generating awareness, engagement, and recruitment of stakeholders into an IG community. The frequency of calls will be assessed for the following years depending on coordination needs and support generated by emerging sub-groups and WGs, who might lead on IG activities for limited periods of time. In addition to the IG co-chairs, the executive committee will take on board a representative from each sub-group and/or WG who will be the main contact point and who will report progress and decisions.

Sub-groups and/or Working Groups are indeed likely to emerge as the activities of the IG develop. Each will have a specific task and/or community focus. Their work will be self-organized but the co-chairs will create a structure to coordinate and support their contributions. Their interactions could take place virtually or physically (e.g. regionally localized group).

A calendar of events (e.g. conferences, workshops, etc.) will be created in order to collect and use opportunities for physical meetings beyond the RDA Plenaries.

 

More broadly, should the IG be approved and collaborations initiated to develop common survey instruments, we are aware that the issues of governance and infrastructure will need to be specifically addressed by the IG. Such support could be obtained, for example, through international organisations such as the OECD, which have an interest in developing policy indicators.

 

 

Timeline (Describe draft milestones and goals for the first 12 months):

 

The Gantt chart below shows how objectives 1 and 2 described above will start in year 1 and overlap, with objective 1 starting earlier because its initial outcomes are needed to move objective 2 forward. Objective 3 will start during phase 3 as we start exploring how survey(s) can be analyzed globally.

The modular approach suggested below will allow each survey/module developed in subsequent phases to benefit from expertise, experience and networks developed in previous surveys/modules. The process will be streamlined, increasing efficiency and allowing the use of Standard Operating Procedures developed and improved in previous phases.

Review period start:
Monday, 29 January, 2018