
Body:

Name of Proposed Interest Group: Data Properties as Economic Goods (Data Economics)

 

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

 

It is commonly recognised that data has value. However, the value of data differs from that of consumable goods. A number of initiatives to create data markets and data exchange services support this area of inquiry and development.

 

As part of this trend, existing business models for paid or commercial data(set) services, such as archives, are mostly based on service subscription fees. In many cases, dataset quality is assessed by an independent certification body or through peer review by experts. This type of model is useful for specific use cases, but it does not provide a consistent model for treating data as an economic good and enabling data commoditisation. Another key factor is the growing interest in making the best use of research data, which resulted in the formulation of the FAIR (Findable, Accessible, Interoperable, Reusable) data principles, which are widely supported by industry and business.

 

However, emerging data-driven technologies and the data-driven economy are stimulating interest in treating data as a new form of economic value (data commoditisation) and, consequently, in identifying the new properties of data as economic goods. The following properties build on the FAIR data properties and are defined as STREAM for industrial and commoditised data:

 

[S] Sovereign

[T] Trusted

[R] Reusable

[E] Exchangeable

[A] Actionable

[M] Measurable

Other properties to be considered, which are necessary for defining workable business and operational models, include the nonrival nature of data, data ownership, data quality, measurable use of data, privacy, integrity, and provenance.

 

 

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

 

The proposed Interest Group is a direct and well-supported outcome of the BoF on Data as economic goods that took place at the RDA 12 Plenary on 5-8 Nov 2018 in Gaborone. The BoF drew 29 participants from 13 countries and included four presentations. The agenda and notes are found at: https://www.rd-alliance.org/bof-data-properties-economic-goods-rda-12th-...

 

Following the presentations, a thoughtful and energetic discussion took place, and the need for continued discussion was explicit, leading to the unanimously supported recommendation to create a corresponding IG within the RDA community. The importance of addressing data economics and data market issues is supported by numerous national and international projects and initiatives in Europe (such as the MATES project, the DL4LD project, the International Data Spaces Association (IDSA), and others). Data markets are also a topic of the ICT139(a) call in the Horizon2020 work programme for 2019-2020, and they are an important component of the ongoing digitalisation of industry.

 

Creating consistent and workable models for data exchange and commoditisation is critical to facilitating research data usage and the creation of new value-added, data-driven services that bring additional resources to research organisations. Consistent data pricing and data market models are equally important for government-funded and sponsored research, open data and governmental data. A number of examples from industry demonstrate practical interest in facilitating and obtaining value from data exchange, interoperability, and the adoption of common architectures for data markets. These include, but are not limited to, the IDSA Architecture (https://www.internationaldataspaces.org/wp-content/uploads/2018/04/InternationalDataSpacesAssociation_ReferenzArchitecture2.0.pdf), the recent Open Data Initiative (ODI) by Microsoft and its associates SAP and Adobe, and others.

 

These developments, together with the mission and goals of the RDA, support and underscore the need for an interest group on Data Economics and the inter-related topics of data properties, the association between data and economic goods, data markets, and others. An interest group will provide an important open forum for further exploration and discussion, and will allow important connections to be made to data economies as they intersect with other RDA IGs and WGs.

 

 

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):

 

The objective of this IG is the exchange of information about existing developments and initiatives and promotion of best practices in data markets and data marketplace operation.

  1. Provide an open forum and venue for gathering individuals interested in discussing and defining a set of properties related to data as economic goods
  2. Compile an initial agreed set of data properties and validate the currently denoted STREAM properties
  3. Identify key actions to further research topics and coordination actions
  4. Identify the set of documents to be produced by the IG, in particular a definition of STREAM-like properties and an overview of business models for research data and commoditised data trading and exchange
  5. Provide input to other RDA WGs and IGs, as well as to other associations and standardisation bodies such as BDVA, IDSA, NIST, ISO and IEEE, using existing RDA channels or establishing new channels.

 

 

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups.):

 

IG members will provide an initial community to start working on the documents.

 

Interaction with other IGs and WGs is expected, and external experts and contributors will be invited. Interaction with external organisations is also expected, including NIST, IEEE, and the International Data Spaces Association.

 

 

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

 

During the initial stage of 1 to 1.5 years, the IG plans to produce one or more information documents providing an overview of existing approaches and concepts related to the economic value of data, data pricing, and economic models related to several use cases identified by the RDA community for using and managing research data.

 

Other general outcomes may include:

  • Increased use and sharing of data, in particular between the research community and industry
  • Establishing compatible metadata and measurable data properties
  • Bringing economic value to research based on better-defined cooperation with industry

The IG activity will be community driven, and other expected outcomes will address community needs and requests. In particular, the IG will make the case for creating a taxonomy of the properties of data as economic goods, which could become the subject of a spin-off Working Group.

 

 

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

 

The IG will meet at the RDA Plenaries, manage discussion on the list, and establish regular calls (expected monthly) between plenaries. Activity between plenaries will be driven by work on the IG outcomes and documents.

 

Timeline (Describe draft milestones and goals for the first 12 months):

 

The suggested timeline is presented in the table below.

| Time span or milestone | Activity | Goals and outcomes |
| --- | --- | --- |
| 2019-2020 | Meetings at RDA Plenaries (first expected at RDA13, Spring 2019), twice a year; regular tel/skype/video calls as required by work items (expected 3-4 between plenaries) | Build community |
| 2021 | Review IG results and outcomes and identify needs for a renewed Charter and mandate | Renewed IG Charter |
| September 2019 | Initial draft information document on Data properties as economic goods | Draft |
| RDA14, Autumn 2019 | Interacting with other RDA IGs/WGs and external standardisation bodies and organisations | IG network is built |
| Spring 2020 | Final draft on Data properties as economic goods. Call for IG and community comments. | Final draft published |
| Autumn 2020 | IG information document finalisation | Document published by RDA |

 

 

 

 

 

Potential Group Members (Include proposed chairs/initial leadership and all members who have expressed interest):

 

 

| FIRST NAME | LAST NAME | EMAIL | TITLE |
| --- | --- | --- | --- |
| Yuri | Demchenko | y.demchenko@uva.nl | Senior Researcher, University of Amsterdam |
| Jane | Greenberg | jg3243@drexel.edu | Drexel University |
| Gary | Berg-Cross | gbergcross@gmail.com | Ontolog |
| Steven | Brewer | S.Brewer@soton.ac.uk | University of Southampton |
| Wouter | Los | W.Los@uva.nl | University of Amsterdam |
| Markus | Spiekermann | Markus.Spiekermann@isst.fraunhofer.de | Fraunhofer ISST |
| Sebastian | Steinbuss | @internationaldataspaces.org | International Data Spaces Association |
| Rebecca | Koskela | rkoskela@unm.edu | Univ of New Mexico |
| Keith | Jeffery | Keith.Jeffery@keithgjefferyconsultants.co.uk | Keith Jeffery Consultants |

  

Additionally, 27 attendees of the BoF on Data Economics will be added once the meeting sign-in sheet has been decoded.

Review period start:
Friday, 4 January, 2019 to Tuesday, 12 February, 2019
Body:

Case Statement

 

  1. WG charter
    1. Context

 

Technological advancements have made science more data intensive and interconnected, with researchers producing and sharing increasing volumes of research data. To maximise the value of science, research data (sets) should have four foundational characteristics; they should be:

  • 'Findable', i.e. discoverable with machine readable metadata, identifiable and locatable by means of a standard identification mechanism;
  • 'Accessible', i.e. available and obtainable;
  • 'Interoperable', i.e. both syntactically parseable and semantically understandable, allowing data exchange and reuse among scientific disciplines, researchers, institutions, organisations and countries; and
  • 'Reusable', i.e. sufficiently described and shared with the least restrictive licences, allowing the widest reuse possible across scientific disciplines and borders, and the least cumbersome integration with other data sources.

Findability, Accessibility, Interoperability and Reusability (the FAIR principles) were first introduced in 2014 and are intended to define a minimal set of community-agreed guiding principles and practices that allow both machines and humans to find, access, interoperate and re-use research data. The FAIR principles define characteristics that contemporary research data resources, vocabularies and infrastructures should exhibit to assist discovery and reuse by third parties. They can be further refined into a range of facets that have the potential to: a) improve scientific research, b) contribute to growth and accelerate innovation in a global digital economy, c) increase the reproducibility of research and d) better inform citizens and society about the results and value of research (through thorough and comprehensible description of the data sets).

    2. Problem

The aspirational nature of the FAIR data principles and their rapid adoption at international level have led to ambiguity and a wide range of interpretations of FAIRness, since the principles do not strictly define how to achieve a state of FAIRness but rather describe a continuum of features, attributes and behaviours that move a digital object closer to that goal. As a result, a number of incompatible methodologies to assess FAIRness have already been developed, and relevant work is under way by various groups.

 

Due to the lack of a common set of core assessment criteria for FAIRness, researchers and organisations cannot evaluate the readiness and implementation level of their datasets vis-à-vis the FAIR data principles in a coherent way. The majority of the available FAIR assessment frameworks: i) produce results which cannot be combined or compared and ii) do not allow a benchmark based on the comparison amongst peers. In addition, research performing organisations and data infrastructures cannot develop or follow a minimum set of shared guidelines to climb up the ladder of FAIR because of the increased heterogeneity of the offered FAIR metrics tools.

    3. Outcomes

The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will bring together stakeholders from different scientific and research disciplines, the industry and public sector, who are active and/or interested in the FAIR data principles and in particular in assessment criteria and methodologies for evaluating their real-life uptake and implementation level. The Working Group will develop as an RDA Recommendation a common set of core assessment criteria for FAIRness and a generic and expandable self-assessment model for measuring the maturity level of a dataset from the following perspectives:

  • Data findability, i.e. how well it describes the data it produces or manages with rich metadata, assigns to data/metadata a globally unique persistent identifier and registers or indexes them in a searchable resource;
  • Data accessibility, i.e. how well it allows the retrieval of its data/metadata by their identifier using a standardized communications protocol that is open, free and universally implementable;
  • Data interoperability, i.e. how well it ensures that the precise format and meaning of exchanged and shared data/metadata is preserved and understood;
  • Data reusability, i.e. how well it releases data/metadata with a clear and accessible data usage license, associated with detailed provenance, and follows practices that promote the reuse and sharing of data, unless certain privacy or confidentiality restrictions apply.

In addition, the Working Group will design:

  • A self-assessment toolset that enables researchers and organisations to evaluate and improve the readiness and implementation level of their datasets vis-à-vis the FAIR data principles.
  • A lightweight version of the FAIR Data Maturity Model (aka FAIR data checklist), aiming to raise awareness of the main aspects related to the FAIR principles.

It will be possible to apply the outcomes of the Working Group not only to data in the conventional sense but also to data-related algorithms, tools, workflows, protocols and other data-related services produced or managed by the assessed entity.

  2. Value proposition

Given that the outcomes of the Working Group will be in the form of generic and reusable building blocks, researchers and organisations will be in a position to easily apply and extend them in order to address FAIR-related assessment needs specific to their own thematic disciplines and/or countries. That will increase the coherence and interoperability of existing or emerging FAIR assessment frameworks and it will ensure the combination and compatibility of their results in a meaningful way.

 

The outcomes of the Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will benefit:

  • Researchers, data stewards and other data professionals who are involved in the production and management of research data and have to follow good data management and data stewardship practices (which include the notions of data collection, annotation, archival and long-term care, either alone or in combination with newly generated data).
  • Data services owners (data infrastructures, data repositories, owners of commercial and open-source tools), who are responsible for setting up and maintaining data-related services and tools.
  • Organisations that capture, generate, manage, share, protect and preserve research data.
  • Policymakers who are responsible for defining data policies at international, European and national level.

The Working Group will provide to the aforementioned user categories an instrument with a three-fold nature:

  1. It will be descriptive, i.e. it will describe the as-is FAIR-related maturity level of a dataset,
  2. It will be prescriptive, i.e. it will provide guidance to researchers and organisations to improve the implementation of the FAIR data principles (aka 'FAIRness') through recommendations, and
  3. It will be comparative, i.e. it will allow a benchmark-based comparison amongst peers.

In addition, the outcomes of the Working Group are expected to:

  • Contribute to growth and accelerate innovation in a global digital economy: since data is becoming increasingly important for all aspects of the international economy, a common set of core assessment criteria and the FAIR Data Maturity Model will improve the readiness and capability of organisations to open up their data in a way that creates potential benefits for their investment plans (a specific example in Europe of the economic impact of opening up data is the Copernicus earth observation system).
  • Provide savings in money: the outcomes of the Working Group will ensure money savings to researchers and organisations as it will deliver a reusable solution for measuring the FAIRness of their data. Also, it will contribute to the improvement of their readiness and implementation level of the FAIR principles, which will lead to money savings from the reuse of high-quality data, the combination of data sets across borders or disciplines and the avoidance of duplication.
  • Provide savings in time for researchers and organisations aiming to implement the FAIR principles.
  • Increase transparency: better and faster implementation of the FAIR data principles will help to increase the reproducibility of research, which currently can be as low as 10-30% in key areas, such as cancer research. This can have a positive impact on the scientific principles of credibility, replication and further research, given that the scientific community has repeatedly experienced instances of misconduct and erroneous analyses, which may endanger whole scientific fields.
  3. Engagement with existing work in the area

The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" must build upon existing relevant efforts at international, European and sectoral level and will complement emerging activities (e.g. those funded by the H2020 Work Programme 2018-20) that support FAIR data uptake and compliance across borders/disciplines.

 

Research Data Alliance offers an ideal environment for an engagement of this kind because RDA can bring deep knowledge from the promotion of research data interoperability at disciplinary levels together with hands-on experience in leveraging such knowledge in order to improve interoperability amongst scientific disciplines too.

 

Two RDA groups with this twofold nature are the Disciplinary Collaboration Framework Interest Group and the Domain Repositories Interest Group, which both enhance communication with other RDA IGs and WGs and represent the interests of specific disciplines in those groups. The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will work closely with the aforementioned Interest Groups aiming to: a) capture interoperability needs between disciplines and in view of specific scientific challenges, b) gather relevant input from different disciplines and c) develop and apply a structured methodology for prioritising, harmonising and efficiently articulating inter-disciplinary needs.

 

In parallel, the Working Group will investigate opportunities for collaboration with existing or emerging RDA groups that address any aspect relevant to the implementation of the FAIR data principles, such as the:

  • Data Description Registry Interoperability (DDRI) WG
  • DMP Common Standards WG
  • Exposing Data Management Plans WG
  • RDA/FORCE11 FAIR Sharing Working Group
  • Metadata Standards Catalog WG
  • WDS/RDA Assessment of Data Fitness for Use WG
  • Research Data Repository Interoperability WG
  • RDA/CODATA Legal Interoperability IG
  • Data policy standardisation and implementation IG
  • Education and Training on handling of research data IG
  • Metadata IG
  • PID IG

In addition, the Working Group will engage with pertinent international and European actors and activities such as:

  • FAIRmetrics.org: a group collaborating with a broad set of stakeholders to design a framework for evaluating "FAIRness" that enables both qualitative and quantitative assessment of the degree to which online resources comply with the FAIR Data principles.
  • Horizontal or discipline-specific initiatives to measure the implementation of the FAIR data principles such as:
    • DANS FAIR data assessment tool: an online tool prototype which guides the user through a set of questions to assess a specific dataset.
    • ARDC FAIR Data self-assessment: a self-assessment tool designed predominantly for data librarians and IT staff to assess the 'FAIRness' of a dataset and determine how to enhance its FAIRness (where applicable).
    • CSIRO 5-star data rating tool: a tool that allows users to carry out a self-assessment based on 5 qualities of data – Findable, Accessible, Interoperable, Reusable and Trusted. For each quality, a number of specific questions have been curated to allow users to rate their data according to its current state.
  • GO FAIR: a community-led initiative to contribute to and coordinate the coherent development of the Internet of FAIR Data & Services. GO FAIR is analysing the possibility to organise a FAIR certification mechanism of services, tools, organisations, and people (including data stewards) aiming to help research funders and other stakeholders to promote open science, for instance by enabling researchers to incorporate a certified service in their data stewardship plans.
  • FORCE11 FAIR Data Management Plans Working Group: FORCE11, the international community/platform that hosted the open consultation for the definition of the 15 FAIR guiding principles in 2016, has established the "FAIR DMPs" Working Group aiming to provide a simple set of principles, along with examples of domain-specific implementations and recommendations for best practices, that emphasize good data management, stewardship and machine-readability for making data FAIR.
  • RDA/FORCE11 "FAIR Sharing: connecting data policies, standards & databases" Working Group (former FAIR Sharing WG): a use-case-driven joint effort between RDA and FORCE11 to develop: a) a set of recommendations to guide users and producers of databases and content standards to select and describe them, or recommend them in data policies, and b) a curated registry, which enacts the recommendations and assists a variety of end users, providing well described, interlinked, and cross-searchable records on content standards, databases and data policies.
  • CODATA: the Committee on Data of the International Council for Science (ICSU) that promotes global collaboration to improve the availability and usability of data for all areas of research.
  • Science Europe: an association of European Research Funding Organisations (RFO) and Research Performing Organisations (RPO), active in the field of alignment of the research data management policies and templates.
  • European Commission Expert Group on "Turning FAIR data into reality": established by the Commission, this expert group is working together with European and global initiatives towards a proposal for a FAIR Data Action Plan for consideration by the Commission, Member States and stakeholders in the research and data communities. The draft proposal presented by the Expert Group at the 2nd EOSC Summit on 11 June 2018 suggests the design of an agreed set of basic core FAIR metrics, which will be "standardised" and extendible in order to cover the needs and practices of different communities.
  • EU-funded projects (e.g. EOSCpilot, EOSC-hub, FREYA, OpenAIRE-Advance etc.) supporting the first phase in the development of the European Open Science Cloud (EOSC).
  • European Commission: DG RTD, DG CNECT, DG DIGIT and the Publications Office.
  4. Work Plan

The Working Group "FAIR data maturity model: core criteria to assess the implementation level of the FAIR data principles" will build on and combine the most salient characteristics of existing efforts for measuring the readiness and implementation level of a dataset vis-à-vis the FAIR data principles.

 

The outcomes of the Working Group (a common set of core assessment criteria for FAIRness in the form of a Recommendation, and a generic and expandable self-assessment model for measuring the FAIR maturity level of a dataset) will be generic, not specific to a certain discipline or country, and will apply to any type of data in the conventional sense as well as to data-related algorithms, tools, workflows, protocols and other data-related services. They will be based on a core set of mutually exclusive and collectively exhaustive assessment criteria and be populated in a way that allows their extension in order to meet specific FAIR-related assessment needs at national and/or discipline level (for example, by providing additional layers of detail for a number of discrete areas). Furthermore, the design method will in the future allow the provision of estimates of the costs and benefits for organisations, in both economic and non-economic terms, of moving their datasets to a higher FAIR maturity level.

 

The outcomes will be developed following a progressive approach via a number of iterations. In each iteration, the current structure of the FAIR assessment criteria and the maturity model will be examined and validated in order to evolve to a revised version. The development process will be open, ensuring an active and continuous engagement of user communities and stakeholders in all development phases (including scoping, construction and testing). For that purpose, well-defined working and decision-making mechanisms will be defined and agreed from the beginning in order to facilitate the operation of the Working Group.

 

The main phases and deliverables will be the following:

  1. A: Initiation: during the first phase, the exact scope of the work will be defined including the objectives, the usage and the purpose of the assessment criteria and the model. Similar assessment criteria and models will be systematically analysed in order to identify components that could be reused either as they are or after applying some improvements, aiming to avoid the duplication of efforts.

Main outputs:

  • Scope definition
  • Literature review: an overview of existing approaches (generic or specific-purpose)

Timeline: M1 – M2

  2. B: Stakeholder identification: the initiation phase will be followed by the identification of the main actors who will be related to the outcomes of the Working Group from three perspectives: development process, execution and interest in the results.

Main outputs:

  • Stakeholder matrix

Timeline: M3

  3. C: Design methodology: the Working Group will define and agree on a systematic, effective and efficient design methodology that will lead to results that are rigorous and both theoretically founded and empirically validated. The design methodology will follow an iterative approach, leveraging the most appropriate techniques for the population of the expected results. A special role should be foreseen for RDA Working Groups and Interest Groups with pertinent objectives.

Main outputs:

  • Design methodology

Timeline: M4 – M6

  4. D: Design: this phase will define all aspects with regard to the structure and the body of the FAIR assessment criteria and the model. The design phase will answer questions such as:
  • How many different maturity stages will be foreseen?
  • How many dimensions or layers will the model assess?
  • Will there be any documented maturation paths?
  • How many questions will be included in the model?
  • What will be the type of dependencies in the implementation of the foreseen model's capabilities or attributes (implicit / explicit)?
  • Which techniques will be used for the population of the model (e.g. literature review, case study interviews, focus groups etc.)?
  • Will the measurement of maturity be quantitative and/or qualitative?

Main outputs:

  • Core assessment criteria for FAIR
  • FAIR data maturity model
  • FAIR data checklist

Timeline: M7 – M16

  5. E: Testing: the assessment criteria and the model will be verified and validated following a well-defined evaluation methodology.

Main outputs:

  • Testing results

Timeline: M13 – M16

  6. F: Delivery: once the main building blocks of the outcomes have been constructed, various characteristics regarding their distribution will be decided, such as what kinds of materials will be publicly available and in what format. The exact set of outputs of this phase will be progressively decided by the members of the Working Group.

Timeline: M17 – M18

  5. Adoption Plan

The members of the Working Group will incorporate the outcomes of their work in their policies and practices that promote the implementation of the FAIR data principles at national, discipline and international level. The set of core assessment criteria for FAIRness will be used as the basis to examine the compatibility and alignment of their instruments (such as recommendations, frameworks, templates, toolsets etc.) and any corrective activities will be planned and implemented in a coherent way. In addition, all members will systematically promote the outcomes to their specific communities, aiming to raise awareness and support their real-life adoption.

 

Furthermore, the Working Group will create and publish a number of guidelines for the extension of the core assessment criteria and the FAIR maturity model by user communities and organisations with specific needs in the evaluation of the implementation of the FAIR data principles. That will allow and facilitate a wider adoption of the outcomes of the Working Group by existing and emerging initiatives.

  6. Initial Membership

This is an initial membership, which gathers together representatives from organisations with experience in the area of FAIR data uptake, real-life implementation and compliance across borders/disciplines. The Working Group will progressively get in touch with discipline-specific initiatives to get their input too.

 

Co-chairs:

 

Review period start:
Sunday, 23 September, 2018 to Tuesday, 23 October, 2018
Body:
Review period start:
Sunday, 19 August, 2018
Body:

In mitigating food insecurity, how can TVET institutions be practically engaged to pass on technical know-how to local people and so transform their lives? This is a case in Taita Taveta County, Kenya.

 

Review period start:
Wednesday, 4 July, 2018
Body:

Note:  The following text is deprecated in favor of the revised Case Statement attached.

 

 

RDA Working Group: Software Source Code Identification Case Statement

(this is a joint effort, coordinated with FORCE11)

 

Charter

(A concise articulation of what issues the WG will address within an 18-month time frame and what its "deliverables" or outcomes will be.)

 

Software, and in particular source code, plays an important role in science: it is used in all research fields to produce, transform and analyse research data, and is sometimes itself an object of research and/or an output of research.

 

Unlike research data and scientific articles, though, software source code has only very recently been recognised as important subject matter in a few initiatives related to scholarly publication and archiving. These initiatives are now working on a variety of plans for handling the identification of software artifacts.

 

At the same time, unlike research data and scientific articles, the overwhelming majority of software source code is developed and used outside the academic world, in industry and in developer communities where software is routinely referenced, in practice, through methods that are totally different from the ones used in scholarly publications.

 

The objective of this working group is to bring together a broad panel of stakeholders directly involved in software identification. The planned output will be concrete recommendations for the academic community to ensure that the solutions that will be adopted by the academic players are compatible with each other and especially with the software development practice of tens of millions of developers worldwide.

The output of this working group is highly relevant for the broader RDA community, because most research datasets are created and/or transformed using software, so a common standard for software identification will enable better traceability and reproducibility of research data.

Value Proposition​

(A specific description of who will benefit from the adoption or implementation of the WG outcomes and what tangible impacts should result)

 

The planned outcomes of the working group are recommendations and guidelines for software artifact identification (in particular in its source code form), targeted specifically at scholarly stakeholders that are willing to integrate software artifacts into their workflows: scientific publishers, institutional repositories, and archives.

 

We believe that bringing together a broad panel of stakeholders is the best approach to avoid fragmentation in the emerging scholarly software identification landscape.

 

We also believe that connecting scholarly players with the daily practice of software development in industry will ease the adoption by these emerging scholarly initiatives of standards that are compatible with the well established practice of software development worldwide.

 

To this end, we plan to engage in a dialogue with software industry bodies and software foundations that are working on standard approaches for identification of software components, like the Linux Foundation. An endorsement from such organizations would have a significant positive impact, as a shared standard will allow one to refer to both research and industry software in exactly the same way.

 

Engagement with existing work in the area​

(A brief review of related work and plan for engagement with any other activities in the area)

 

The initial participants of the working group are members of, or have direct connections with, the following related initiatives:

  • FORCE11 Software Citation Implementation WG

    This group builds on the previous FORCE11 Software Citation Working Group, which developed and published an initial set of software citation principles (https://doi.org/10.7717/peerj-cs.86). The activities of the Software Citation Implementation Working Group will be conducted with relevant stakeholders (publishers, librarians, archivists, funders, repository developers, other community forums with related working groups, etc.) to: endorse the principles; develop sets of guidelines for implementing the principles; help implement the principles; and test specific implementations of the principles.

  • Software Heritage

    The Software Heritage archive provides unique, intrinsic, persistent identifiers for over 7 billion software source code artifacts worldwide, and is tightly connected with industry players working on source code qualification (Intel, Microsoft, Google, GitHub, Nokia Bell Labs, etc.)

  • swMath

    swMath is a project that has indexed and referenced over 20,000 research software projects in mathematics.

  • DataCite

    DataCite, working with about 100 members and 1,500 repositories, is providing persistent identifiers in the form of DOIs to scholarly outputs, including software.

  • FREYA

    The European Commission-funded FREYA project provides persistent identifier infrastructure for the European Open Science Cloud, and is working on increasing the adoption of persistent identifiers, including for software.

  • OpenAIRE

    OpenAIRE is the European infrastructure in support of Open Science. It fosters and monitors the adoption of Open Science across Europe and beyond, at the country level for legal issues and across borders to address the specific requirements of research communities. In particular, it is building a portal indexing all open access articles, and will soon expand its scope to cover scientific software.
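
To illustrate what an "intrinsic" identifier, as mentioned for Software Heritage above, can mean in practice, the following minimal sketch derives an identifier from the bytes of a source file alone, using the hashing convention Git applies to file contents. The function names are hypothetical, and the `swh:1:cnt:` prefix reflects our understanding of the scheme Software Heritage documents for file contents; it is an illustration, not a recommendation of this working group.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Hash file content the way Git hashes a blob: a short header plus the raw bytes."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def intrinsic_identifier(content: bytes) -> str:
    """Build an identifier string from the content hash alone (no registry lookup).

    The prefix is an assumption made for illustration.
    """
    return "swh:1:cnt:" + git_blob_sha1(content)


# Anyone holding the same bytes computes the same identifier independently.
source = b"print('hello')\n"
print(intrinsic_identifier(source))
```

Because the identifier is computed from the artifact itself, two parties can check that they refer to the same source code without consulting a central resolver. Identifiers such as DOIs, by contrast, are assigned by a registry.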

Related RDA Groups

 

We have identified the following initial list of RDA groups whose activity and scope are related to this working group:

  • PID IG

  • Reproducibility IG

  • Data versioning WG

  • Research Data Provenance IG

  • Research Data Repository Interoperability WG

  • Repository Platforms for Research Data IG

Work Plan​

(A specific and detailed description of how the WG will operate including)

 

The target outcome of the working group consists of the following documents, which can be separated into two categories: medium-term goals and long-term goals.

 

Medium-term goals (M12)

  • An initial collection of software identification use cases and software identifier schemas.
  • An overview of the different contexts in which software artifact identification is relevant, including
    • Scientific reproducibility
    • Fine grained reference to specific code fragments from scientific articles or documentation
    • Description of dependency information
    • Citation of software projects for proper credit attribution

 

Long-term goals (M18)

  • Call on other RDA groups, in particular those working on citation and versioning issues, for consultation on the draft guidelines

  • A set of guidelines for persistent software artifact identification, in each of the above contexts

 

Mode of operation

  • Open a GitHub repository where issues are used to discuss topics and meetings are documented.
  • Schedule a monthly online conference call or group email to report the progress made during the month and to open issues for discussion.
  • Schedule meetings during the 12th, 13th, 14th and 15th plenaries (18M)

Timeline

Nov 18: [12th plenary] first meeting start discussion on medium-term goals

Dec 18 - Mar 19: medium-term goals

Apr 19: [13th plenary] progress report

May 19 - Aug 19: medium-term goals and long-term goals

Sep 19: [14th plenary] medium-term goals report and draft Long-term deliverable

Oct 19 - Feb 20: long-term goals

Mar 20: [15th plenary] outputs publication

 

Adoption Plan​

(A specific plan for adoption or implementation of the WG outcomes within the organizations and institutions represented by WG members, as well as plans for adoption more broadly within the community. Such adoption or implementation should start within the 18 month timeframe before the WG is complete.)

 

Adoption by organizations and institutions represented by WG members

 

The first key step to broad adoption is to get the guidelines endorsed and adopted by all the initiatives that are represented in this working group: they are significant catalysts for adoption in the academic community.

 

Adoption by the academic community

 

The software identification guidelines are a stepping stone for software citation, where an identifier is needed to specify exactly which software is referenced. Their recommendations will therefore be the first output formalizing how software source code should be referenced in the academic community, and their adoption could provide a consensual solution for identifying software when citing it. They will be the first document produced by the academic community for software identification at a time when software is starting to be considered a legitimate product of research, and their adoption will ensure a standardized approach to identifying software in scholarly workflows that is compatible with the well-established practice of software development.

 

Initial Membership​

(A specific list of initial members of the WG and a description of initial leadership of the WG.)

 

First Name | Last Name | Email | Institution | Role
Roberto | Di Cosmo | roberto[at]dicosmo.org | Inria/Software Heritage | co-Chair
Neil | Chue Hong | N.ChueHong[at]software.ac.uk | SSI |
Martin | Fenner | martin.fenner[at]datacite.org | DataCite, FREYA |
Daniel S. | Katz | d.katz[at]ieee.org | University of Illinois |
Andrea | Dell’Amico | | OpenAIRE (ISTI-CNR, Italy) |
Peter | Doorn | | DANS |
Suenje | Dallmeier-Tiessen | | CERN |
Wolfram | Sperber | | swMATH |
Brian | Matthews | | STFC |
Morane | Gruenpeter | | Software Heritage / Crossminer |

 

Review period start:
Friday, 15 June, 2018 to Sunday, 15 July, 2018
Custom text:
Body:

Name of Proposed Interest Group:

Interest Group on an Open Questionnaire for Research Data Sharing Survey

 

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

 

The open data landscape is changing rapidly and we are only beginning to understand the impact of policies and changes in researchers’ practice. As open data policies are implemented and data sharing practices evolve, comprehensive benchmarking and tracking of open data practices can serve to illuminate advances in data sharing (where and by whom) and help to understand the reasons for different data sharing practices. Various survey reports have examined the gap between policy and daily research practices and how it might be bridged, but the granularity and coverage of surveys need addressing, as open data practices vary widely between scientific disciplines and regions. In addition, existing surveys differ widely in their questions and respondent groups. The RDA community, inclusive of researchers, practitioners, and decision-makers, would benefit from a coordinated, common open survey approach that could be adopted and implemented to track changes in practice and policy over time.

 

Following a successful BoF session at the RDA 10th Plenary Meeting, we would like to initiate an Interest Group to: 1. develop community-designed, modular and interoperable open survey questionnaire(s); 2. determine how such open survey(s) can be implemented; and 3. explore how the results of the open survey(s) could be analyzed globally.

 

 

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

 

Several stakeholders would be interested in using a freely available survey questionnaire to investigate the progress, evolution and developments in data sharing among their constituents. Such stakeholders include, but are not limited to, policy makers, researchers and their research institutions, funding bodies and companies. The potential use cases for such stakeholders are:

Use case 1: The G7 Science Ministers have convened a working party on Open Science since 2016. The latest meeting communiqué in 2017 (Turin, Italy www.g7italy.it/sites/default/files/documents/G7%20Science%20Communiqu%C3%A9.pdf) stated the importance of research metrics and indicators for Open Science. This IG and the survey(s) that it will generate would contribute directly and indirectly to this call for action.

 

Use case 2: Following the publication of the report ‘Open Data: a researcher perspective’, report partners CWTS and Elsevier were approached by the European Commission, which inquired whether the questionnaire used for the survey in their report could be used to run a survey among Horizon 2020 participants. Running the survey could have informed the European Commission about the practice of researchers funded through Horizon 2020, thus providing useful data to the European Commission as both a policy maker and a funder.

In 2017, the European Commission launched a call for tender for the development of the next generation of its Open Science Monitor. The call specification included the requirement of running a survey on data sharing. The tender was won by a consortium including both CWTS and Elsevier, which will lead in 2018 to a revised version of the survey implemented for the report mentioned above. This use case highlights the potential impact an open survey on data sharing could have for players such as the European Commission.

Use case 3: The National Institute of Science and Technology Policy (NISTEP) in Japan conducted “A Survey on Open Research Data and Open Access” to investigate Japan’s current status and challenges in Open Science. The authors surveyed Japanese researchers in NISTEP’s Science and Technology Experts Network in November-December 2016. Respondents were asked about their experience of sharing and using articles and data, their awareness of open research data, the sufficiency of resources, and the support researchers need. The response rate was very high (70.5%), and the results have already been used in a discussion on Open Science in the government’s Cabinet Office.

Use case 4: Publishers and related companies are conducting similar surveys. For example, Digital Science published their report “The State of Open Data” with the results of a global survey of 2,000 researchers in 2016. This survey assessed the global landscape around open data and sharing practices. It highlighted the extent of awareness around open data, the incentives around its use, and the perspectives researchers have about making their own research data open. Such reports allow publishers and related companies to evaluate trends in data sharing in the research community in order to develop and offer best-fitted solutions. They also help researchers understand the potential of data sharing and enhance their practices.

Use case 5: Science Granting Councils Initiative (SGCI). The national research funding organizations involved in this peer-learning network (https://sgciafrica.org/en-za), which seeks to strengthen research planning and management capabilities, are interested in the potential of open data to support their research communities. The survey instrument proposed here would create an opportunity for this network to contribute to and benefit from its application. As research councils in this region formulate their strategies, comparable survey data would provide useful guidance to orient their policies and practices. Work in this direction would also provide an opportunity for SGCI members to contribute to a wider Global Research Council call for action on open data ‘to compare and learn from their emerging practices, and collaborate on training and outreach activities.’

 

 

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):

 

Through this Interest Group, our objectives are three-fold:

  1. Develop the User Community:
  • Participation and perspectives: to support application of proposed activities, initial efforts will focus on involving and understanding the perspectives of users who have commissioned or are interested in commissioning open data surveys. What did they seek to understand through their survey? What data proved useful and for what purpose?
  • Promote dialogue among questionnaire designers and survey users.
  2. Develop a community-designed modular and interoperable open survey(s):
  • Horizon scanning: identify and analyze existing surveys to compare similarities and differences in topics addressed and indicators used; identify relevant stakeholders.
  • Engagement & recruitment: engage with stakeholders (those who have commissioned and developed such surveys and other relevant ones identified in the previous step) to raise awareness of this IG, assess their interest in collaborating, and identify the skills and expertise available.
  • Develop survey(s) and/or survey modules, taking advantage of existing ones and of available and willing expertise, and focusing on communities with high interest and willingness to participate. Survey(s) could, for example, be developed in a step-wise approach, with specific pilots for specific communities, geographies, etc. (e.g. initial interest from the funding bodies community could lead to a first Working Group).
  • Address language and cultural differences to identify common ground that can be applied globally.
  3. Determine how such open survey(s) could be implemented and results analyzed globally:
  • Assess available technical tools to choose the solution(s) best fitted to our purpose.
  • Identify multiplier networks, such as societies or associations, that can help the survey(s) achieve deep reach.
  • Run survey(s) using the above tool(s) and networks.
  • To reduce costs and avoid duplication of effort, we will investigate opportunities to perform survey analysis in a coordinated fashion.
  • This coordinated approach also seeks to promote reliability in the analysis of surveys and comparability of findings, which should allow for better benchmarking.
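
One way to picture a "modular and interoperable" questionnaire, as described in the objectives above, is a shared core module combined with community-specific modules, where each question carries a stable identifier so that answers from different surveys can be lined up. The following is only a sketch under that assumption; the class names, question identifiers and question texts are hypothetical and are not outputs of the IG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str        # stable identifier shared across surveys (hypothetical scheme)
    text: str
    options: tuple


@dataclass(frozen=True)
class Module:
    name: str
    questions: tuple


def build_questionnaire(*modules):
    """Combine modules into one questionnaire, rejecting clashing question ids."""
    seen = {}
    for module in modules:
        for question in module.questions:
            if question.qid in seen:
                raise ValueError(f"duplicate question id: {question.qid}")
            seen[question.qid] = question
    return list(seen.values())


def comparable_questions(survey_a, survey_b):
    """Return the ids of questions that two questionnaires have in common."""
    return sorted({q.qid for q in survey_a} & {q.qid for q in survey_b})


core = Module("core", (
    Question("core.share", "Have you shared research data in the last year?", ("yes", "no")),
    Question("core.reuse", "Have you reused data shared by others?", ("yes", "no")),
))
funders = Module("funders", (
    Question("fund.policy", "Does your funder require a data management plan?", ("yes", "no", "unsure")),
))
history = Module("history", (
    Question("hist.sources", "Do you digitise archival sources?", ("yes", "no")),
))

funder_survey = build_questionnaire(core, funders)
history_survey = build_questionnaire(core, history)
print(comparable_questions(funder_survey, history_survey))
```

Under this design, only the questions in the shared core can be benchmarked across communities, which is why agreeing on that core early matters for the global analysis envisaged in the third objective.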

 

 

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups.):

 

We anticipate this IG will attract a broad-based audience. During the BoF session at the 10th RDA Plenary Meeting, we engaged scholars, research administrators, research funders, government representatives, publishers and research data service providers. Given the interest in the topic and the potential benefits flowing from comparable and harmonized survey data, we are confident this IG will be able to engage representatives from these sectors on an ongoing basis.

 

In order to be successful, this IG will need to draw on the skills and knowledge of experts who have experience in developing survey questionnaires as well as in analysing and interpreting survey results. Accordingly, we will aim to recruit the authors of existing surveys into the IG. Experts in research policies in local settings and specific disciplines will also be essential for the IG. This is why we plan to involve experts from various scientific communities.

We will engage stakeholders through our personal networks and by direct interaction when names are available (e.g. authors of existing survey reports; the list of 70+ participants in the BoF session we organized at the 10th Plenary in Montreal). The IG will also run an engagement exercise through RDA plenaries and side events, or additional relevant events to be identified (e.g. submitting an abstract to SciDataCon-IDW 2018).

 

Coordination with other RDA groups will take two forms:

  1. Some IGs and WGs have activities that could share common interests with our IG, as they deal with policy, rewards, metrics, etc., which are key to the development of data practices and policies. The two most relevant IGs are the Data policy standardization and implementation IG and the Sharing rewards and credit IG. There might additionally be overlapping interests with the following groups: Data usage metrics WG, Exposing data management plans WG and Mapping the landscape IG.
  2. There are many RDA IGs representing specific scientific communities that we will want to engage, as they might be interested in exploring data sharing practices through a survey of their own community. Examples of scientific community IGs: Agricultural Data IG, Biodiversity Data Integration IG, Chemistry Research Data IG, Digital Practices in History and Ethnography IG, Linguistics Data IG, Health Data IG, etc.

Similarly there are non-scientific communities also represented as IGs that we will want to engage such as Early Career and Engagement IG.

We will contact and communicate with co-chairs of those IGs in the first months of the IG and meet those interested at the two RDA Plenaries in year 1.

 

 

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

 

The first outcome of the IG will be the development of community-designed, modular and interoperable open survey questionnaire(s). These questionnaires will be made open and freely available. We anticipate this contribution will promote the use of surveys by organizations that would like to better understand research data sharing practices and/or policy effects. Making the questionnaires open and freely available will reduce barriers for organizations that have the interest but would otherwise be unable to undertake such a survey for lack of expertise and/or resources.

 

The second outcome of the IG will be a consequence of the first, as the survey(s) will provide results to track changes in practice and policy over time. These results will help articulate better policies, identify existing gaps, prioritize research funding, and initiate projects and initiatives, for example research infrastructure.

 

The third outcome depends on the success of the third objective described above (analyzing survey(s) globally), which, if fruitful, could yield an aggregated analysis of survey results. This last outcome will help improve reliability in the analysis of surveys and comparability of findings. It would also allow for better benchmarking, leading to global or regional views on research data sharing as well as global approaches to data sharing in specific communities (e.g. universities).

 

While each of these outcomes could be driven through dedicated Working Groups, it is more likely that WGs emerge for specific time-limited tasks. The first outcome might, for example, lead to pilots for the development of a survey module targeting a specific stakeholder community (e.g. funding bodies). The second outcome could lead to a WG looking into the policy dimensions of data sharing. And the third outcome might in itself become a WG to explore a centralized survey(s) analysis scheme.

Finally, this survey might become a good example of an RDA-based survey in the context of Open Science and data sharing: a community-based development involving various stakeholders that establishes a de facto standard for conducting global and comprehensive surveys.

 

 

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

 

An IG ‘Executive Committee’, composed of a limited group (max. 12-15), will meet virtually by video conference and/or phone on a monthly basis during the first year. The Executive Committee will focus its efforts on generating awareness, engagement, and recruitment of stakeholders into an IG community. The frequency of calls will be assessed for the following years depending on coordination needs and support generated by emerging sub-groups and WGs, who might lead on IG activities for limited periods of time. In addition to the IG co-chairs, the executive committee will take on board a representative from each sub-group and/or WG who will be the main contact point and who will report progress and decisions.

Sub-groups and/or Working Groups are indeed likely to emerge as the activities of the IG develop. Each will have a specific task and/or community focus. Their work will be self-organized, but the co-chairs will create a structure to coordinate and support their contributions. Their interactions could take place virtually or physically (e.g. in regionally localized groups).

A calendar of events (e.g. conferences, workshops, etc.) will be created in order to collect and use opportunities for physical meetings beyond the RDA Plenaries.

 

More broadly, should the IG be approved and collaborations initiated to develop common survey instruments, we are aware that the issues of governance and infrastructure will need to be specifically addressed by the IG. Such support could be obtained, for example, through international organisations like the OECD, which have an interest in developing policy indicators.

 

 

Timeline (Describe draft milestones and goals for the first 12 months):

 

The Gantt chart below shows how objectives 1 and 2 described above will start in year 1 and will overlap with an earlier start for objective 1 as its initial outcomes are needed to move forward objective 2. Objective 3 will start during phase 3 as we start exploring how survey(s) can be analyzed globally.

The modular approach suggested below will allow each survey/module developed in subsequent phases to benefit from the expertise, experience and networks developed in previous surveys/modules. The process will be streamlined, thus increasing efficiency and allowing the use of Standard Operating Procedures developed and improved in previous phases.

Review period start:
Monday, 29 January, 2018
Custom text:
Body:

*** PLEASE NOTE: This Case Statement text has been revised. See the attached Case Statement with revisions to access the updated text. ****

 

 

  1. WG Charter 

The Health Data IG is sponsoring the idea of establishing a WG focusing on Blockchain in Health data, as a technologically advanced solution for securing data sharing among clinical institutions and individuals.

Simply stated, the blockchain is a cryptographic protocol which makes it possible to run a distributed, public and trustable ledger where transactions referring to digital objects are signed with issuer and recipient’s identities, verified by a community of peers and stored as incremental “blocks” into a shared database. Beyond technicalities, the true disruption of the blockchain lies in the fact that it brings digital trust over a potentially un-trustable network.
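
The idea of transactions stored as incremental blocks can be sketched in a few lines. The example below is a deliberately simplified, single-machine illustration in which each block records the hash of the previous one; it leaves out the digital signatures and peer verification mentioned above, and all function and field names are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full content in a deterministic way."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transaction: dict) -> dict:
    """Add a block that points at the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {"index": len(chain), "prev_hash": prev_hash, "transaction": transaction}
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check that every block still matches the hash recorded by its successor."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


ledger = []
append_block(ledger, {"issuer": "hospital-A", "recipient": "lab-B", "object": "dataset-42"})
append_block(ledger, {"issuer": "lab-B", "recipient": "lab-C", "object": "dataset-42"})
append_block(ledger, {"issuer": "hospital-A", "recipient": "lab-C", "object": "dataset-7"})
print(is_valid(ledger))
```

Altering an earlier block changes its hash, so the link stored in the following block no longer matches and the change becomes detectable. In a real deployment this check is run by many independent peers rather than by a single party, and the most recent block is protected by signatures and consensus rather than by a successor.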

The deployment of such a ledger at large scale can enable health data transactions based (when needed) on the appropriate patient’s consent and/or the hospital’s permission and operated through self-enacting smart contracts, in combination with a catalogue of all available data, which would be browsable at anytime, anywhere and by anyone, yet containing no sensitive information.
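
As a rough illustration of the combination described above, the following sketch pairs a catalogue that holds only non-sensitive descriptions with a rule that grants or denies a request automatically according to recorded consent. It is plain Python rather than an actual smart contract, and every name, field and purpose label in it is a hypothetical choice made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    dataset_id: str
    description: str   # non-sensitive metadata only, no patient data
    holder: str


@dataclass
class ConsentRule:
    dataset_id: str
    allowed_purposes: frozenset
    revoked: bool = False

    def permits(self, purpose: str) -> bool:
        """A request is allowed only while consent stands and covers the purpose."""
        return (not self.revoked) and purpose in self.allowed_purposes


def request_access(catalogue: dict, rules: dict, dataset_id: str, purpose: str) -> str:
    """Decide a request from the recorded rule alone, with no manual step."""
    if dataset_id not in catalogue:
        return "unknown dataset"
    rule = rules.get(dataset_id)
    if rule is None or not rule.permits(purpose):
        return "denied"
    return "granted"


catalogue = {
    "ds-1": CatalogueEntry("ds-1", "Cardiac MRI scans, 2015-2017, pseudonymised", "hospital-A"),
}
rules = {
    "ds-1": ConsentRule("ds-1", frozenset({"academic research"})),
}

print(request_access(catalogue, rules, "ds-1", "academic research"))
print(request_access(catalogue, rules, "ds-1", "marketing"))
```

The catalogue can be browsed by anyone because it carries descriptions only, while the decision to release data depends on the consent rule, which the patient or hospital can revoke at any time.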

The aim of establishing a dedicated WG is:

  • to analyse and compare usages of the blockchain in healthcare, implementations of blockchain architectures, associated legal and socio-economic impacts and perspectives
  • to assess the potential of blockchain-based self-enacting smart contracts in handling consent and data permission systems minimising transaction costs
  • to assess whether and how the blockchain can ensure compliance with advanced data protection requirements (such as those defined by the EU General Data Protection Regulation – GDPR), while doing so seamlessly and efficiently, at scale.

Within 18 months of activity, starting from concrete examples, the group will draw a set of use-cases, thus feeding a working draft and concluding on good practices, technical recommendations, and guidance to healthcare professionals interested in having recourse to blockchain solutions.

  2. Value Proposition

Imagine a place where individuals, research centres, pharma companies, and healthcare professionals can easily search for and mobilise on demand large volumes of data while ensuring at all times patients’ clear consents and the highest standards of privacy protection and security, coping with any hurdle deriving from geographical location, data complexity, or data protection laws.

The blockchain can help to establish a solid technological backbone, supporting healthcare information systems’ resilience, and acting as an operational data protection regulation-compliant infrastructure, where data transactions are informed and controlled by informational self-determination and privacy-by-design/default principles.

The guidelines produced as the WG’s outcome shall benefit all kinds of stakeholders dealing with health data who require full traceability of data usage, especially for research purposes, and who will benefit from transparency and trust, such as: biomedical researchers, clinicians, drug and device trial operators, individual patients wishing to know more about other people sharing similar medical conditions, as well as individuals/patients/citizens willing to make use of trustworthy blockchain-based systems to contribute to data sharing that enhances scientific research and medical knowledge.

  3. Engagement with existing work in the area

The WG on Blockchain applications in Health will be directly associated with the Health Data IG, and it will seek cooperation with all groups interested in applying blockchain to other areas, such as the Ethics and Social Aspects of Data IG or the Working Group for Data Security and Trust (WGDST), as well as any group interested in the future in better understanding the blockchain potential by clustering with the Blockchain applications in Health WG.

 

  4. Work Plan
  • The final deliverable of the WG will be a set of Guidelines for establishing a scalable blockchain-based data sharing system in healthcare. These guidelines will include a state-of-the-art report and a report on regulatory and legal issues, focussing on blockchain applications in health.
  • At six-month intervals, three reports will be presented at the WG’s session at each RDA Plenary, highlighting the analysis and activities performed, in three steps: first, the state-of-the-art report (after 6 months) describing current experiences in blockchain-based handling of health data; second, the report on regulatory and legal issues (after 12 months); third, the comprehensive Guidelines on Blockchain applications in Health (after 18 months), which will also include an example of basic coding for a health-data blockchain architecture.
  • From the start of the WG, its members will be asked to join one or more of the proposed sub-groups
  • Working documents will be made public or accessible to WG/RDA members via open tools such as Google docs.
  • Over a period of one and a half years, the working group will host a general 2-hour telco on a quarterly basis and meet in person every six months at the RDA Plenary Assembly. Smaller groups, dedicated to the above-mentioned reports, will communicate online at least on a monthly basis.
  • The WG members will work individually or in small groups depending on the activities to be performed in relation to the above-mentioned WG outcomes. Activities assigned, and draft outcomes will be discussed, monitored and reviewed during the quarterly telcos and Plenary sessions, and additional TCs will be organized when needed.
  • The WG Guidelines will be reported to the Health Data IG so that HDIG members may share and disseminate them at all relevant events they attend.

 

  5. Initial Membership

Name | Membership | Region/Country
Edwin Morley-Fletcher | Co-Chair | Italy
David Manset | Co-Chair | France
Aggelos Kiayias | Co-Chair | UK
Yannis Ioannidis | | Greece
Leslie McIntosh | | USA
Patrick Ruch | | Switzerland
Anne-Marie Tassé | | Canada
Ludovica Durst | | Italy
Andreas Rauber | | Austria
Laurence Claeys | | Belgium
Mirko De Maldé | | Italy
Aurélie Bayle | | France
Davide Zaccagnini | | USA

 

Download the case statement below. 

Review period start:
Thursday, 25 January, 2018
Custom text:
Body:

** Note - the following text has been deprecated in favor of the attached Version 2 Case Statement document as of 21 May 2018.

 

 

 

Case Statement to Create a Working Group On Capacity Development for Agriculture Data

 

Submitted to Research Data Alliance

 

Case Statement for Capacity Development for Agriculture Data WG

Charter

Introduction and Rationale

 

Introduction

 

This case statement is an effort to highlight the need for adequate capacity development of research data and open data practitioners in agriculture and related sciences, in particular for the existing RDA working groups related to IGAD. It will look broadly at the existing gaps that require urgent attention, so as to provide the right capacity for the roles of all stakeholders within the agriculture and nutrition ecosystem that can support international development goals. It will present diverse use cases and linkages to different sectors of development, and cite resources for more in-depth examination of capacity development issues. It aims to provide in-depth, strategic, actionable insights into the capacity and capabilities necessary for managing today’s large-scale data assets, and opportunities for stakeholders to network; to set methodologies for curriculum development; and to offer new approaches to some of the key techniques and tools for agricultural data management. It will provide practitioners with the capacities needed to contribute to and use the vast open data ecosystem in agriculture and nutrition.

 

1. Rationale

 

The aim of this Working Group (WG) is to develop synergies between existing education and training activities and agricultural science needs by performing a landscape assessment to identify existing gaps and training requirements within the WGs related to the Interest Group on Agricultural Data (IGAD). A particular focus will be on sharing knowledge about training initiatives and technologies and on reducing digital divides so that researchers and practitioners in developing countries can also benefit. It will also strengthen the existing collaboration with GODAN and GODAN Action.

 

 

Since its formation in 2013, the Interest Group on Agricultural Data (IGAD) has grown in community strength to become one of the RDA’s most prominent Thematic Groups. IGAD is a domain-oriented group working on all issues related to global agriculture data. It represents stakeholders in managing data for agricultural research and innovation, including producing, aggregating and consuming data. Beyond this, IGAD promotes good practices in research with regard to data sharing policies, data management plans and data interoperability, and it is a forum for sharing experience and providing visibility to research and work in agricultural data.

 

At RDA9 in Barcelona, more than 100 data experts from more than 35 countries came together for the IGAD pre-meeting, where participants were briefed about ongoing activities of IGAD Working Groups (WGs) and put forward new ideas through proposals for future work. At the pre-meeting there was strong interest in developing a WG for building synergies in Capacity Development, in the context of research data in agriculture and its potential contribution to the realization of the Sustainable Development Goals (SDGs).

 

2. Proposal

 

The Capacity Development Working Group aims to:

 

  1. Perform a landscape assessment to identify existing gaps and training requirements within the context of the IGAD network, including GODAN
    • Using existing needs assessment survey data from GODAN Action
    • Literature review of existing training capacity challenges and opportunities, taking into account language barriers
    • Focused around an identified IGAD research data community
  2. Present a report of the landscape assessment to RDA11 and circulate for expert comment.
  3. Develop a case study around an identified community to identify good practices for capacity development and advocacy.
  4. Build synergies with other initiatives, such as GODAN Action, to increase the coherence and impact of the delivery of training.

 

The working group aims to identify and mitigate the gaps in capacity development as evidenced by the needs assessment of the different stakeholders in domains under agriculture. The WG also aims to map and list the active initiatives, organisations and agencies already involved in capacity development activities, and to build synergies between them and the evidenced needs of the stakeholders. To this end, we will identify possible stakeholders such as content providers, technology solution providers, the user community (practitioners, researchers, policy makers) and resource persons.

 

3. Value proposition

 

There is a recognised need for capacity development in the field of agricultural data management. A number of different organisations are attempting to address this; however, there does not appear to be a coordinated approach across all agricultural domains which can adequately identify training gaps and make recommendations. Language is a further obstacle to identifying and evaluating existing resources. This working group aims to address this gap by acting as a hub for coordinating the development and provision of training to agriculture practitioners, researchers and policy makers.

The outputs of this WG will benefit a wide range of stakeholders, users and communities in the Agriculture and Nutrition domain.

Key impacts

  • Reduce effort and duplication in development of curricula
  • Support capacity development for individuals, communities, and initiatives that will benefit from utilising Agricultural data outputs
  • Advocate for collaborative efforts among agriculture and nutrition open data initiatives through knowledge sharing among capacity development activities
  • Advocate programs, good practices, and lessons learned that enable the use of open data. Promote capacity development and diversity of open data users for more effective accessibility, use, engagement and understanding of open data.

 

4. Work plan

 

The deliverables for the Capacity Development WG will be:

 

  • Bring multidisciplinary inputs about best practices into the curriculum development process for Agriculture data with the objective to expand the societal impact of open agricultural research.
  • Explore ideas for closer collaboration between RDA/IGAD, RDA/Geospatial IG, GODAN Capacity Development and CODATA, and synergies to set up joint education and training activities on Open Data in food and agricultural sciences. We will work with RDA-CODATA Summer Schools, Teaching TDM for Education and skills development initiatives to build synergies. A particular focus will be on sharing knowledge about training programs and platforms for reducing digital divides so that researchers in developing countries can also benefit.

 

The outcomes for the Capacity Development WG will be:

 

  1. Report on the current landscape assessment.
  2. From the landscape assessment, identify and report, with interested stakeholders, existing capacity gaps, including language barriers.
  3. Develop a case study around the identified community to identify good practices for capacity development and advocacy.
  4. Initiate ideas for joint training programs in Agricultural data (both online and hands-on) in collaboration with interested organisations, for example GODAN, as well as other IGs in RDA (Geospatial IG).

 

7. Milestones

 Deliverables and Outcomes

 Engagement

 

IGAD has spearheaded two important capacity development initiatives: two introductory online MOOCs on data management, principally aimed at learners in Latin America, and four Forums on Open Data and Open Science in Agriculture for Africa. The MOOCs offer an introduction to themes of data management using examples from the international sphere, with a particular focus on data management research in fisheries and nutrition. As well as teaching how to define data, the courses look at issues of data storage, sharing and long-term preservation, highlighting the different resources available to researchers in data management. IGAD leads the courses in conjunction with the FAO of the United Nations, Spain’s Polytechnic University of Valencia and the International Food Policy Research Institute (IFPRI).

 

The goal of the forums is to provide a platform for stakeholders in Africa to share knowledge on institutional and national initiatives aimed at enhancing the visibility, accessibility and usability of agricultural data and science, and the potential contribution to the realization of Africa’s sustainable development goals and the SDGs. We aim to reach out to GODAN Capacity Building activities to expand our synergies and work jointly for expanding capacity development in agriculture data globally.

 

Work Plan

A specific and detailed description of how the WG will operate includes:

 

  1. The final deliverables of the Capacity Development in Agriculture Data WG will consist of:
    1. A landscape assessment on the current state of play for agricultural research data with emphasis on a selected case study group
    2. Good practice guidelines for identifying and addressing capacity gaps, derived from the case study (including guidance on advocacy to support adoption)
    3. A capacity building resource hosted on the RDA website

 

  2. Milestones for the WG shall include:

 

  1. November 2017 - Submission of the WG Case Statement to RDA
  2. November 2017 (M1) - Acceptance and endorsement of the WG Case Statement by RDA?
  3. March 2018 (M5) - Identified an Ag data community for the case study and completed an initial landscape assessment to identify the current state of play for the community
  4. March 2018 (M5) - Present a draft report on the current state of play for the case study community to IGAD @ RDA11.
  5. April 2018 (M6) - Circulate report to IGAD community for expert feedback
  6. April 2019 (M18) - Report of good practices and lessons learned
  7. April 2019 (M18) - Handover of developed research data capacity development resource to the wider IGAD/RDA community
  • A description of how the WG plans to develop consensus, address conflicts, stay on track and within scope, and move forward during operation

The WG will hold all discussions through the IGAD mailing lists to ensure openness and inclusivity in the capacity development training activities. This will ensure all viewpoints are discussed in an open and participatory way.

As contributors will be meeting regularly through the GoToMeeting facility and at various conferences, they will be in an ideal position to let the wider agricultural community know about the importance and the status of the work done within the WG.

 

Adoption Plan

An objective of the WG is to work towards agreement among key agriculture capacity building communities (GODAN, IGAD, CODATA, etc.) by April 2019 to expand synergies in training programs, using the guidelines developed as part of the deliverables.

 

Initial Membership

 

  • Muchiri Nyaggah (Local Development Research Institute, Kenya)
  • Karel Chavat (Czech Center for Science and Society, Czech Republic)
  • Antonio Sanchez-Padial (INIA, Spain)
  • Andreas Kamilaris (IRTA, Spain)
  • Elizabeth Zeitler (AAAS Science and Technology Policy Fellow, USA)
  • Ruthie Musker (GODAN, USA)
  • Imma Subirats (FAO of the UN, Italy)
  • Karna Wegner (FAO of the UN, Germany)
  • Patricia Rocha (Embrapa, Brazil)
  • Devika Madalli (Indian Statistical Institute, UK)
  • David Tarrant (ODI, UK)
  • Isaura Ramos Lopes (Netherlands)
  • Michael Ball (Biotechnology and Biological Sciences Research Council (BBSRC), UK)
  • Shaik N. Meera (Rice Knowledge Management Portal, India)
  • Carme Reverté Reverté (IRTA, Spain)
  • Sonigitu Ekpe (Nigeria)
  • Telemachos Koliopoulos (University of Strathclyde, Greece)
  • Salman Siddiqui (IWMI, Sri Lanka)
  • Sophie Fortuno (CIRAD, France)
  • Marie-Claude Deboin (CIRAD, France)
  • Zhang Xuefu (CAAS, China)
  • Richard Ostler (Rothamsted Research, UK)
  • Boniface Akuku (KALRO, Chair of the CODATA Task Group on Agriculture Data, Knowledge for Learning and Innovation, Kenya)

 

Initial leadership in English(tbc):

 

  1. Suchith Anand (GODAN, UK/India) - Coordination
  2. Chipo Msengezi (CTA, The Netherlands)
  3. Karna Wegner (FAO of the UN, Italy)

 

 

Review period start:
Wednesday, 3 January, 2018 to Saturday, 3 February, 2018
Custom text:
Body:

** Please Note - the following text has been deprecated in favor of the revised Charter Statement in the attached document - 7 June 2018 **

 

 

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

The Earth, space, and environmental science communities are developing, through multiple international efforts, both general and domain-specific leading practices for data management, infrastructure development, vocabularies, and common data/digital services. This Interest Group will work towards coordinating and harmonizing these efforts to reduce possible duplication, increase efficiency, share use cases, and promote partnerships and adoption in the community.

 

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

This Interest Group builds on the BoF specific to EarthCube held at the RDA P10 meeting. Results from that session showed strong interest in international collaboration on Earth, space, and environmental infrastructure concerns that are currently being addressed in RDA across all the sciences and external to RDA across many geographically oriented efforts.

Key participating groups and their use cases include:

1. The Earth Science Information Partners (ESIP) is expanding to include communities outside of the United States. Specifically, the next area of growth is Australia. ESIP is an independent forum used to address topics of interest to the Earth science data and technology community, such as data management, data citation and documentation. The work of the ESIP community is advanced through its collaboration areas, where participants contribute their expertise toward resolving common problems of the Earth science data and technology community.

2. The Australian AuScope program provides research infrastructure to the Earth and Geospatial Science research communities with a focus on data discovery, delivery and interoperability with an increasing focus on FAIR data principles.  AuScope has been developing related data and interoperability platforms for the last decade and is currently focusing on increased international collaboration.  AuScope is Federally funded by the National Collaborative Research Infrastructure (NCRIS) program.

3. The European Union funded European Plate Observing System (EPOS) is in the midst of its implementation phase, with increasing interest in being interoperable with its international counterparts. EPOS is a long-term plan to facilitate integrated use of data, data products, and facilities from distributed research infrastructures for solid Earth science in Europe.

4. The American Geophysical Union (AGU) is leading an international community-driven effort, Enabling FAIR Data, to adopt existing and develop new leading practices that result in data supporting a publication being preserved in an appropriate repository, with proper data citations placed in the references. Data will no longer be in the supplement of the paper. This effort seeks to coordinate across similar and supporting efforts. Partners include RDA, ESIP, ANDS, NCI, AuScope, Nature, Science, PNAS, and the Center for Open Science.

5. The Environmental Data Initiative (EDI) is an NSF-funded project meant to accelerate curation and archiving of environmental data, emphasizing data from projects funded by the NSF DEB. Programs served include Long Term Research in Environmental Biology (LTREB), Organization for Biological Field Stations (OBFS), Macrosystems Biology (MSB), and Long Term Ecological Research (LTER).

6. The U.S. National Science Foundation’s EarthCube effort is moving into its implementation phase and reaching out internationally to be informed and to coordinate efforts. EarthCube, initiated by the National Science Foundation (NSF) in 2011, transforms geoscience research by developing cyberinfrastructure to improve access, sharing, visualization, and analysis of all forms of geosciences data and related resources.

7. The Open Geospatial Consortium has a series of Domain Working Groups including Earth Systems, Geoscience, Hydrology and Marine. These Domain Working Groups provide a forum for discussion of key interoperability requirements and issues, discussion and review of implementation specifications, and presentations on key technology areas relevant to solving geospatial interoperability issues. However, their supporters come mainly from the government and industry sectors, and it is hoped that linking through this interest group will bring greater connectivity to equivalent activities in the academic/research sector.

 

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):

This Interest Group will focus on awareness and, where applicable, coordination of dependent efforts across the international Earth, space, and environmental science communities.

 

This group is different from other current activities as it is focused on the scientific domains specific to the Earth, space, and environmental sciences.  Some overlap potentially exists with interdisciplinary work in the biological community and social science community.

 

Within RDA, we will leverage the existing working groups and interest groups to help integrate and coordinate the needs of the Earth, space, and environmental sciences with the objectives and deliverables across those related efforts. 

 

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups.):

 

External (to RDA) Communities Include:

EarthCube

Council for Data Facilities (part of EarthCube, specific to digital repositories)

iSamples

AuScope

European Plate Observing System (EPOS)

OGC

Enabling FAIR Project (AGU)

Earth Science Information Partners (co-manage the group)

ICSU CODATA (e.g. Science Unions)

IGSN e.V. (International Geo Sample Number)

Environmental Data Initiative

 

RDA Communities:

RDA Working Groups:

Data Citation WG

Data Usage Metrics WG

DMP Common Standards WG

Metadata Standards Catalog WG

Persistent Identification of Instruments 

RDA/WDS Publishing Data Workflows WG

RDA/WDS Scholarly Link Exchange (Scholix) WG

Research Data Repository Interoperability WG

 

RDA Interest Groups:

Active Data Management Plans IG

Big Data IG

Chemistry Research Data IG

Data Discovery Paradigms IG

Data Foundation and Terminology IG

Data Policy Standardization and Implementation

Disciplinary Collaboration Framework 

Domain Repositories Interest Group

From Observational Data to Information

Geospatial IG

Global Water Information IG

Long tail of research data IG

Mapping the Landscape

Marine Data Harmonization

Physical Samples and Collections in the Research Data Ecosystem

PID IG

Quality of Urban Life IG

RDA/CODATA Legal Interoperability IG

RDA/WDS Certification of Digital Repositories

RDA/WDS Publishing Data IG

Reproducibility IG

Software Source Code IG

Virtual Research Environment IG

Vocabulary Services IG

Weather, Climate and Air Quality IG

 

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

 

Improved awareness, sharing, leveraging synergies, and coordination on projects that have dependencies, overlap, or multiple stakeholder communities.

 

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

 

The group will meet quarterly, with two of the yearly meetings held at the RDA Plenaries. Between plenaries, the IG will maintain momentum with face-to-face meetings at domain events: AGU, EGU, ESIP Winter, ESIP Summer, and EarthCube All Hands. Virtual task force and planning meetings will be held as objectives require.

 

Timeline (Describe draft milestones and goals for the first 12 months):

Milestones and Goals

1.  Communicate with Earth, space, and environmental science communities not yet part of the Interest Group; invite them to join.

2. Establish mechanism to share project information and assess any dependencies, overlaps, and gaps. This needs to include allowing multiple efforts that are similar to proceed.

3. Co-chairs will reach out to RDA Working Groups and Interest Groups to determine potential collaboration.

 

Potential Group Members (Include proposed chairs/initial leadership and all members who have expressed interest):

 

  • Erin Robinson (Proposed Chair)
  • Lesley Wyborn (Proposed Chair)
  • Danie Kinkade (Proposed Chair)
  • Shelley Stall (Proposed Chair)
  • Helen Glaves (Proposed Chair)
  • Mohan Ramamurthy
  • Simon Cox
  • Brooks Hanson
  • Lynn Yarmey
  • Kerstin Lehnert
  • Simon Hodson
  • Tim Rawling
  • Roger Proctor
  • Siddeswara Guru
  • Kirsten Elger
  • Sylvain Grellet
  • Scott Simmons
  • Greg Buehler
  • George Percivall
  • Denise McKenzie
  • Ben Evans
  • Andrew Treloar
  • Corrina Gries
  • Ari Asmi
  • Jens Klump
  • Mark Parsons
  • Martina Stockhause
  • Steve Diggs
  • Tobias Weigel
  • Jingbo Wang
  • Clare Richards
  • Ted Haberman
  • Jane Wynngaard
  • Ruth Duerr
  • Denise Hills
  • Sophie Hou


 

 

 

Review period start:
Thursday, 28 December, 2017 to Sunday, 28 January, 2018
Custom text:
Body:

Revised version (26/6/2018)

 

Introduction

The Research Data Architectures in Research Institutions Interest Group is primarily concerned with technical architectures for managing research data within universities and other multi-disciplinary research institutions. It provides insight into the approaches being taken to the development and operation of such architectures and their success or otherwise in enabling good practice.

 

An institution’s research data management infrastructure consists of more than just a data repository and discovery mechanisms. It includes the underlying storage technologies, the networking, hardware, system interfaces, authentication mechanisms, data brokers, monitoring platforms, semantic interoperability tools, long-term preservation services, high-performance and high-throughput computing facilities, data science platforms, and potentially many other technologies that process data and control the flows of data and metadata between systems.

 

Seamless data interoperability and movement between different systems both local and in national or disciplinary services is a particular challenge at present, given the need to provide researchers with a smooth and efficient user experience – a key requirement for any research data service. Governance and policies, project management environments, and communications platforms are also vital elements in shaping and informing IT architectures, as is the management of business information associated with research.

 

This IG seeks to understand the various architectures used by institutions globally, identify pain points within those architectures, and learn from those who have overcome or avoided those pain points.
The general approach of this IG is to encourage discussion about architectures and enable interested parties to collaborate and learn from one another. Many institutions are at present planning and working towards overarching data management architectures, and there is a legitimate concern that, without a forum such as this IG, institutions will relive the same experiences and repeat the same mistakes as their peers.

 

User scenario(s) or use case(s) the IG wishes to address

The Research Data Architectures Interest Group addresses the following user scenarios:

  • Knowledge support for research data infrastructure architects, and project and service managers
  • Knowledge sharing regarding enterprise architecture best practices at multi-disciplinary research institutions
  • Development of a greater understanding of objectives and goals between different stakeholders at research institutions, including management, infrastructure developers, research data engineers, IT systems architects, and technology vendors.

 

Objectives

The main themes of the IG are

  • Exploring how diverse tools, technologies, and services can be integrated to meet the evolving needs of researchers in research institutions.
  • Considering interoperability between institutional research data infrastructures and (inter)national or discipline-based infrastructures
  • Understanding the different institutional approaches to governance structures and business processes in responding to research ICT demands (e.g. capacity planning/forecasting for storage)
  • Sharing case studies of solutions developed by data infrastructure projects in research institutions
  • Presenting technical innovations and ideas that can further the development of integrated research data infrastructures
  • Agreeing best practice relating to research data architectures in research institutions

This group has connections with some other RDA groups, such as:

  • National Data Services IG – research institutions frequently support, use or deliver services and solutions provided by national data service vendors or operators, and these therefore form part of an institution’s architecture
  • Repository Platforms for Research Data IG and Research Data Repository Interoperability WG – repositories are an essential component of research data architectures, but need to be considered as one element of a larger infrastructure
  • Data Fabric IG – technology and interoperability solutions for research institutions are an essential part of the bigger picture
  • Storage Service Definitions WG (formerly “WG QoS-DataLC Definitions”) – storage definitions, vocabularies etc. are important for achieving mutual understanding of research data architectures

The IG differs from these RDA IGs and WGs in the following areas:

  • The scope of the group is institutional-level solutions and architectures rather than national or discipline-specific architectures, although it will be important to consider the relationships between these different levels
  • The Group will discuss technological solutions at research institutions at the enterprise architecture level, rather than focusing on a single element in isolation

 

Participation

The target audience of this IG includes anyone involved in research data infrastructure planning projects or services, as well as researchers with an interest in systems, technologies and data flows at the institutional level. This includes research data infrastructure project managers, ICT architects, senior managers with responsibility for research IT services, developers, data managers, data engineers, and data scientists. Representatives of service or technology vendors and the data industry are also welcome.

 

Outcomes

  • Outcomes of the IG:
    • Shared knowledge about research data tools, infrastructures and architectures
    • Knowledge base for best practices and a lessons-learned log
  • More specific outputs will be determined by the participants in this IG, but might conceivably include:
    • Shared repository of architectural diagrams
    • List of technologies used for specific purposes, and their interfaces
    • Landscape report and gap analysis

 

Mechanism

The Interest Group will hold regular meetings at RDA plenaries (if accepted in the schedule). Between plenaries, the IG will collect case studies and examples of good practice and add these to the open knowledge base. The group will hold one or two web meetings between plenaries (possibly more, if needed).

 

Timeline

P10: BoF and start of the review process
P11: Starting the IG; Discussion about tools, best practices, use cases and knowledge base
P12: Templates for the best practices reports
P13: First collection of the knowledge base

 

Review period start:
Thursday, 28 December, 2017 to Sunday, 28 January, 2018
Custom text:
