
Review period start:
Tuesday, 4 July, 2017 to Friday, 4 August, 2017

Case Statement for RDA WG DMP Common Standards

 It can also be found as PDF here: https://drive.google.com/open?id=0BwdfVsSKpOzveGJRSWxtaTBWbzA 

Contents

  • WG Charter

  • Value Proposition

  • Engagement with existing work in the area

  • Work Plan

  • Adoption Plan

  • Initial Membership

 

WG Charter: A concise articulation of what issues the WG will address within an 18 month time frame and what its “deliverables” or outcomes will be.

The need for establishing this working group was articulated during the 9th plenary meeting in Barcelona during the Active DMPs IG session. The discussion was framed by a white paper by Simms et al. on machine-actionable data management plans (DMPs). The white paper is based on outputs from the IDCC workshop held in Edinburgh in 2017, which gathered almost 50 participants from Africa, America, Australia, and Europe. It describes eight community use cases that articulate a consensus on the need for a common standard for machine-actionable DMPs, where machine-actionable is defined as “information that is structured in a consistent way so that machines, or computers, can be programmed against the structure”.

The specific focus of this working group is on developing a common information model and specifying access mechanisms that make DMPs machine-actionable. The outputs of this working group will help to make systems interoperable and will allow for automatic exchange, integration, and validation of information provided in DMPs, for example, by checking whether a provided PID links to an existing dataset, whether file hashes match their provenance traces, or whether a license was specified. The common information models are NOT intended to be prescriptive templates or questionnaires, but to provide re-usable ways of representing machine-actionable information on themes covered by DMPs.

The vision that this working group will work to realise is one where DMPs are developed and maintained in such a way that they are fully integrated into the systems and workflows of the wider research data management environment. To achieve this vision we will develop a common data model with a core set of elements. Its modular design will allow customisations and extensions using existing standards and vocabularies, following best practices developed in various research communities. We will provide reference implementations of the data model in popular formats such as JSON, XML, and RDF. This will enable tools and systems involved in processing research data to read and write information to/from DMPs. For example, a workflow engine can add provenance information to the DMP, a file format characterization tool can supplement it with identified file formats, and a repository system can automatically pick suitable content types for submission and later automatically identify applicable preservation strategies.
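As a sketch of the kind of integration this enables, the snippet below shows how a file format characterization tool might write its result back into a DMP expressed in JSON. The common data model does not exist yet, so every field name here ("dataset", "distribution", "path", "format") is an illustrative assumption rather than part of any agreed standard.

```python
import json

# Illustrative machine-actionable DMP fragment; all field names are
# hypothetical placeholders, since the common model is still to be developed.
dmp = json.loads("""
{
  "title": "DMP for project X",
  "dataset": [
    {"title": "sensor readings",
     "distribution": [{"path": "data/readings.csv"}]}
  ]
}
""")

def add_detected_format(dmp, path, media_type):
    """Mimic a file format characterization tool supplementing the DMP
    with the media type it identified for a given file."""
    for dataset in dmp.get("dataset", []):
        for dist in dataset.get("distribution", []):
            if dist.get("path") == path:
                dist["format"] = media_type

add_detected_format(dmp, "data/readings.csv", "text/csv")
```

A workflow engine could append provenance entries to the same structure in an analogous way; the point is that every tool reads and writes one shared representation instead of a free-text document.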

The deliverables will be publicly available under a CC0 license and will consist of models, software, and documentation. The documentation will describe the functionality and semantics of the terms used, the rationale behind them, standard-compliant ways of customisation, and the requirements supporting systems must meet to fully utilise the capabilities of the developed model.

The working group will be open to everyone and will involve stakeholders representing the whole spectrum of entities involved in research data management: researchers, tool providers, infrastructure operators, repository staff and managers, software developers, funders, policy makers, and research facilitators. We will take into account the requirements of each group. This will likely speed up and increase adoption of the working group outcomes.

The group will predominantly collaborate online, but will use every opportunity to meet in person during RDA plenaries, conferences, workshops, hackathons, or other events in which its members participate. All meetings in which decisions are made will be documented and their summaries will be circulated via the RDA website.

The work will be performed iteratively and incrementally, following best practices from systems and software engineering. We will evaluate preliminary drafts of the model with the community to receive early feedback and to ensure that the developed common model is interoperable and exchangeable across implementations. We will also express existing DMPs using the developed common model and will investigate how to support modification of machine-actionable DMPs by the various tools involved in the data management process, while ensuring that proper provenance and versioning information is stored with them. Finally, we will build prototypes to investigate possible system integrations and to evaluate to what degree the information contained in DMPs can be automatically validated and which actions or alerts can be triggered depending on a DMP’s state, e.g. by sending notifications to repositories or funder systems.

During our work we will monitor parallel efforts and engage with various research communities to find candidates for pilot studies and to transfer the acquired know-how. Towards the end of the working group’s lifetime we will launch pilot projects in which the model will be customised to suit the needs of the interested communities identified along the way. Pilot studies will use the models to integrate systems and demonstrate how machine-actionable DMPs can work.

We believe that the outcomes delivered by this group will contribute to improving the quality of research data and research reproducibility, while at the same time reducing the administrative burden for researchers and systems administrators.

 

Value Proposition: A specific description of who will benefit from the adoption or implementation of the WG outcomes and what tangible impacts should result.

A common data model for machine-actionable DMPs will enable interoperability of systems and will facilitate automation of data collection and validation processes. The common model and the accompanying interfaces and libraries are an essential building block for this infrastructure. For some stakeholder groups the developments will be (and should be) invisible, but the unification and standardisation of a DMP model will bring benefits to all of them:

  • Researchers will benefit from having fewer administrative procedures to follow. Machine-actionable DMPs can facilitate the automatic collection of metadata about experiments. They will accompany experiments from the beginning and will be updated over the course of the project. Consecutive tools used during processing can read data from and write data to machine-actionable DMPs. As a result, parts of the DMPs can be automatically generated and shared with other collaborators or funders. Furthermore, researchers whose data is reused in other experiments will gain recognition and credit because their data can be located, reused, and cited more easily.

  • Reusing parties will gain trust and confidence that they can build on others’ previous work because of a higher granularity of available information.

  • Funders and repositories will be able to automatically validate DMPs. For example, they will be able to check whether the specified ORCID iD or e-mail address is correct, whether the data is available at the specified repository, and whether the data checksums are correct – in other words, whether the information provided in a DMP reflects reality.
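To illustrate, the sketch below shows two such checks in Python: ORCID’s published check-digit algorithm (ISO 7064 mod 11-2) and a checksum comparison. It is a minimal illustration of the idea, not a complete validation service.

```python
import hashlib

def orcid_check_digit_ok(orcid):
    """Verify the trailing check digit of an ORCID iD using the
    ISO 7064 mod 11-2 algorithm published by ORCID."""
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected

def checksum_ok(data, expected_sha256):
    """Check that deposited data still matches the checksum recorded in a DMP."""
    return hashlib.sha256(data).hexdigest() == expected_sha256

# 0000-0002-1825-0097 is the example iD used in ORCID's own documentation.
assert orcid_check_digit_ok("0000-0002-1825-0097")
```

Checks like these are what turn a DMP from a narrative promise into a statement that systems can verify against reality.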

  • Infrastructure providers will get a universal format for the exchange of (meta-)data between the systems involved in data processing and data storage. They could also automate processes associated with DMPs, such as backup, storage provisioning, and granting access permissions.

  • Society will be better able to safeguard investment made in research and will gain assurance that scientific findings are trustworthy and reproducible, while the underlying data is available and properly preserved.  

 

Engagement with existing work in the area: A brief review of related work and plan for engagement with any other activities in the area.

The need for machine-actionable DMPs is recognized by the community and is being discussed within the Research Data Alliance. Participants of the CERN workshop organized in 2016 identified “encodings for exporting DMPs” as one of the next developments needed [1]. Automation and machine actionability are meant to be key factors enabling deployment of the European Open Science Cloud. A workshop on machine-actionable DMPs organized by the Digital Curation Centre and the University of California Curation Center at the California Digital Library at IDCC in Edinburgh in 2017 resulted in a white paper that describes the current state of the art and expresses the need for a common standard for machine-actionable DMPs.

As a result of these ongoing discussions the participants of the 9th plenary meeting in Barcelona during the Active DMPs IG session decided to establish specific working groups that address various identified challenges related to DMPs. The proposed group on DMP common standards will address a high-priority challenge based on the most recent assessments of community needs.

Members of the proposed group are well connected to various community-based initiatives and working groups that address similar topics. The group will monitor and align the efforts with others in this area. We will specifically monitor:

  • RDA groups related to DMPs, such as, but not limited to:

    • Active DMPs IG,

    • Research Data Repository Interoperability WG,

    • Reproducibility IG,

    • e-Infrastructure IG,

    • RDA/WDS Certification of Digital Repositories IG,

    • BioSharing Registry: connecting data policies, standards & databases in life sciences WG,

    • Exposing DMPs WG (under review),

  • tools, such as, but not limited to:

    • DMPTool,

    • DMPonline,

    • RDM Organiser,

  • DMP fora, e.g. Force11 FAIR DMP or the Belmont Forum e-Infrastructures and Data Management Collaborative Research Action,

  • e-Infrastructure projects, e.g. OpenAIRE, EUDAT, and the European Open Science Cloud (EOSC),

  • W3C,

  • and others.

 

Work Plan: A specific and detailed description of how the WG will operate including:

  • The form and description of final deliverables of the WG,

D1. Common data model for machine-actionable DMPs
This deliverable will contain the developed data model and documentation describing semantics of terms used, rationale, and standard compliant ways for customisation of the model.

D2. Reference implementations

The reference implementations of the common data model will provide ready-to-use models in popular standards such as JSON, XML, and RDF. They will also provide example DMPs in each format.
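As a sketch of what parallel expressions might look like, the snippet below serializes the same minimal set of core elements to JSON and XML using only the Python standard library. The element names and the example PID are placeholders, since the actual element set is itself a deliverable of this group.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical core elements; the real element set is a deliverable of the
# working group, and "doi:10.1234/abcd" is a placeholder identifier.
dmp = {
    "title": "Example DMP",
    "license": "CC0-1.0",
    "datasets": ["doi:10.1234/abcd"],
}

# JSON expression of the model instance
json_text = json.dumps(dmp, indent=2)

# Equivalent XML expression of the same information
root = ET.Element("dmp")
ET.SubElement(root, "title").text = dmp["title"]
ET.SubElement(root, "license").text = dmp["license"]
for pid in dmp["datasets"]:
    ET.SubElement(root, "dataset").text = pid
xml_text = ET.tostring(root, encoding="unicode")
```

An RDF expression would follow the same pattern, mapping the core elements onto an agreed vocabulary.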

D3. Guidelines for adoption of the common data model

Guidelines will be based on lessons learned from the common model development and prototyping. They will describe requirements for supporting systems to fully utilise the capabilities of the common data model.

  • The form and description of milestones and intermediate documents, code or other deliverables that will be developed during the course of the WG’s work,

 

M1. Requirements and candidate solutions reviewed (M5)

We will analyse existing DMP tools, as well as tools from the domains of digital preservation, reproducible research, open science, and data repositories that cover the full data lifecycle. We will look for mappings to popular DMP creation tools such as checklists, discuss lessons learned, and identify the limits of automation and machine actionability. We will also investigate modelling techniques used in the model engineering and linked data domains to identify suitable notations and tools for the common model. Furthermore, we will identify and analyse existing domain-specific standards and evaluate their applicability. Based on this research, we will define requirements for the common model and identify domain-specific models and controlled vocabularies that need to interoperate with the common data model.

 

M2. Common model specification drafted (M10)

We will design a common data model and example expressions in mainstream representation formats (e.g. JSON). The development will be iterative and based on both real and synthetic examples of DMPs. We will develop prototypes to demonstrate how the model works and what its capabilities are.

 

M3. Common model refined (M15)

We will develop further extensions to the core model (the model will likely be modular) to evaluate its scalability and customisability. Furthermore, we will test integrations with existing tools and continue evaluation using sample DMPs. Based on these activities we will introduce necessary refinements to the common data model.

 

M4. Dissemination and pilot studies (M18)

We will formulate guidelines for the adoption of the common model and release final documentation of the developed model and reference implementations. We will disseminate the results of our work through mailing lists, participation in conferences, as well as social media. We will launch pilot studies that implement the working group outcomes. We will facilitate and encourage crowd-sourced descriptions of implementations beyond the direct activities of the working group.

 

  • A description of the WG’s mode and frequency of operation (e.g. on-line and/or on-site, how frequently will the group meet, etc.),

The group will predominantly collaborate online, but will use every opportunity to meet in person during RDA plenaries, conferences, workshops, hackathons, or other events in which its members participate. All meetings in which decisions are made will be documented and their summaries will be circulated via the RDA website.

The group will have regular monthly calls to report on progress and discuss open issues. We will also use GitHub to host developed models and source code. We will use issue tracking mechanisms to discuss enhancements, bugs, and other issues. Important updates, such as reaching a milestone, will be communicated through the RDA website.

  • A description of how the WG plans to develop consensus, address conflicts, stay on track and within scope, and move forward during operation, and

Group consensus will be achieved primarily through mailing list discussions, where opposing views will be openly discussed and debated amongst members of the group. If consensus cannot be achieved in this manner, the group co-chairs will make the final decision on how to proceed.

The co-chairs will keep the working group on track by setting milestones and reviewing progress relative to these targets. Similarly, scope will be maintained by tying milestones to specific dates, and ensuring that group work does not fall outside the bounds of the milestones or the scope of the working group.

 

  • A description of the WG’s planned approach to broader community engagement and participation.

The working group case statement will be disseminated to mailing lists in communities of practice related to research data and repositories (e.g. ICSU World Data System) in an effort to cast a wide net and attract a diverse, multi-disciplinary membership. Group activities, where appropriate, will also be published to related mailing lists and online forums to encourage broad community participation.

 

Adoption Plan: A specific plan for adoption or implementation of the WG outcomes within the organizations and institutions represented by WG members, as well as plans for adoption more broadly within the community. Such adoption or implementation should start within the 18 month timeframe before the WG is complete.

 

Representatives of various stakeholder groups who are prominent in the area of DMPs have already joined this working group, including:

  • DMPRoadmap

  • DMPonline / Digital Curation Centre

  • DMPTool / California Digital Library

  • ELIXIR data stewardship wizard

  • RDM Organiser

  • Islandora

  • Phaidra

  • Open Science Framework

  • Data Intensive Research Initiative of South Africa (DIRISA)

  • Belmont Forum e-Infrastructure and Data Management

  • DSA-WDS Core Trustworthy Data Repositories  

  • DMP OPIDoR

  • INESC-ID

  • INIA - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria

  • EDINA

  • Data Archiving and Networked Services (DANS)

These representatives have agreed to consider implementing the standards recommended by the working group in their respective tools. Some of them have already committed to active participation in the group and plan to adopt the outputs. We will continue to seek representatives from a variety of research communities to ensure that this working group’s deliverables are widely adopted.

 

Initial Membership: A specific list of initial members of the WG and a description of initial leadership of the WG.

Leadership:

  • Chair: Tomasz Miksa (SBA Research, Austria)

  • Co-chair: Paul Walk (University of Edinburgh, Great Britain)

  • Co-chair: Peter Neish (University of Melbourne, Australia)

Members/Interested (based on 9th Plenary volunteer list and subsequent calls):

  • Adil Hasan

  • Amir Aryani

  • Andreas Rauber

  • Andrew Janke

  • Anna Dabrowski

  • Antonio Sánchez-Padial

  • Christoph Becker

  • Cristina Ribeiro

  • Daniel Mietchen

  • Dessi Kirilova

  • Fernando Aguilar

  • Heike Görzig

  • Janez Štebe

  • Jens Ludwig

  • Jérôme Perez

  • Joao Aguiar Castro

  • João Cardoso

  • Jonathan Petters

  • Karsten Kryger Hansen

  • Lesley Wyborn

  • Madison Langseth

  • Marie-Christine Jacquemot-Perbal

  • Mark Leggott

  • Mustapha Mokrane

  • Myriam Mertens

  • Natalie Meyers

  • Nobubele Shozi

  • Paolo Budroni

  • Peter Doorn

  • Peter McQuilton

  • Peter Neish

  • Raman Ganguly

  • Rob Hooft

  • Sarah Jones

  • Stephanie Simms

  • Terry Longstreth

  • Timea Biro

  • Wim Hugo

     


[1] CERN workshop on Active DMPs: indico.cern.ch/event/520120/attachments/1302179/2036378/CERN-ADMP-iPRES206.pdf

 


Interoperability is a wide concept that encompasses the ability of organisations to work together towards mutually beneficial and commonly agreed goals. The working group uses the following definition from the European Interoperability Framework (EIF): ‘An interoperability framework is an agreed approach to interoperability for organisations that wish to work together towards the joint delivery of public services. Within its scope of applicability, it specifies a set of common elements such as vocabulary, concepts, principles, policies, guidelines, recommendations, standards, specifications and practices.’

The working group aims to provide a common framework for describing, representing, linking and publishing wheat data with respect to open standards. Such a framework will promote and sustain wheat data sharing, reusability and interoperability. Specifying the wheat linked data framework raises many questions: which (minimal) metadata should describe which type of data? Which vocabularies, ontologies and formats? Which good practices?

Based mainly on the needs of the Wheat Initiative Information System (WheatIS) in terms of functionalities and data types, the working group will identify relevant use cases in order to produce a “cookbook” on how to produce wheat data that are easily shareable, reusable and interoperable.

For more details, download the case statement.

Review period start:
Tuesday, 25 June, 2013

Draft Case Statement to Create a Working Group Entitled “On-Farm Data Sharing”

 

 

Submitted to

Research Data Alliance

 

Submitted by

Tom Morris, PhD

Professor

Department of Plant Science and Landscape Architecture

University of Connecticut, Storrs, Connecticut, USA

thomas.morris@uconn.edu

 

Nicolas Tremblay, PhD

Research Scientist

Saint-Jean-sur-Richelieu Research and Development Centre

Agriculture and Agri-Food Canada

Saint-Jean-sur-Richelieu, Quebec, Canada J3B 3E6

nicolas.tremblay@agr.gc.ca

 

 

 

25 May 2017

 

 

Case Statement for On-Farm Data Sharing WG

 

Charter

 

Introduction and Rationale

 

Farmers have capabilities that they have never had before to critically evaluate management practices using field-scale replicated strip trials. They have gained this powerful capability because yield monitors on combines enable accurate measurement of yields. Networks of farmers have been established around the world to exploit the potential of yield monitors to evaluate management practices at the field level. Such networks have become increasingly common because farmers understand the power of evaluating management practices on their fields and across many fields in a similar agroecosystem. Scientists, and then policy makers, can also find value in data coming from a diversity of agroecosystems, as previously unknown G × E × M (Genetics × Environment × Management) interactions (Hatfield and Walthall, 2015) could be derived from contrasting soil and climatic conditions, genotype evaluations, and farming practices.

 

Collection of results from strip trials across many farmers’ fields requires protocols for data stewardship, that is, for data reporting, sharing and archiving. Most farmer networks have developed data stewardship protocols. These protocols, however, vary from network to network and are not easily accessible to people outside the networks. Creation of a standardized, publicly available set of protocols for data stewardship, especially for confidentiality and sharing of data, would enable the pooling of results from many networks into one secure database. The protocols would be specific to on-farm research performed at a field scale with yields measured by yield monitors. Protocols developed for more general data collection by farmers, such as the Thirteen Principles on Data Privacy and Security from the American Farm Bureau Federation and those developed by the Agricultural Data Coalition, will underpin these specific protocols. One big difference is that our protocols for on-farm research will include minimum data requirements, which other data stewardship protocols do not.

 

Questions to address in the protocols include life cycle, data quality, data infrastructure, formats, standards, protocols, archives, the FAIR principles (Wilkinson et al., 2016), availability, provenance, stewardship, privacy, property rights, laws, confidentiality and governance. A standardized set of protocols would also promote the formation of new farmer networks and the collection of many more results from on-farm trials, which would greatly increase the value of a secure database. A secure database open to researchers from around the world, based on the guidelines to be created by this WG, would be an enormously valuable resource for farmers, farm advisors and policy makers.

 

As a first step, we aim to combine the results of thousands of field-scale replicated trials completed across a diversity of agroecosystems in the US Corn Belt. The Corn Belt covers much of the 65 million hectares of maize and soybeans planted in the US. This vast dataset would make possible new and previously unavailable analyses to improve productivity, profitability and environmental stewardship. One example of the type of research that could be completed with such a database comes from a proposal submitted to the United States Department of Agriculture (USDA) by three farmer networks in the US. The three networks are seeking to develop an interactive, online tool for improved management of nitrogen (N) across the numerous agroecosystems in the Corn Belt of the US. The tool will provide information that farmers need to create locally adapted N recommendations. The tool will have four main components: 1) risk assessment of late-season deficient and excessive maize N status; 2) uncertainty of yield response in individual trials; 3) probability of an economic yield response for different levels of N fertilization, different N timings and fertilizer sources, different cropping systems, and observed rainfall and soil characteristics within fields based on aggregate data; and 4) statistical power analysis to estimate the number of locations and treatment replications needed to detect a specific yield response of interest to a farmer or agronomist.
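The fourth component, statistical power analysis, can be sketched with a standard two-sample normal approximation. This is only an illustration of the concept, not the networks’ actual method, and the example numbers are made up.

```python
from math import ceil
from statistics import NormalDist

def replications_needed(sigma, delta, alpha=0.05, power=0.8):
    """Approximate replications per treatment needed to detect a yield
    difference `delta` given residual standard deviation `sigma` (same
    units, e.g. Mg/ha), using the two-sample normal approximation:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * (sigma / delta)^2."""
    z = NormalDist()
    n = 2 * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 * (sigma / delta) ** 2
    return ceil(n)

# E.g., detecting a 0.5 Mg/ha response when strip-to-strip variability is
# 1.0 Mg/ha requires roughly 63 replicated strips per treatment:
print(replications_needed(1.0, 0.5))
```

Smaller responses of interest or noisier fields drive the required number of strips up quickly, which is exactly why pooling trials across networks is so valuable.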

 

The online tool will be based on the analysis of archived information from two types of data collected by three farmer networks: 1) 5,420 systematic surveys of the N status of maize fields from Ohio, Indiana, and Iowa across 13 years, and 2) 812 field-scale, replicated N rate on-farm trials in maize fields from Iowa, Ohio, Indiana, Illinois, Michigan, and South Dakota across 12 years.

 

If the proposal to USDA is funded, this will be the first time that data from different farmer networks have been integrated. These data are only a small part of the data that reside in the individual databases of farmer networks in the US. Only three of the six networks in the US were cooperators on this USDA proposal. The other networks were hesitant to contribute their data for several reasons, but the main reason was the lack of guidelines about who would have access to the data, for how long, and for what purposes. The data available in these farmer networks are not only the results of N rate trials but also include results from trials on fungicide effectiveness, plant population studies, the effectiveness of N stabilizers, effects of tillage on yield, and many other topics.

 

One huge advantage of analyzing results from such a large database of fields over many years is that results can be displayed as probabilities. Typical N recommendations for grain crops are made with little to no estimate of the variability in N needs across fields. Because the variability in N needs across fields and years has been shown to be large (Dhital and Raun, 2016), current N recommendations are much less reliable than needed for widespread adoption by farmers.

 

Kyveryga et al. (2013) provide an example of how results from large numbers of trials can be used to estimate the probability that a maize field will have deficient or excess N, and to identify some of the factors affecting N status at the field scale. Their data set contained 56 field-scale, replicated, two-treatment studies over 2 years in the state of Iowa in which the N fertilizer rate was decreased by 56 kg ha⁻¹ compared with the rate normally applied by the farmers who participated in the trials. The intent of the study was to help farmers decide whether they could profitably reduce their N rates by 56 kg ha⁻¹. The results showed that the probability of increased profit with reduced N fertilizer was reduced by 35% when high amounts of rainfall occurred in June, but was increased by 20% when soil organic matter was high.
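The general idea of expressing trial results as conditional probabilities can be sketched very simply. This is not the hierarchical model of Kyveryga et al. (2013); it merely estimates the probability of a profitable response as the fraction of trials with a positive net return, split by an environmental covariate (June rainfall). All trial records below are hypothetical.

```python
# Hypothetical trial records: (net_return_usd_per_ha, june_rainfall_mm)
trials = [
    (12.0, 80), (-5.0, 190), (7.5, 95), (-20.0, 210),
    (15.0, 70), (3.0, 120), (-8.0, 200), (9.0, 110),
]


def prob_profitable(trials, wet_threshold_mm=150):
    """Estimate P(profitable response) separately for trials with a
    wet June (rainfall >= threshold) and a dry June, as the fraction
    of trials whose net return was positive."""
    wet = [r for r, rain in trials if rain >= wet_threshold_mm]
    dry = [r for r, rain in trials if rain < wet_threshold_mm]
    p_wet = sum(r > 0 for r in wet) / len(wet)
    p_dry = sum(r > 0 for r in dry) / len(dry)
    return p_dry, p_wet


p_dry, p_wet = prob_profitable(trials)
print(f"P(profit | dry June) = {p_dry:.2f}, P(profit | wet June) = {p_wet:.2f}")
```

With thousands of real trials, the same conditioning can be extended to soil organic matter, N timing, and other covariates, which is precisely what a large shared database would make possible.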

 

Another important advantage of combining data from farmer networks is that meta-analysis techniques are not needed, because the data are the raw results from individual strips in the trials. Results from individual strips are preferable to aggregate data because more comprehensive analyses can be performed to fully understand the treatment effect (Jones et al., 2009). Combined data are also of much greater value to other scientists, such as economists, who analyze data using different techniques and hypotheses than agronomists. Finally, field-scale trials allow measurement of the effect of spatial variability within fields on yield and profit, which small research-plot studies cannot measure. Because management practices by farmers are greatly affected by spatial variation of soil properties (including topography) within fields, field-scale trials are needed to measure these effects.

 

Deliverables and Outcomes

 

The deliverables for the On-Farm Data Sharing WG will be:

  1. Minimum data requirements for field-scale, replicated strip trials completed by farmers using GPS-guided equipment including combines with calibrated yield monitors.
  2. Guidelines for collecting, handling, storing and formatting results and metadata from field-scale, replicated strip trials.
  3. Guidelines for stewardship of data collected from field-scale, replicated trials completed on production grain fields, which will include guidelines for:
    1. Data accessibility
    2. Licensing options for allowable uses of the data (data sharing)
    3. Curation of the data
    4. Maintaining confidentiality of the data
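Deliverable 1, the minimum data requirements, might eventually be expressed as a machine-readable record. The sketch below is purely illustrative: every field name is an assumption on our part, not the WG's deliverable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StripTrialRecord:
    """Hypothetical minimum record for one harvested strip in a
    field-scale replicated trial. Field names are illustrative only;
    the actual minimum data requirements are a WG deliverable."""
    trial_id: str
    strip_id: str
    treatment: str          # e.g. "N_rate_168_kg_ha"
    replicate: int
    crop: str               # e.g. "maize"
    year: int
    yield_mg_ha: float      # from a calibrated yield monitor
    latitude: float         # GPS centroid of the strip
    longitude: float
    soil_series: Optional[str] = None
    notes: Optional[str] = None


# Example record with made-up values.
rec = StripTrialRecord("IA-2016-042", "S3", "N_rate_168_kg_ha", 2,
                       "maize", 2016, 11.8, 42.03, -93.62)
```

Encoding the requirements as a typed record of this kind would make it straightforward for networks to validate submissions automatically before they enter a common database.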

The outcomes for the On-Farm Data Sharing WG will be:

  1. Agreement among interested farmer networks around the world to place their data in one common database using the guidelines developed as part of the deliverables. The data and metadata managed by the 6 major farmer networks in the US1 will be the first to populate the database; data from other networks in the US and in other parts of the world, especially countries with many combines equipped with yield monitors, such as Canada, the countries of western Europe, Australia, and Argentina, will be added later as those networks express interest in participating.
  2. Submission of a proposal to the U.S. National Science Foundation or other funding organization for funding to clean and collate the data that will be entered into the database, and to create a common, secure database for the results of trials and for the field metadata.

When these outcomes are achieved, the guidelines established will serve as a baseline for other networks representing other agroecosystems to follow suit with their own adapted sets of requirements.

 

1 The 6 major farmer networks in the US are:

 

  1. On-Farm Network managed by the Iowa Soybean Association
  2. Adapt Network managed by Environmental Defense Fund
  3. Infield Advantage managed by the Indiana State Department of Agriculture
  4. On-Farm Research Network managed by the University of Nebraska Extension
  5. New York On-Farm Research Partnership managed by Cornell University
  6. K-State On-Farm Network managed by Kansas State University, Kansas State Research and Extension.

 

Value Proposition

 

Society will be the largest beneficiary of the implementation of the On-Farm Data Sharing WG outcomes. Grain crops will be grown with lower costs and less pollution. Specific beneficiaries will be farmers and farm advisors, scientists and policy makers.

The tangible benefits for each group are:

 

  1. Farmers and farm advisors will obtain more reliable and accurate recommendations for many management practices that are difficult or impossible to evaluate in small-plot research. Examples of management practices that are best evaluated on a field scale include: fertility management, especially N management; pest management; plant population management and interactions with fertility; soil and fertilizer enhancement products such as N stabilizers and products derived from humic acids, etc. Economic analysis of changes in management practices will also be more accurate and realistic with results from field-scale trials.
  2. Scientists will have access to reliable, replicated research results about the effects of changes in management practices on profit and the environment at an unprecedented scale, both geographically and numerically. Given the complexity and diversity of biological and physical conditions in agricultural fields, and the interactions of these conditions with the enormous number of management practices (types and degrees) used by farmers, large data sets of replicated, field-scale trials are needed to categorize practices into probabilities of success. Current methods of research are inadequate to create probability distributions of management practices by environment. The guidelines developed by this WG will enable scientists to publish much more reliable and accurate estimates of which management practices are best used in any environment. Also, a dataset of this quality and magnitude can be used with data mining and machine learning algorithms, leading to further discoveries of potentially applicable decision rules.
  3. Policy makers will benefit by having access to more reliable conclusions about the effect of management practices on profit and the environment. This will enable policy makers to create better informed and effective programs for food production. 

 

Engagement

 

There are state and regional efforts to create databases of results of replicated field-scale trials to improve recommendations for management practices. The Iowa Soybean Association is the leader in this type of effort in the state of Iowa. The Indiana State Department of Agriculture’s INField Advantage program is modeled after the Iowa Soybean Association’s program, and the Environmental Defense Fund’s On-Farm Network is a similar program except their program crosses state borders to include trials from Ohio, Indiana, Michigan and Illinois. Members of these organizations are part of the International Society of Precision Agriculture’s Community entitled “On-Farm Data Sharing”, and representatives of these organizations will be part of the IGAD On-Farm Data Sharing WG.

 

The 4R Research Fund works to create databases of existing research on nutrient management, and to create new research to increase the size of the databases they are creating. This program is organized and run by the International Plant Nutrition Institute (IPNI). A scientist from IPNI is a member of the On-Farm Data Sharing Community and will be a member of the On-Farm Data Sharing WG.

 

One goal of the On-Farm Data Sharing WG will be to seek scientists from around the world who are working with farmers, either informally or formally in organizations, to implement replicated field-scale trials harvested by combines for the purpose of improving management practices.

 

Work Plan

 

A specific and detailed description of how the WG will operate including:

  1. The final deliverables of the On-Farm Data Sharing WG will consist of:
    1. Guidelines for minimum data requirements for field-scale, replicated strip trials completed by farmers using GPS-guided equipment including combines with calibrated yield monitors.
    2. Guidelines for collecting, handling, storing and formatting results and metadata from field-scale, replicated strip trials.
    3. Guidelines for stewardship of data collected from field-scale, replicated trials completed on production grain fields, which will include guidelines for:
      1. Who has access to the data
      2. Allowable uses of the data
      3. Curation of the data
      4. Maintaining confidentiality of the data
  2. Milestones for the WG include:
    1. September 2017. Acceptance of the WG Case Statement by the Research Data Alliance
    2. December 2017. Guidelines for minimum data requirements completed.
    3. February 2018. Guidelines for collecting, handling, storage and formatting results and metadata completed.
    4. April 2018. Guidelines for stewardship of data collected from field-scale, replicated trials completed.
    5. June 2018. Proposal submitted to a funder such as the U.S. National Science Foundation for funding to clean existing data, format data, and create a secure database for placement of data from 6 existing farmer networks.
    6. June 2018. Presentation of guidelines at the International Conference on Precision Agriculture.
    7. July 2019. Presentation of guidelines at the European Conference on Precision Agriculture.

As the bulk of the contributors come from the crop science sector, it is unlikely that many of them will attend the RDA plenaries. Our intention is to seize opportunities stemming from agronomic scientific meetings (for instance, before the launch of the OFDS-WG activities, a poster will be presented at the 11th European Conference on Precision Agriculture (ECPA 2017, July 16 – 20, 2017, Edinburgh, UK)), and to work in a collaborative environment such as DropBox, which allows for fluid comments and changes to documents. It is expected that contributors will meet in person at the many conferences on crop science, agronomy or precision agriculture that are held several times per year. Examples of these are:

 

  • 7th Asian-Australasian Conference on Precision Agriculture (October 15 – 20, 2017, Hamilton, New Zealand)
  • American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America 2017 International Annual Meeting (October 22 – 25, 2017, Tampa, Florida)
  • 14th International Conference on Precision Agriculture (June 24 – 27, 2018, Montreal, Canada)

• A description of how the WG plans to develop consensus, address conflicts, stay on track and within scope, and move forward during operation.

 

Consensus will be built by submitting all drafts of the guidelines and proposals to all members of the WG, and by providing sufficient time, usually 3 weeks, for review of the documents. Conflicts will be addressed through discussion and email exchanges aimed at building consensus. Because members of the WG are geographically dispersed, discussions will take place via Skype. Face-to-face meetings will be held at RDA plenaries and at meetings such as the American Society of Agronomy's annual meeting to build consensus. To stay on track and within the scope of the work plan, monthly email exchanges will check on the progress of writing the guidelines and proposals.

 

• A description of the WG’s planned approach to broader community engagement and participation.

 

We will attend many conferences on crop science, agronomy and precision agriculture, and we will inform the agronomy community about the importance and status of ongoing and completed work within the WG.

 

Adoption Plan

 

Agreement among the 6 major farmer networks in the US to place their data in one common database by December 2018, using the guidelines developed as part of the deliverables, has been established as an objective of the WG. The plan for adoption of the WG outcomes, both within the organizations and institutions represented by WG members and more broadly within the community, also includes the submission of a proposal to the U.S. National Science Foundation or another funding organization for funding to clean and collate the data in each of the 6 major farmer networks in the US, and to create a common, secure database for the results of trials and for the field metadata.

 

Initial Membership

 

Initial leadership:

 

Tom Morris                   U Connecticut                             Thomas.Morris@uconn.edu

Nicolas Tremblay          Agriculture Agri-Food Canada    Nicolas.Tremblay@agr.gc.ca

 

 

Initial members (TBC)

 

Bertin, Patricia              Embrapa, Brazil                          patricia.bertin@embrapa.br

Bonnet, Pascal              CIRAD, France                           pascalbonnet@cirad.fr

Ciampitti, Ignacio         K-State U                                     ciampitti@ksu.edu

Clay, David                    South Dakota State U                 david.clay@sdstate.edu

Craker, Ben                   AGCO                                          ben.craker@AGCOcorp.com

Ekpe, Sonigitu A.          Nigeria                                         sonigitu.ekpe@graduateinstitute.ch

Ferreyra, R. Andres      Ag Connections LLC                   andres.ferreyra@agconnections.com

Gullotta, Gaia               Bioversity International, Italy    

Hatfield, Gary               South Dakota State U                 gary.hatfield@sdstate.edu

Kyveryga, Peter            Iowa Soybean Association         pkyveryga@iasoybeans.com

Murrell, Scott               IPNI                                             smurrell@ipni.net

Neveu, Pascal               INRA                                          Pascal.Neveu@inra.fr

Rabe, Nicole                 Ontario Ministry of Ag                Nicole.rabe@ontario.ca

Reverte, Carmen          IRTA, Spain                                 carme.reverte@irta.cat

Soonho, Kim                 International Food Policy RI       soonho.kim@cgiar.org

Stavrataki, Maritina     Agroknow, Greece                       maritinastavrataki@agroknow.com

Stelford, Mark              Premier Crop                              mstelford@premiercrop.com

Thompson, Laura         U Nebraska-Lincoln                    Laura.thompson@unl.edu

Yost, Matt                     ARS – U Missouri                        Matt.Yost@ARS.USDA.GOV

 

References

 

Dhital, S., and W. R. Raun. 2016. Variability in optimum nitrogen rates for maize. Agronomy J. 108: 2165-2173.

 

Hatfield, J. L., and C. L. Walthall. 2015. Meeting global food needs: Realizing the potential via genetics × environment × management interactions. Agronomy J. 107: 1215-1226.

 

Jones, A.P., R. D. Riley, P. R. Williamson, A. Whitehead. 2009. Meta-analysis of individual patient data versus aggregate data from longitudinal clinical trials. Clinical Trials 6: 16–27.

 

Kyveryga, P. M., P. C. Caragea, M. S. Kaiser and T. M. Blackmer. 2013. Predicting risk from reducing nitrogen fertilization using hierarchical models and on-farm data. Agronomy J. 105: 85-94.

 

Wilkinson, M. D., M. Dumontier, I. J. Aalbersberg, G. Appleton, B. Mons, et al. 2016. The FAIR guiding principles for scientific data management and stewardship. Scientific Data 3:160018. doi:10.1038/sdata.2016.18.

 

Review period start:
Tuesday, 6 June, 2017 to Friday, 30 June, 2017
Custom text:
Body:

The purpose of the Early Career and Engagement Interest Group (ECEIG) is to provide a focal point for Early and Mid-Career Researchers and Professionals, including those involved in various RDA-related fellowships and Early Career programs.

While other efforts exist in RDA to support Early Career Researchers and Professionals, such as the RDA-US and RDA-EU fellowship programs, the RDA ECEIG seeks to complement existing efforts by (i) building a peers network, (ii) maintaining a “live” document of advice for Early Career Researchers and Professionals and (iii) creating opportunities for formal and informal mentoring within RDA.

Specifically, objectives of this IG are to:

1. Focus on Early Career Researchers and Professionals because they need the most support

2. Establish a volunteer-based mentoring program

3. Network across domains to establish an interdisciplinary network of peers

4. Provide a space for people who have more experience with RDA to pass on knowledge and lessons learned to Early Career Researchers and Professionals

5. Create a social outlet specifically for Early Career Researchers and Professionals

To find out more, please view the IG Charter and leave your comments below. 

Review period start:
Tuesday, 2 May, 2017
Custom text:
Body:

Version 1.0 14th June 2017

(The draft version 0.1 and final approved version are available as attached documents at the bottom of the page.)

 

Introduction

Data are fundamental to the field of linguistics. Examples drawn from natural languages provide a foundation for claims about the nature of human language, and validation of these linguistic claims relies crucially on these supporting data. Yet, while linguists have always relied on language data, they have not always facilitated access to those data. Publications typically include only short excerpts from data sets, and where citations are provided, the connections to the data sets are usually only vaguely identified. At the same time, the field of linguistics has generally viewed the value of data without accompanying analysis with some degree of skepticism, and thus linguists have murky benchmarks for evaluating the creation, curation, and sharing of data sets in hiring, tenure and promotion decisions.

 

This disconnect between linguistics publications and their supporting data results in much linguistic research being unreproducible, either in principle or in practice. Without reproducibility, linguistic claims cannot be readily validated or tested, rendering their scientific value moot. In order to facilitate the development of reproducible research in linguistics, the Linguistics Data Interest Group (LDIG) plans to foster the discipline-wide adoption of common standards for data citation and attribution. In our parlance citation refers to the practice of identifying the source of linguistic data, and attribution refers to mechanisms for assessing the intellectual and academic value of data citations. The LDIG is for data at all linguistic levels (from individual sounds or words to video recordings of conversations to experimental data) and data for all of the world’s languages, and acknowledges that many of the world’s languages have high cultural value and are underrepresented with regards to the amount of information that is available about them.

 

This interest group is aligned with the RDA mission to improve open sharing of data through forming transparent discipline-specific data citation and attribution conventions to be adopted by the international research community. This interest group will add value to the RDA community by providing breadth to the current roster of RDA interest groups: linguistics is a discipline that straddles social/behavioral sciences and the humanities, and thus we have a great deal to contribute to the general RDA discussion on a multiplicity of data types. This group ties in with other initiatives in transparent research methods in linguistics at all stages of the workflow, including Open Access data archiving and publishing, reproducible methodologies and critical consideration of data licensing. The LDIG seeks to support these initiatives while focusing on data citation specifically. The LDIG provides an ongoing space for linguists to come together to improve how we manage and cite our data, and how we train linguists in good practice.

 

Who is this group for?

The LDIG is for people who work with linguistic and language data. This work includes, but is not limited to, the collection, management and analysis of linguistic data. We encourage participation from academic and speaker communities.

 

Objectives and outcomes

Our overarching objective is to contribute to a positive culture of linguistic data management and transparency in ways that are in keeping with what is happening in the larger digital data management community. To do this we aim to be a group that is able to provide tangible tools (e.g. guidelines, software) for improving the culture of data citation and attribution within linguistics. This will also involve understanding the breadth of data types linguists work with, and current uses of persistent identifiers. We outline three main objectives. For each objective we also suggest specific outcomes, which would be the focus of shorter term timelines (e.g. Working Groups):

  • Development and adoption of common principles and guidelines for data citation and attribution by professional organizations, such as the Linguistic Society of America and the Societas Linguistica Europaea, academic publishers, and archives for linguistic and language data. Principles and guidelines will follow the recommendations in the Joint Declaration of Data Citation Principles.

    Potential WG topics include:

    • Development of a common stylesheet for citation of linguistic data

    • Adoption of the style sheet by publishers, archives, organisations and individuals

    • Integrating RIS with linguistic data services like the Open Language Archives Community

  • Education and outreach efforts to make linguists more aware of the principles of reproducible research and the value of data creation methodology, curation, management, sharing, citation and attribution. Practical training also helps make proper data preparation less burdensome for researchers, and normalises this work as an expectation of the discipline. While much of this work will be practical training, outreach also needs to take into account the complex and varying attitudes towards creation of open access data sets across linguistics.
    Potential WG topics include:

    • Development of training modules

    • Delivery of training at conferences and workshops

    • Development of tools for the management of linguistic data

  • Efforts to ensure greater attribution of linguistic data set preparation within the linguistics profession.
    Potential WG topics include:

    • Framework for valuing the development of linguistic data sets in job appointments, tenure and promotion applications and in research degrees and postdoctoral research projects.

It will be up to the LDIG to decide if any of these specific outcomes would be best met by forming short term working groups with specific timelines for the deliverables. Other outcomes may be worked on within the LDIG on a more open timeline. Further goals include fostering greater transparency in research methodology, and data access rights. We expect that other outcomes will be developed as LDIG grows and responds to the changing research environment.
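As a purely illustrative aside, the common stylesheet mentioned above could, at its simplest, amount to a citation template like the following sketch. The actual format is a planned WG deliverable, so every element and example value here is an assumption.

```python
def format_data_citation(creator, year, title, archive, identifier,
                         version=None):
    """Format a citation for a linguistic data set. The element order
    (creator, year, title, archive, persistent identifier, optional
    version) is a hypothetical pattern, not the WG's stylesheet."""
    ver = f", version {version}" if version else ""
    return f"{creator}. {year}. {title}. {archive}. {identifier}{ver}."


# Example with made-up values and a placeholder identifier.
print(format_data_citation("Doe, Jane", 2017, "Recordings of conversations",
                           "Example Archive", "hdl:10000/example"))
```

Pinning an optional version element in the template is one way a stylesheet could connect to the broader RDA discussion on data versioning, since a citation is only reproducible if it identifies the exact version used.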

 

Mechanism

The co-chairs will hold a conference call every two months. The wider LDIG will convene quarterly meetings. The timezone spread of LDIG members means that these meetings will be held asynchronously in an editable document. The agenda will be posted with discussion points, and will be open for comment for a week, before actions are decided upon and delegated. We will also host face-to-face meetings at relevant linguistics conferences, such as Societas Linguistica Europaea, Linguistic Society of America, the Australian Linguistics Society, and at the RDA plenaries.

 

Interaction with groups in RDA

The following RDA groups have been identified as having interests that are relevant to LDIG, both in terms of technical and ethical issues in linguistic data management:

While setting up the LDIG we will ask at least four of our members to nominate themselves to participate in one of these other groups and be officially named as our cross-group co-ordinator. This will facilitate cross-group relevance.

Linguists from particular subfields may find that particular interest groups are relevant to particular issues in their area, for example corpus linguists may find that the Big Data IG addresses relevant issues. We encourage LDIG participants to also engage with other interest groups and working groups in the RDA.

 

Related projects and activities

There are also a number of organisations and groups outside the RDA that LDIG will engage with directly as the objectives of the group are addressed.

 

Contributors

Co-Chairs:

Andrea L. Berez-Kroeker, U Hawai‘i at Mānoa

Lauren Gawne, La Trobe University

Helene N. Andreassen, UiT The Arctic University of Norway

Potential members are welcome to sign up to the LDIG or contact the co-chairs for more information. LDIG has been promoted through the LINGUIST List, and we invite any interested party to participate.

 

Timeline

The LDIG aims to be an ongoing group whose overall goal is to promote better practice in linguistic data management. A general timeline is given below; however, some of these responsibilities may be handed over to a working group specifically set up for the delivery of the data citation standards.

Outreach - first 6 months (May-November 2017)

  • April 2017    Draft charter posted

  • May 2017    Group advertised publicly

  • June 2017    Amended charter posted

  • Sept 2017    Attend Montreal RDA plenary and connect with relevant RDA groups

  • Oct 2017    Finalise LDIG structure and communication processes

Groundwork - second 6 months (November 2017-May 2018)

This groundwork helps us expand the reach of the LDIG and ensures that we are as relevant and inclusive as possible. Includes attendance at April 2018 RDA plenary:

  • Survey of linguists on current data citation practice (individual practice and institutional level training opportunities)

  • Collate possible citation practices

  • Survey of linguists on current practices for academic attribution of curation of linguistic data sets in departmental tenure and promotion

Review period start:
Saturday, 8 April, 2017
Custom text:
Body:

 

The demand for reproducibility of research results is growing; it will therefore become increasingly important for a researcher to be able to cite the exact version of the data set that was used to underpin their research publication. The capacity of computational hardware infrastructures has grown, and this has encouraged the development of concatenated, seamless data sets from which users can select subsets via web services based on spatial and temporal queries. Further, the growth in computing power means that higher-level data products can be generated in very short time frames. We therefore need a systematic way to refer to the exact version of a data set or data product that was used to underpin research findings, or was used to generate higher-level products.

Versioning procedures and best practices are well established for scientific software and can be used to enable reproducibility of scientific results. The codebase of a very large software project bears some resemblance to a large dynamic dataset. Are these practices suitable for data sets, or do we need different practices for data versioning? The need for unambiguous references to specific datasets was recognised by the RDA Working Group on Data Citation, whose final report recognises the need for systematic data versioning practices.

This gap was discussed at a BoF meeting held at the RDA Plenary in September 2016 in Denver, resulting in the formation of an Interest Group on data versioning. A review of the recommendations by the RDA Data Versioning IG (the precursor to this group) concluded that systematic data versioning practices are currently not available. The Working Group will produce a white paper documenting use cases and recommended practices, and make recommendations for the versioning of research data. To further adoption of the outcomes, the Working Group will contribute the use cases and recommended data versioning practices to other groups in RDA, W3C, and other emerging activities in this field. Furthermore, versioning concepts developed for research data will need to be brought in line with versioning concepts used in persistent identifier systems.
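One ingredient of systematic data versioning, borrowed from software practice, is deriving a reproducible version identifier from the dataset's content. The sketch below is an illustration only, not an RDA recommendation: it hashes each file's content and then hashes the combined manifest, so that any change to any file yields a new identifier that a citation can pin.

```python
import hashlib


def dataset_version_id(files):
    """Derive a content-based version identifier from (filename, bytes)
    pairs: hash each file's content, then hash the sorted name:digest
    lines. Any change to any file yields a new identifier. A sketch
    only; a real system would pair this with a persistent identifier
    and versioning metadata."""
    h = hashlib.sha256()
    for name, content in sorted(files):
        digest = hashlib.sha256(content).hexdigest()
        h.update(f"{name}:{digest}\n".encode())
    return h.hexdigest()[:12]


# Two snapshots of a hypothetical two-file dataset; editing one value
# in a.csv changes the version identifier.
v1 = dataset_version_id([("a.csv", b"x,y\n1,2\n"), ("b.csv", b"x,y\n3,4\n")])
v2 = dataset_version_id([("a.csv", b"x,y\n1,99\n"), ("b.csv", b"x,y\n3,4\n")])
```

Content-derived identifiers are deterministic, so two researchers holding the same files can verify independently that they are citing the same dataset version; the open questions for the WG are how such identifiers relate to PID systems and to database-backed (rather than file-based) data.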

 

Value Proposition

Data versioning is a fundamental element in work related to ensuring the reproducibility of research. Work in other RDA groups on data provenance and data citation, as well as the W3C Dataset Exchange Working Group, have highlighted that definitions of data versioning concepts and recommended practices are still missing. The outcomes of the Data Versioning Working Group will add a central element to the systematic management of research data at any scale by providing recommendations for standard practices in the versioning of research data. These practice guidelines will be illustrated by a collection of use cases.

Engagement with existing work in the area

A lack of accepted data versioning practices has been recognised in different fields where reproducibility of research is a concern, e.g. data citation, data provenance, and virtual research environments. Versioning procedures and standard practices are well established for scientific software and can be used to facilitate the reproducibility of scientific results. The Working Group will work with relevant groups, both within and outside RDA, on topics where data versioning is of importance, to develop a common understanding of data versioning and standard practices.

Within RDA the Working Group will work together with the Data Citation WG to include its outputs in the collection of use cases, and with the Data Foundations and Terminology IG, the Research Data Provenance IG, the Provenance Patterns WG, and the Software Source Code IG to align data versioning concepts.

The Working Group will work closely with the W3C Dataset Exchange Working Group to introduce the use cases collected by the RDA Data Versioning Working Group into the W3C Working Group’s collection of use cases and align versioning concepts. Additionally, the RDA Versioning Working Group will work closely with the AGU FAIR Data Project, in particular Task Group E on Data Workflows.

Work Plan

The outcome and deliverable of the Data Versioning WG will be a white paper documenting use cases and recommending standard practices for data versioning. The use cases and recommendations will be aligned with the recommendations of other working groups, in RDA and externally, where data versioning is of concern.

Milestones for the development of the document will be aligned with the upcoming RDA plenaries. The final document will be presented at the RDA Plenary in early 2019.

The Data Versioning WG will meet face-to-face at the RDA plenaries for broader discussions of the group’s findings and recommendations with other relevant RDA Groups. Between plenaries, the group will work online.

Besides sessions at the RDA plenaries, members of the working group will present the working group’s findings and recommendations at disciplinary conferences and in national working groups to achieve a broader community involvement in the development of the recommendations for data versioning.

The work on the data versioning white paper will be coordinated by the chairs of the working group. A collection of use cases will serve to illustrate the recommended practices for data versioning. The outcomes will be contributed as an addendum to the RDA Data Citation Recommendations to resolve differences between file-based and database-based applications.

Use cases collected by the Working Group will be fed into the W3C Dataset Exchange WG. This W3C Working Group runs on a timeline parallel to that of the proposed RDA Data Versioning WG: it is now six months into its two-year term and will end in July 2019.

Adoption Plan

The Working Group will work with existing adopters to support the adoption process and document any successes, failures, and lessons learnt. The Working Group will collect feedback from adopters and make sure it is considered for inclusion in the outputs.

The Working Group will work closely with the W3C Dataset Exchange Working Group to introduce the use cases collected by the RDA Versioning WG into the W3C Working Group’s collection of use cases and align versioning concepts. Initial outcomes will also be exchanged with the AGU FAIR Data Project.

Initial Membership

The initial membership of the Data Versioning WG will be drawn from the membership of the Data Versioning IG. The initial membership will include links to other RDA groups, e.g. Research Data Provenance, Provenance Patterns WG, and Software Source Code IG (Mingfang Wu), to the W3C Dataset Exchange Working Group (Simon Cox), and the AGU FAIR Data Project (Jens Klump).

The Data Versioning WG will initially be led by Jens Klump (CSIRO), Lesley Wyborn (ANU), Robert Downs (Columbia University) and Ari Asmi (University of Helsinki).

 

Review period start:
Tuesday, 9 January, 2018 to Friday, 9 February, 2018