During the previous RDA plenaries it gradually became obvious that the environment-related Interest and Working Groups share concerns and face similar challenges regarding the lifecycle of Life data.
Many of our urgent societal challenges require the effective cross-fertilisation of people, data and processes across multiple biodiversity and environmental disciplines. To address these overarching challenges we need to promote a targeted dialogue within existing fora and use these discussions to build the road-mapping documents that will drive our efforts in the years to come.
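10.15-10.20 | Introduction and scope of session | Dimitris Koureas
10.20-10.28 | Common global framework for marine data management | Helen Glaves
10.28-10.36 | GEO BON Efforts to Establish Components for Global Research Data Infrastructure | Wim Hugo
10.36-10.44 | Metadata Practices for Biological and Environmental Data | Rebecca Koskela
10.44-10.52 | Connecting data policies, standards & databases in life sciences | Susanna Sansone
10.52-11.00 | If you don't know the names, your knowledge gets lost | Nicky Nicolson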
Scope of the session and next steps
Dimitris Koureas, Natural History Museum London, UK & Biodiversity Data Integration IG, co-Chair
Session co-chair
With a PhD in plant systematics and biodiversity, Dimitris is currently a biodiversity informatics scientist at the Natural History Museum London. Over the last 10 years he has been developing, contributing to and managing European funded and co-funded research and research infrastructure projects. Dimitris has significant expertise in virtual research environments, and he liaises between different biodiversity research and e-infrastructure teams across Europe, investing actively in capacity and contact building. He also participates in the strategic steering of the Scratchpads platform (http://scratchpads.eu) and is an invited lecturer at the University of Reading, the University of Oxford, and the Aristotle University of Thessaloniki, teaching biodiversity informatics tools. Dimitris is a European representative in the Biodiversity Information Standards Organisation (http://tdwg.org) executive.
Keith Jeffery
Keith Jeffery Consultants, UK & Metadata Standards Directory IG, co-Chair
Session co-chair
Common global framework for marine data management
Helen Glaves, British Geological Survey, UK & Marine Data Harmonisation IG
Abstract
In recent years marine research has undergone a paradigm shift, moving from the traditional discipline-specific science towards a more ecosystem-level approach. This more multidisciplinary approach to ocean science necessitates large amounts of good quality, interoperable data to be readily available for use in an increasing range of new and innovative applications.
This requirement for large volumes of marine data to be made readily available to users has been addressed on a regional scale by the development of e-infrastructures which are responsible for managing and delivering data to the end user. However, each of these initiatives has been developed to address specific regional requirements and independently of those in other regions.
Establishing a common framework for marine data management on a global scale requires interoperability across these existing data infrastructures and active collaboration between the organisations responsible for their management. The Ocean Data Interoperability Platform project, in partnership with the RDA Marine Harmonisation IG, is seeking to encourage co-ordination between these regional data infrastructures and capitalise on the range of expertise available in the Research Data Alliance to support the development of this global marine data infrastructure.
GEO BON Efforts to Establish Components for Global Research Data Infrastructure
Wim Hugo, South African Environmental Observation Network, South Africa
Abstract
The GEO BON Manifesto was developed and discussed at the GEO BON meeting in Asilomar II, December 2012. From an information technology perspective, the GEO BON Manifesto addresses description, discovery, assessment, access, analysis, and application or reporting, by stating that it is in the interest of any specific community to do the following:
• Ensure that scientific data and services are described properly, preserved properly, and discoverable;
◦ This implies availability of metadata standards, harvesters, brokers, and meta-data interoperability.
◦ Persistent identifiers implied.
◦ Protocols and standards for data exchange/ uploads are implied.
◦ Preservation standards and formats implied.
◦ Tools and approaches to make searches more efficient (vocabularies, ontologies, dealing with massive meta-data collections, …).
◦ Sustainable data centers and long-term archives are implied.
• Once discovered, its utility, quality, and scope can be understood, even if the data sets are huge;
◦ Implies: Visualisations, feedback on quality, quality metrics and standards, viewing search results in relation to referenced spatial, temporal, and ontological/ taxonomic coverages, ability to dynamically extract 'thumbnail' views of large datasets, …
• Once understood, it can be accessed freely and openly;
◦ Implies: standardised services, licenses and policies, simplified distribution channels, even if costs are involved, …
• Once accessed, it can be included into distributed processes, and collated - preferably automatically, and on large scales (the ‘Model Web’);
◦ Implies: persistence of mash-ups and mediations, web context documents, web processing services, standards and guidelines for grid computing, ability to construct indicators and standardized, interoperable final products, …
• That due recognition is afforded to the creators of the data and services;
◦ Implies: data publication and citation, linking to scholarly articles, …
• Once processed, the mediations defined, usefulness, and knowledge gathered can be re-used.
◦ Implies: defining and storing templates and examples of finished work, processes, mash-ups, context documents, …
All of this needs to be implemented against the backdrop of:
• The push to extend formal meta-data with Linked Open Data;
• The increased availability of crowd-sourced and citizen contributions;
• A proliferation of devices and sensors;
• The construction of knowledge networks.
GEO BON Workgroup 8 is working towards addressing the gaps within this vision, largely by offering formally published guidance, and by engaging initiatives – including RDA – and programmes that can contribute. The standards and specifications landscape is reviewed in light of GEO BON’s work on Essential Biodiversity Variables. Progress with this is discussed and summarized, defining the state of play in respect of end-to-end interoperability for biodiversity sciences.
Metadata Practices for Biological and Environmental Data
Rebecca Koskela, University of New Mexico, USA & Keith Jeffery, Keith Jeffery Consultants, UK & Metadata Standards Directory IG
Abstract
The scope of biological and environmental research has been changing to focus on long-term, broad-scale, and complex questions that require diverse data collected by interdisciplinary science teams as well as different approaches for managing, preserving, analyzing, and sharing data. The RDA metadata groups focus on all aspects of metadata for research data, including data discovery, contextualization, validation, analytical processing, and interoperation. Metadata is not only important for documenting data (including rights, provenance as well as the usual descriptive information) and assessing datasets for relevance and quality (which includes dataset and publication citation) for the (re-)purpose in hand; it is also necessary for interoperation. This presentation will cover best practices for interoperable metadata useful for interdisciplinary research.
Connecting data policies, standards & databases in life sciences
Susanna Sansone, University of Oxford, UK & Technical Advisory Board & Biosharing Registry WG
Abstract
As part of the growing worldwide movement for reproducible research, the efforts of funding agencies and journal editors are converging to encourage awardees and authors to provide the underlying data together with a description of that data and the methods used to generate it, providing such details in a standardized manner and making them available (publicly or via controlled access) for reuse. In parallel, a growing number of community-based groups are developing standards, including content standards for both data and experimental metadata. As a consequence of this general mobilization to support reproducible research, there are more than 1000 databases in the life sciences, over 300 terminologies, more than 100 reporting guidelines, over 150 exchange formats, and a growing number of data preservation, management and sharing policies and plans that could help in the annotation, reporting and sharing of life science datasets. But what is relevant to the biodiversity and environmental disciplines?
Since 2011, BioSharing has worked to improve information about content standards and databases (maturity, uptake, implementation) and to provide information to funders and journals about which standards are the appropriate community norms, which databases implement which standards or are appropriate for certain data types, and where data is curated and openly available (or where access is regulated, e.g. for ethical reasons). Improving the quality of lists of databases and standards will allow funder and journal policies to encourage transparent information and recommendation of community norms. Interlinking allows the project to close the loop: here are the databases and standards; here are the policies that refer to them (or not). For example, when standards are mature and appropriate standards-compliant systems become available, these are channeled to the appropriate stakeholder community, who in turn endorse (in policies) or implement (in databases) them, achieving wider harmonization of the data.
If you don't know the names, your knowledge gets lost
Nicky Nicolson, Royal Botanic Gardens, Kew, UK & Biodiversity Data Integration IG
Abstract
Scientific names are used in all domains as entry points into biodiversity datasets - but names are updated over time as we refine our understanding of species diversity. Resolution services are essential for data integration efforts to build the linked open data that researchers require: these allow navigation between old and new names as represented in different taxonomies (organising systems) and thus provide access to all content linked to name variants. Names services, operating on high-quality, expert-curated, structured, linked data representing names and their inter-relationships allow the transition of static text to actionable data. These services should be usable by any domain of basic or applied science dealing with scientific names of organisms.
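As a rough illustration of the resolution idea described in this abstract, the following minimal Python sketch maps name variants to a currently accepted name and then gathers the content linked to every variant. It is not the interface of any actual names service: the lookup table, the record identifiers, and the function names (`resolve`, `variants_of`, `linked_content`) are assumptions made up for this example.

```python
# Illustrative synonym table (an assumption for this sketch): each name
# variant points to the name accepted in one hypothetical taxonomy.
SYNONYM_TO_ACCEPTED = {
    "Aster novae-angliae": "Symphyotrichum novae-angliae",
    "Symphyotrichum novae-angliae": "Symphyotrichum novae-angliae",
}

# Illustrative records from separate datasets, each filed under the name
# that was in use when the record was created. The ids are hypothetical.
RECORDS = [
    ("Aster novae-angliae", "herbarium sheet H-001"),
    ("Symphyotrichum novae-angliae", "field observation O-042"),
]


def resolve(name):
    """Return the accepted name for a variant, or None if it is unknown."""
    return SYNONYM_TO_ACCEPTED.get(name.strip())


def variants_of(accepted):
    """Return every known variant that resolves to the accepted name."""
    return sorted(v for v, a in SYNONYM_TO_ACCEPTED.items() if a == accepted)


def linked_content(name):
    """Return all content linked to any variant of the given name."""
    accepted = resolve(name)
    if accepted is None:
        return []
    variants = set(variants_of(accepted))
    return [content for recorded, content in RECORDS if recorded in variants]


if __name__ == "__main__":
    for item in linked_content("Aster novae-angliae"):
        print(item)
```

Querying by the older name reaches both records, including the one filed under the newer name. A real service would additionally have to handle several taxonomies that disagree about which name is accepted, which this sketch does not attempt.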
Nicky Nicolson is the senior research leader in Kew’s Biodiversity Informatics team, and has over 15 years’ experience in biodiversity informatics – the effort to curate, mobilise and exploit global scale biodiversity data resources gathered over hundreds of years.
Plenary session groups:
- WG BioSharing Registry: connecting data policies, standards & databases in life sciences
- WG Metadata Standards Directory
- WG Wheat Data Interoperability
- IG Agriculture Data Interoperability
- IG Biodiversity Data Integration
- IG ELIXIR Bridging Force
- IG Geospatial
- IG Marine Data Harmonization
- IG Metadata