GEO BON Efforts to Establish Components for Global Research Data Infrastructure
Wim Hugo, South African Environmental Observation Network, South Africa
The GEO BON Manifesto was developed and discussed at the GEO BON meeting in Asilomar II, December 2012. From an information technology perspective, the GEO BON Manifesto addresses description, discovery, assessment, access, analysis, and application or reporting, by stating that it is in the interest of any specific community to do the following:
• Ensure that scientific data and services are described properly, preserved properly, and discoverable;
◦ This implies availability of metadata standards, harvesters, brokers, and metadata interoperability.
◦ Persistent identifiers are implied.
◦ Protocols and standards for data exchange/uploads are implied.
◦ Preservation standards and formats are implied.
◦ Tools and approaches to make searches more efficient are implied (vocabularies, ontologies, dealing with massive metadata collections, …).
◦ Sustainable data centers and long-term archives are implied.
• Once discovered, its utility, quality, and scope can be understood, even if the data sets are huge;
◦ Implies: visualisations, feedback on quality, quality metrics and standards, viewing search results in relation to referenced spatial, temporal, and ontological/taxonomic coverages, ability to dynamically extract 'thumbnail' views of large datasets, …
• Once understood, it can be accessed freely and openly;
◦ Implies: standardised services, licenses and policies, simplified distribution channels, even if costs are involved, …
• Once accessed, it can be included in distributed processes, and collated, preferably automatically and on large scales (the ‘Model Web’);
◦ Implies: persistence of mash-ups and mediations, web context documents, web processing services, standards and guidelines for grid computing, ability to construct indicators and standardized, interoperable final products, …
• Due recognition is afforded to the creators of the data and services;
◦ Implies: data publication and citation, linking to scholarly articles, …
• Once processed, the mediations defined, the usefulness established, and the knowledge gathered can be re-used.
◦ Implies: defining and storing templates and examples of finished work, processes, mash-ups, context documents, …
All of this needs to be implemented against the backdrop of:
• The push to extend formal metadata with Linked Open Data;
• The increased availability of crowd-sourced and citizen contributions;
• A proliferation of devices and sensors;
• The construction of knowledge networks.
GEO BON Workgroup 8 is working towards addressing the gaps within this vision, largely by offering formally published guidance and by engaging initiatives (including RDA) and programmes that can contribute. The standards and specifications landscape is reviewed in light of GEO BON’s work on Essential Biodiversity Variables. Progress is discussed and summarized, defining the state of play with respect to end-to-end interoperability for the biodiversity sciences.
Wim has a master's degree in Chemical Engineering, and many years' experience in techno-economic feasibility studies, management consulting, systems engineering and systems architecture. His recent work (the past 5-6 years) has focused on systems architecture and development in support of scientific data management and preservation. His research interests include Knowledge Networks and Certification of Trusted Digital Repositories. He is a member of the ICSU-World Data System Scientific Committee, and co-chair of the GEO BON WG8 (Systems and Architecture) and of the newly constituted collaboration between RDA and WDS on repositories. He is active in CODATA, GEO, and GEOSS.