Domain Vocabularies RDA 8th Plenary BoF meeting

30 May 2016

Meeting title: BoF on Domain Vocabulary Development, Standardization, Registration, Harmonization and Support

A short introduction describing the scope of the group and any previous activities

Discussions at previous plenaries have demonstrated a growing interest in domain vocabulary services, from vocabulary registration to harmonization. Interest in these issues was manifest, for example, in the joint Data Foundations and Terminology and Vocabulary Services meeting at P7. To date, RDA’s DFT WG & IG, for example, have focused on data domains as opposed to non-data research domains such as marine science or chemistry. Past work, such as in healthcare, has demonstrated the role of controlled vocabularies as a “metadata” documentation aid to finding, using and integrating data.

Standardizing the registration and management of domain vocabularies is increasingly needed, since such terms are used as part of data/metadata documentation. Controlled vocabularies are often used for indexing and retrieval of data resources. However, terminologies, driven in part by early efforts, are often arbitrary, with little supporting conceptualization or real standardization. This results in domain vocabularies that have the same concept scope but are represented with different terms, use different formats and formalisms, and are published and stored with alternative access methods. For example, most water quality vocabularies conflate multiple concepts into a single, compounded term. Thus a term for some type of observation may mix the substance (or taxon) with the medium observed (e.g. water), along with the procedure used as part of the observation and the units used for measurement. Alternate terms, the lack of clear definitions and poor maintenance of vocabularies make systematic vocabulary use, and integration with other vocabularies, difficult.
Despite decades of intensive work on controlled vocabularies (standardized sets of terms), and now human-readable definitions accessible via URIs on the Web, problems remain. While linked data using RDF/S provides some help with representation and syntax, there is often no supporting, systematic conceptualization. Work with classification schemes and thesauri often lacks well-defined semantics and structural consistency, which makes it difficult to match up the concepts that terms represent.
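
As a rough illustration of the compounded-term problem described above, the following Python sketch separates a water quality term into the four facets named in the text (substance, medium, procedure, units) so that terms from different vocabularies can be compared facet by facet. The term labels, the facet values and the helper differing_facets are hypothetical and are not drawn from any real vocabulary.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ObservedProperty:
    """One observation term split into the facets that a compounded label mixes."""
    substance: str   # substance or taxon observed
    medium: str      # medium observed, e.g. water
    procedure: str   # procedure used for the observation
    unit: str        # unit of measurement


# Hypothetical compounded labels from three imaginary vocabularies.
# The strings differ, but the first two carry exactly the same facets.
TERMS = {
    "NO3_water_colorimetry_mgL": ObservedProperty(
        "nitrate", "water", "colorimetry", "mg/L"),
    "Nitrate in water (colorimetric), mg/l": ObservedProperty(
        "nitrate", "water", "colorimetry", "mg/L"),
    "nitrate-water-colorimetry-umol": ObservedProperty(
        "nitrate", "water", "colorimetry", "umol/L"),
}


def differing_facets(a: ObservedProperty, b: ObservedProperty) -> list:
    """Return the names of the facets on which two terms disagree."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


first = TERMS["NO3_water_colorimetry_mgL"]
second = TERMS["Nitrate in water (colorimetric), mg/l"]
third = TERMS["nitrate-water-colorimetry-umol"]

print(differing_facets(first, second))  # no differing facets: same concept
print(differing_facets(first, third))   # differs only in the unit facet
```

Comparing the facets, not the label strings, is what makes the match possible here; the hard part in practice, which this sketch assumes away, is agreeing on the facet values themselves.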

Additional links to informative material related to the group, e.g. case statements, working documents, etc.

Chemistry Research Data IG, https://rd-alliance.org/groups/chemistry-research-data-interest-group.html
Data Foundations and Terminology IG, https://rd-alliance.org/groups/data-foundations-and-terminology-ig.html
Marine Data Harmonization IG, https://rd-alliance.org/groups/marine-data-harmonization-ig.html
Vocabulary Services Interest Group, https://rd-alliance.org/groups/vocabulary-services-interest-group.html

List of the meeting objectives

The intention of this session is to provide a suitable venue for RDA domain-oriented groups to illustrate their domain vocabularies and talk about their current approach to developing, registering and managing vocabularies, along with issues that they have encountered. To organize discussion, groups will be asked to provide one or more use cases and a brief account of their current approach to developing and importing vocabularies and to harmonizing internal and external vocabularies with functions like “mapping.”

Discussion would help provide a more standard way of addressing some issues and allow members from the RDA DFT and vocabulary services communities to describe their relevant activities. Examples of relevant tools are those that store various vocabularies in a common repository. The Natural Environment Research Council (NERC) Vocabulary Server, for example, is a tool that provides access to lists of standardized terms that cover a broad spectrum of disciplines of relevance to the oceanographic community along with the wider Hydrology and Earth Sciences communities.
To a modest degree, metadata from such tools can be used to identify and label data, which helps mitigate some of the ambiguities associated with data markup. A growing practice is to add at least some basic thesaurus metadata using broader, narrower and related relations and Simple Knowledge Organization System (SKOS) properties. SKOS uses RDF to provide some formalization of various types of controlled vocabulary, including classification schemes, subject heading lists and taxonomies. This promises some degree of automation for finding relevant and similar terms, but it has limitations.
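
To make the SKOS relations mentioned above a little more concrete, here is a minimal Python sketch that stores a few statements as plain (subject, predicate, object) tuples and uses skos:broader links to find more general and sibling concepts. The property names skos:prefLabel, skos:broader and skos:related come from SKOS; the "ex:" concept identifiers, the sample statements and both helper functions are hypothetical, and no RDF library or real vocabulary server is involved.

```python
# Each statement is a (subject, predicate, object) tuple.
# The "ex:" identifiers are invented for illustration only.
TRIPLES = [
    ("ex:nitrate", "skos:prefLabel", "Nitrate"),
    ("ex:nitrate", "skos:broader", "ex:nutrient"),
    ("ex:phosphate", "skos:broader", "ex:nutrient"),
    ("ex:nutrient", "skos:broader", "ex:chemical_substance"),
    ("ex:nitrate", "skos:related", "ex:nitrite"),
]


def broader_closure(concept, triples):
    """Follow skos:broader links upward and return every ancestor found.

    SKOS does not declare skos:broader itself transitive; it defines
    skos:broaderTransitive for this kind of closure. This helper simply
    walks the asserted links.
    """
    found = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "skos:broader" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def siblings(concept, triples):
    """Return concepts that share at least one direct broader concept."""
    parents = {o for s, p, o in triples if s == concept and p == "skos:broader"}
    return sorted(
        {s for s, p, o in triples
         if p == "skos:broader" and o in parents and s != concept}
    )


print(broader_closure("ex:nitrate", TRIPLES))
print(siblings("ex:nitrate", TRIPLES))
```

This kind of traversal finds similar terms only as far as the relations have been asserted, which is one of the limitations noted above: a loosely defined "broader" link gives no guarantee that two sibling terms are comparable in scope.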

A goal of the session is to identify vocabulary services and associated methods that might be useful for such things as vocabulary development, standardization and harmonization.

Meeting agenda
• This session will start with an overview of objectives and the proposed agenda.
• Each domain group will provide a brief summary of their work, relevant vocabularies and standards and an illustrative use case such as harmonization and mappings between 2 or more vocabularies and issues involved.
• Following the presentations, most of the time will be devoted to community discussion of common interests, issues and best-practice solutions in the domain vocabulary space.
o Whenever possible, the focus will be on vocabulary issues that reflect actual, innovative tools and supporting activities and services available from current research and development, particularly RDA efforts such as Vocabulary Services and Terminology development.
• The session will conclude with a discussion of follow-up on common interests and of venues for this, such as the existing vocabulary groups, and an exploration of interest in future sessions.

Audience: Please specify who is your target audience and how they should prepare for the meeting

Chemistry Research Data IG, Marine Data Harmonization IG, Global Water Information IG, Geospatial IG, Structural Biology IG, Biodiversity Data Integration IG, Quality of Urban Life IG, Health Data
Each domain group will provide a brief summary of their work, relevant vocabularies and standards and an illustrative use case such as harmonization and mappings between 2 or more vocabularies and issues involved.
Data Foundations and Terminology IG will moderate the meeting and provide information on vocabulary harmonisation.
Vocabulary Services Interest Group will provide information on available vocabulary services.

Group chair serving as contact person: Gary Berg-Cross

See attached file for some briefing information and slides from the session.

See also several pages of draft notes from the session.

There is also a virtual meeting follow-up session to this on Nov. 10th, 2016 from 12:30-2:30 Eastern time.

Mark Fox, who presented at the BoF, will provide a longer presentation on Ontology Design Patterns for Global City Indicators.

See http://ontologforum.org/index.php/DomainVocabularies#Domain_Vocabularies... for information and slides that can be downloaded. An audio recording of this will be placed on that site after the session.