Workshop #2 Report

April 17, 2019 at 8:24 am #108944

RDA Admin
Member

Dear members of the RDA FAIR Data Maturity Model Working Group,
We would like to thank you for attending the meeting of the Working Group in Philadelphia on 3 April 2019 and hope you found the meeting useful.
The report of the meeting is now available for download from the WG page on the RDA site at https://www.rd-alliance.org/workshop-2.
We are currently finalising a Google spreadsheet for your contributions to the development of the indicators for the FAIR principles following the approach presented at the meeting in Philadelphia, and we plan to share the spreadsheet with the Working Group in the coming days.
Many thanks!
Makx Dekkers
Editorial team
April 19, 2019 at 8:07 am #130611

RDA Admin
Organizer

Many thanks, Keith.
Indeed, the details of access rights and conditions might be quite complex.
For the moment, though, I'd like to propose that the group does not dig
into the details of this issue.
Maybe we can establish: first, that expression of access rights and
conditions is important for FAIRness, and, second, where it could be
evaluated under an existing principle, or alternatively, as Andrea Perego
suggested, under the Beyond FAIR category.
The details of how exactly the access rights and conditions are expressed
can then be discussed at a later stage. At that stage, it will indeed be
very important to use what is already there and not reinvent the wheel.
May I request that we limit the discussion on this issue to two questions:
1. Is it important to include access rights and conditions in the
evaluation of FAIRness?
2. Under which area (F, A, I, R, beyond FAIR) or principle should it be
evaluated?
Kind regards, Makx
From: Keith Jeffery
Sent: 19 April 2019 09:20
To: Barend Mons; Ge Peng – NOAA Affiliate; FAIR Data Maturity Model WG
Cc: andrea.perego; ***@***.***; makxdekkers; ***@***.***-groups.org; ***@***.***
Subject: RE: [fair_maturity] Workshop #2 Report
All
I have been following the discussion with interest. Andrea is of course
correct and references (updated) DCAT in this regard. I agree with Maggie's
suggested relationships of access to the relevant FAIR principles. I agree
strongly with Barend on rich metadata that is actionable (implies formal
syntax and declared semantics). I agree that licence conditions and access
constraints (security, privacy, trust, cost) are related but different. The
tracking of correct re-use conditions (attribution, use without change) is
related to provenance although I haven't seen any reliable solutions yet.
From my perspective it is commonly difficult for any workflow/process to
distinguish A, I and R in terms of access permissions. The problem is the
multi-dimensionality. I suggest the basic access permission actually
relates to a process (e.g. Webservice) acting on a dataset (generalised as
any asset acting on or with any other asset including lab equipment,
sensors, software services, datasets, persons, organisations, publications)
but this is conditioned by who owns or manages each of the two assets
involved and whom those organisations or persons delegate to execute the
service on the dataset (or more generally the action of one asset on
another). Thus if the basic permission being managed is
then we have connected to each something
like:

(person in role user different from
the manager and owner persons)
Note that each should have a temporal duration (this allows for e.g. embargo
periods or elapsing of copyright). There can be interesting problems when
temporal intervals do not coincide, a well-known NP-hard problem from
temporal database research.
Furthermore, access conditions can be quite complex; depending on the asset
the permission may be:
[know the existence of | read | copy | update (amend, insert, delete) |
delete | execute ] and in a different dimension [cost (which may have
complex sub-conditions)] in yet another dimension [attribution] and in yet
another dimension [trust/security/privacy] and of course as well as the
conventional assets (datasets, services etc) the metadata itself may be
subject to access conditions.
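As an illustration only, the model described above (one asset acting on another, each with an owner and a manager, a delegated user, and a temporal duration) could be sketched as follows. The class and field names are hypothetical and not taken from any existing access control system, and the cost, attribution and trust/security/privacy dimensions are left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Iterable, Optional


class Action(Enum):
    # Permission types listed in the message above
    KNOW_EXISTENCE = "know the existence of"
    READ = "read"
    COPY = "copy"
    UPDATE = "update"
    DELETE = "delete"
    EXECUTE = "execute"


@dataclass(frozen=True)
class Asset:
    # Any asset: dataset, service, equipment, person, organisation, ...
    name: str
    owner: str
    manager: str


@dataclass(frozen=True)
class Permission:
    actor: Asset                        # e.g. a web service
    action: Action
    target: Asset                       # e.g. a dataset
    delegate: str                       # person in the user role
    valid_from: date
    valid_until: Optional[date] = None  # None means open-ended

    def is_active(self, on: date) -> bool:
        """True if the permission's temporal interval covers the given day."""
        if on < self.valid_from:
            return False
        return self.valid_until is None or on <= self.valid_until


def is_allowed(permissions: Iterable[Permission], actor: Asset, action: Action,
               target: Asset, user: str, on: date) -> bool:
    """Check whether any permission grants this action to this user on this day."""
    return any(
        p.actor == actor and p.action == action and p.target == target
        and p.delegate == user and p.is_active(on)
        for p in permissions
    )


# Hypothetical example: a dataset under embargo until 1 January 2020
service = Asset("analysis-service", owner="Org A", manager="Alice")
dataset = Asset("survey-2019", owner="Org B", manager="Bob")
grant = Permission(service, Action.READ, dataset, delegate="Carol",
                   valid_from=date(2020, 1, 1))

print(is_allowed([grant], service, Action.READ, dataset, "Carol", date(2019, 6, 1)))
print(is_allowed([grant], service, Action.READ, dataset, "Carol", date(2020, 6, 1)))
```

Attaching the validity interval to each permission keeps embargo periods and expiring rights in the metadata itself, so a machine can evaluate them without extra context.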
Hopefully most research can be open and free thus avoiding the need for such
access controls, but I fear that we shall have to provide them not least
because of increasing regulation (e.g. GDPR in Europe).
I am concerned that we do not re-invent the wheel. Years of research have
provided production-strength access control systems for large-scale IT
systems used all the time by commerce, industry, government. I believe the
problem is to relate the capabilities of these access control systems to the
FAIR principles using a form of logic notation representable in the metadata
as formal syntax and declared semantics.
Best
Keith
—————————————————————————-
—-
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***

T: +44 7768 446088
S: keithgjeffery
—————————————————————————-
——————————————————
The contents of this email are sent in confidence for the use of the
intended recipient only. If you are not one of the intended
recipients do not take action on it or show it to anyone else, but
return this email to the sender and delete your copy of it.
—————————————————————————-
——————————————————
From: barendmons=***@***.***-groups.org On Behalf Of Barend Mons
Sent: 18 April 2019 18:09
To: Ge Peng – NOAA Affiliate <***@***.***>; FAIR Data Maturity Model WG <***@***.***-groups.org>
Cc: andrea.perego <***@***.***>; ***@***.***; makxdekkers <***@***.***>; ***@***.***-groups.org; ***@***.***
Subject: Re: [fair_maturity] Workshop #2 Report
Yes, and as we support the statement “as open as possible, as closed as
necessary”, we could even put the most liberal license as default UPRI in
metadata templates, so that researchers would have to increase restrictions
deliberately if they have good reasons for that, which the funder will
accept (for instance patient privacy).
In that way, unlicensed and therefore, in some circles, re-useless data can
be avoided?
B
Prof. Barend Mons
Leiden University Medical Center
President of CODATA
GO FAIR international Support and coordination office
Mail: ***@***.***-fair.org
+31624879779
ORCID: 0000-0003-3934-0072
sent from my IPad
On 18 Apr 2019, at 19:01, Ge Peng – NOAA Affiliate <***@***.***> wrote:
Thanks, Barend, for bringing up the point of open-minded researchers. For
them and many, if not all, US federal agencies, not having any restriction
clauses on access and/or use in the metadata record or website means there is no
restriction, period. It has been on my mind, but I have not found a good time to
mention it until now.
However, requiring an explicit description of the license, access and use
rights may be a good practice.
Regards,
— Peng
On Thu, Apr 18, 2019 at 12:31 PM Barend Mons <***@***.***> wrote:
Good discussion (just landed from Nairobi, so was a bit out of the loop).
There too, access rights and the like were a hot discussion topic.
I concur with Maggie that the A and part of R implicitly require
access rights and licensing (the machine needs to know what it is technically
able to do with the data as well as what it is allowed to do). To me this
is all covered by the generic principle of rich, machine-actionable metadata
for all FAIR digital objects. Measuring can imho include any further
specification of that, and again, the requirements may vary per discipline,
as we stated at the meeting.
I know for instance from an earlier IMI project (Open PHACTS) that
pharma is unlikely to even touch a data set without a clear license, even
if the researcher is so open-minded that putting a license on it did not even
occur to her/him…
B
On 18 Apr 2019, at 18:17, andrea.perego <***@***.***> wrote:
Hi, Maggie.
I agree access rights are implicitly related to the principles you mention.
My concern is whether a conformance test could require compliance with
requirements not explicitly stated in the FAIR principles.
This is why I was considering the possibility of including a requirement on
access rights in the “Beyond FAIR” work which, ideally, can provide useful
feedback for a possible revision/extension of the FAIR principles, based on
usage and implementation evidence.
Andrea
—-
Andrea Perego, Ph.D.
Scientific / Technical Project Officer
European Commission DG JRC
Directorate B – Growth and Innovation
Unit B6 – Digital Economy
Via E. Fermi, 2749 – TP 262
21027 Ispra VA, Italy
https://ec.europa.eu/jrc/
—-
The views expressed are purely those of the writer and may
not in any circumstances be regarded as stating an official
position of the European Commission.
From: Margareta Hellström [mailto:***@***.***]
Sent: Thursday, April 18, 2019 6:00 PM
To: makxdekkers; ***@***.***-groups.org; Mercè; PEREGO Andrea (JRC-ISPRA); FAIR Data Maturity Model WG
Subject: Re: [fair_maturity] Workshop #2 Report
Hi Makx,
my immediate reaction to this was that the access rights are connected to
principle A1.2 (“the protocol allows for an authentication and authorization
procedure, where necessary”), but that doesn’t specifically address the issue
of how to inform the user (human or machine) of what those restrictions are.
I guess that would be covered by the “plurality of accurate and relevant
attributes” of R1 and possibly also the “domain-relevant community
standards” of R1.3.
Happy Easter!
/Maggie
——————
Associate Professor Margareta Hellström
ICOS Carbon Portal staff member
***@***.***
Lund University
Department of Physical Geography and Ecosystem Science
Sölvegatan 12, SE-22362 Lund, Sweden
Phone: +46-(0)46-2229683
_____
From: mail=***@***.***-groups.org <***@***.***-groups.org> on behalf of makxdekkers
Sent: Thursday, April 18, 2019 17:46
To: ***@***.***-groups.org; Mercè; andrea.perego; FAIR Data Maturity Model WG
Subject: Re: [fair_maturity] Workshop #2 Report
Many thanks, Andrea and Mercè, for bringing this up.
Should the expression of access rights, as opposed to the re-use licence, be
covered under principle R1.1 or somewhere else?
Kind regards, Makx
From: Crosas, Mercè <***@***.***>
Sent: 18 April 2019 15:58
To: andrea.perego <***@***.***>
Cc: makxdekkers <***@***.***>; ***@***.***-groups.org
Subject: Re: [fair_maturity] Workshop #2 Report
That’s correct – I agree that it would be best to separate license metadata
from access metadata. The Data Documentation Initiative (DDI) schema has this
distinction, for example.
Best,
Merce
On Thu, Apr 18, 2019 at 4:06 AM andrea.perego <***@***.***> wrote:
Dear Makx, dear colleagues,
Thanks for sharing the report.
I would like to raise an issue on point (4) in slide 20:
“(4) Does the licence permit access?”
Strictly speaking, licences are just about *use* conditions, and not about
who can access what. Of course, ad hoc licences happen to include access
provisions as well, but this does not apply to standard ones (such as the
Creative Commons suite).
For this reason, and to facilitate conformance testing, it would be
desirable to assess access conditions/restrictions separately from
the licence, and to require that this information be specified in a separate
metadata field.
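As a rough sketch of what such a separate assessment could look like, the check below treats the licence and the access rights as two independent metadata fields. The key names are modelled on the Dublin Core terms `license` and `accessRights` as used in DCAT; the flat dictionary record, the function name and the sample access-rights value are assumptions made for illustration only.

```python
def assess_rights_metadata(record: dict) -> dict:
    """Report separately whether a licence and access rights are stated."""

    def present(key: str) -> bool:
        # A field counts as stated only if it holds a non-empty string
        value = record.get(key)
        return isinstance(value, str) and value.strip() != ""

    return {
        "licence_stated": present("dct:license"),
        "access_rights_stated": present("dct:accessRights"),
    }


# Hypothetical record: standard licence given, access conditions stated separately
record = {
    "dct:license": "https://creativecommons.org/licenses/by/4.0/",
    "dct:accessRights": "restricted: registered users only",
}
print(assess_rights_metadata(record))
```

Because the two results are reported independently, a dataset with a standard licence but no stated access conditions can be flagged without the licence being reinterpreted.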
Best,
Andrea
—
Full post:
https://www.rd-alliance.org/group/fair-data-maturity-model-wg/post/works…
2-report
Manage my subscriptions: https://www.rd-alliance.org/mailinglist
Stop emails for this post:
https://www.rd-alliance.org/mailinglist/unsubscribe/62859
—
Mercè Crosas, Ph.D.
Harvard University’s Research Data Officer, Office of Vice Provost for
Research
Chief Data Science and Technology Officer, Institute for Quantitative Social
Science
***@***.*** |
@mercecrosas |
scholar.harvard.edu/mercecrosas
Ge Peng, PhD
Research Scholar
Cooperative Institute for Climate and Satellites – NC (CICS-NC)/NCSU at
NOAA’s National Centers for Environmental Information (NCEI)
Center for Weather and Climate (CWC)
151 Patton Ave, Asheville, NC 28801
+1 828 257 3009; ***@***.***
ORCID: http://orcid.org/0000-0002-1986-9115
Following CICS-NC on Facebook
April 19, 2019 at 8:13 am #130610

Edit Herczog
Member
April 19, 2019 at 8:30 am #130609

Barend Mons
Member

My two cents before Easter 🙂
> Is it important to include access rights and conditions in the evaluation of FAIRness?
It is not only important, it is a critical issue, covered entirely (at a high level of abstraction) in the FAIR guiding principles. Remember that FAIR is not a goal in itself; the key is the R: we want reusable workflows and data. Data and services without proper, machine-actionable metadata on access, license and instructions for proper reuse (both technically and legally) can be FAI and still Re-useless. (If this term is new to you, you have to admit you did not read the Cloudy, increasingly FAIR paper 🙂)
https://content.iospress.com/articles/information-services-and-use/isu824
> Under which area (F, A, I, R, beyond FAIR) or principle should it be evaluated?
As I argued before, both under A and particularly R, but I am a bit wary of the term ‘beyond’ FAIR, as it suggests that the principles do not cover certain aspects. So far, no one has come to me with aspects that are not covered (again, at a high level of abstraction) by F, A, I or R, especially R. For instance, legal aspects, quality of data, etc. are all related to accessibility under well-defined conditions and ‘meaningful’ reusability. One example: totally fabricated data that are totally FAI can be either R or not, depending on the use case, for example for testing a model of software performance or checking VMs for their promised actions.
So let’s indeed concentrate this group, as Edit keeps emphasizing (correctly, imho), on WHAT needs to be measured to judge, in the end, the R level of data and services, and worry about the how/details of such measurements later and in other contexts.
April 19, 2019 at 8:56 am #130608

Edit Herczog
Member
April 19, 2019 at 8:56 am #130607

Keith Jeffery
Member
April 19, 2019 at 3:51 pm #130605

RDA Admin
Organizer
April 20, 2019 at 7:27 am #130604

Keith Jeffery
Member
April 22, 2019 at 10:20 am #130603

Keith Jeffery
Member

Makx –
I have now had time to re-read all the material from the last RDA Plenary session (unfortunately I had a clash with another meeting). I would like to make a few comments.
1. Beyond FAIR. I believe the items here are all covered by the existing FAIR principles, however not necessarily one for one. (This raises another question about the intersections of concepts in the FAIR principles – see below). To take the Beyond FAIR list from the GitHub:
* Data Repository: a dataset may reside in a repository but not necessarily. Consider streamed data from sensors or satellite. Some colleagues get excited about repository quality or ‘trusted repositories’; from my experience the only real criteria for FAIR use by a user concern the relevance and quality of the dataset for their purpose at that time (assuming no limitations on (re-)use) – and this can only be judged from the rich metadata describing the dataset.
* Curation and maintenance: the DCC lifecycle (and associated DMP) is useful here. Curation involves difficult decisions based on the value of the dataset and the cost of curation. The real problem is the lack of economic models that deal with (potentially) infinite time. However, the FAIR requirement for an eternally persistent ID for the dataset implies curation (if a decision is taken to assign, manually or automatically, a persistent identifier, an implicit decision is taken to preserve the dataset). Curation implies a DMP, repository (possibly more than one), appropriate metadata, management responsibility, ownership responsibility, licensing, usage rules/constraints… all largely covered by what I understand by rich metadata.
* Open Data: this is a very difficult term to define. Many conflate open and free and confuse with open access. Open data strictly concerns government data made available to citizens although the term is used more widely and is generally used meaning open data, open access, free access (possibly subject to licence conditions such as acknowledgement). I believe the open data concept is covered by the FAIR principles.
* Data Quality: as I suggest under Data Repository above, in my experience quality is determined by the end-user in the sense of appropriate quality for the current purpose. I suggest there is no absolute measure but only relative (to the purpose). Data quality is then determined by the intersection of the user requirement and the dataset quality as described by rich metadata. This is likely to include properties/attributes such as precision and accuracy although in some sciences the reputation of the experimental team is sufficient. The equipment used for data collection and the data collection/correction/summarisation method may also be relevant.
* Others: this is of course not yet defined. I would hope that all could be accommodated by the existing FAIR principles because they are relatively abstract; as always the ‘devil is in the detail’ and this will be specified by interpretation towards concreteness of the FAIR principles.
I am hopeful that as other potential ‘beyond FAIR’ concepts arise they will increase our understanding of the several more concrete interpretations of each of the FAIR principles.
2. Intersections in FAIR principles:
* I believe there is some confusion in the FAIR principles concerning data and metadata. Many of the principles start with (meta)data. While I subscribe to the principle that metadata is also data (library catalog cards are metadata to a researcher finding a particular paper but data to a librarian counting papers on the human genome) I fear the FAIR principles are not clear on what should be a property of the thing referred to (data) and what should be a property of the referring thing (metadata). The obvious example is persistent identifier (I prefer UUPID): while both a dataset and the metadata describing it should have a UUPID, A1 is relevant for data (where the UUPID is an attribute in the metadata as specified in F4) but not really for metadata, where the retrieval is usually by user values for metadata attributes.
* I believe F2 and R1 are – for metadata – really the same principle and although different sets of attributes may be used for F and R there is likely to be a large intersection. For example R1.2 provenance metadata may well be highly relevant for a user finding appropriate (relevance, quality) datasets.
* It seems to me that I2 and I3 concern aspects of I1 and could equally be I1.1 (a formal language should have its semantics defined) and I1.2 (a formal language should support qualified references; this is, for example, the advantage of RDF over XML).
I raise these concerns now because – as we define progressively the metrics for assessing FAIRness – we have to be clear on exactly what is being measured.
Thanks for your patience!
Best
Keith
April 22, 2019 at 12:30 pm #130602

Ge Peng
Member

I agree with Keith that we need to be clear on exactly what is being measured
before defining the metrics to assess FAIRness.
My two cents on quality and accessibility:
The quality information about metadata and data may be covered under “rich
metadata”. Unfortunately, I believe that “rich metadata” falls short in
explicitly addressing data and information quality. Without good data and
information quality, the data may not be useful. It does not make any sense
to address its re-use. The question is: should this WG explicitly address
quality of data and metadata in the metrics?
It has been mentioned that FAIR does not mean OPEN. It is extremely hard
for me to see how one can satisfy “FAIR”, especially the accessibility,
interoperability and reusability of data, without the data being available and
obtainable. Am I missing something? If so, should the FAIR principles be
amended?
Regards,
— Peng
April 22, 2019 at 1:08 pm #130600

Keith Jeffery
Member

Peng –
Many thanks for responding to my email.
I agree quality (and relevance) is important and I believe the group has to answer the question on what FAIRness metrics apply. The paradox I was raising is that in my experience there is no absolute quality (or relevance) but only relative to the requirements of the (re-)using user – hence a generic FAIRness metric may be unreachable.
I agree that FAIR somehow implies openness; however, a DO (with its metadata) can be FAIR even if, for example, it is embargoed (for prior publication) but available after some defined (and documented) time. The only FAIR principle to mention openness is A1.1, which concerns the protocol used to retrieve by identifier, yet A1.2 allows for authentication (of the user or agent) and authorisation (permission to access). I suspect the ‘spirit’ of FAIR was open and free but practical realities required these wordings of the principles.
Best
Keith
——————————————————————————–
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***
T: +44 7768 446088
S: keithgjeffery
———————————————————————————————————————————-
The contents of this email are sent in confidence for the use of the
intended recipient only. If you are not one of the intended
recipients do not take action on it or show it to anyone else, but
return this email to the sender and delete your copy of it.
———————————————————————————————————————————-
From: Ge Peng – NOAA Affiliate
Sent: 22 April 2019 13:30
To: Keith Jeffery
Cc: ***@***.***-groups.org; Ge Peng
Subject: Re: [fair_maturity] Workshop #2 Report
Agree with Keith on we need to be clear on exactly what is being measured before defining the metrics to assess FAIRness.
My two cents on quality and accessibility:
The quality information about metadata and data may be covered under “rich metadata”. Unfortunately, I believe that “rich metadata” falls short in explicitly addressing data and information quality. Without good data and information quality, the data may not be useful. It does not make any sense to address its re-use. The question is: should this WG explicitly address quality of data and metadata in the metrics?
It has been mentioned that FAIR does not mean OPEN. It is extremely hard for me to see how one can satisfy “FAIR”, especially, accessible, interoperable and reusable of data, without having its data available and obtainable. Am I missing something? If so, should the FAIR principles be amended?
Regards,
— Peng
On Mon, Apr 22, 2019 at 6:20 AM ***@***.*** wrote:
Makx –
I have now had time to re-read all the material from the last RDA Plenary session (unfortunately I had a clash with another meeting). I would like to make a few comments.
1. Beyond FAIR. I believe the items here are all covered by the existing FAIR principles, however not necessarily one for one. (This raises another question about the intersections of concepts in the FAIR principles – see below). To take the Beyond FAIR list from the GitHub:
* Data Repository: a dataset may reside in a repository but not necessarily. Consider streamed data from sensors or satellite. Some colleagues get excited about repository quality or ‘trusted repositories’; from m experience the only real criteria for FAIR use by a user concern the relevance and quality of the dataset for their purpose at that time (assuming no limitations on (re-)use) – and this can only be judged from the rich metadata describing the dataset.
* Curation and maintenance: the DCC lifecycle (and associated DMP) is useful here. Curation involves difficult decisions based on the value of the dataset and the cost of curation. The real problem is the lack of economic models that deal with (potentially) infinite time. However, the FAIR requirement for an eternally persistent ID for the dataset implies curation (if a decision is taken to assign (manually or automatically) a persistent identifier an implicit decision is taken to preserve the dataset. Curation implies a DMP, repository (possibly more than one), appropriate metadata, management responsibility, ownership responsibility, licensing, usage rules/constraints… all largely covered by what I understand by rich metadata.
* Open Data: this is a very difficult term to define. Many conflate open and free and confuse with open access. Open data strictly concerns government data made available to citizens although the term is used more widely and is generally used meaning open data, open access, free access (possibly subject to licence conditions such as acknowledgement). I believe the open data concept is covered by the FAIR principles.
* Data Quality: as I suggest under (a) in my experience quality is determined by the end-user in the sense of appropriate quality for the current purpose. I suggest there is no absolute measure but only relative (to the purpose). Data quality is then determined by the intersection of the user requirement and the dataset quality as described by rich metadata. This is likely to include properties/attributes such as precision and accuracy although in some sciences the reputation of the experimental team is sufficient. The equipment used for data collection and the data collection/correction/summarisation method may also be relevant.
* Others: this is of course not yet defined. I would hope that all could be accommodated by the existing FAIR principles because they are relatively abstract; as always the ‘devil is in the detail’ and this will be specified by interpretation towards concreteness of the FAIR principles.
I am hopeful that as other potential ‘beyond FAIR’ concepts arise they will increase our understanding of the several more concrete interpretations of each of the FAIR principles.
1. Intersections in FAIR principles:
* I believe there is some confusion in the FAIR principles concerning data and metadata. Many of the principles start with (meta)data. While I subscribe to the principle that metadata is also data (library catalog cards are metadata to a researcher finding a particular paper but data to a librarian counting papers on the human genome) I fear the FAIR principles are not clear on what should be a property of the thing referred to(data) and what should be a property of the referring thing (metadata). The obvious example is persistent identifier (I prefer UUPID): while both a dataset and the metadata describing it should have a UUPID, A1 is relevant for data (where the UUPID is an attribute in the metadata as specified in F4) but not really for metadata where the retrieval is usually by user values for metadata attributes.
* I believe F2 and R1 are – for metadata – really the same principle and although different sets of attributes may be used for F and R there is likely to be a large intersection. For example R1.2 provenance metadata may well be highly relevant for a user finding appropriate (relevance, quality) datasets.
* It seems to me that I2 and I3 concern aspects of I1 and could equally be I1.1 (a formal language should have its semantics defined) and I1.2 (a formal language should support qualified references, this is, for example, the advantage of RDF over XML).
I raise these concerns now because – as we define progressively the metrics for assessing FAIRness – we have to be clear on exactly what is being measured.
Thanks for your patience!
Best
Keith
——————————————————————————–
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***
T: +44 7768 446088
S: keithgjeffery
———————————————————————————————————————————-
The contents of this email are sent in confidence for the use of the
intended recipient only. If you are not one of the intended
recipients do not take action on it or show it to anyone else, but
return this email to the sender and delete your copy of it.
———————————————————————————————————————————-
Ge Peng, PhD
Research Scholar
Cooperative Institute for Climate and Satellites – NC (CICS-NC)/NCSU at
NOAA’s National Centers for Environmental Information (NCEI)
Center for Weather and Climate (CWC)
151 Patton Ave, Asheville, NC 28801
+1 828 257 3009; ***@***.***
ORCID: http://orcid.org/0000-0002-1986-9115
Following CICS-NC on Facebook
April 22, 2019 at 1:31 pm #130599

Ge Peng
Member

Thank you, Keith, very much for your detailed and helpful reply.
Regards,
— Peng
April 22, 2019 at 2:45 pm #130598

Maggie Hellström
Member

Hi Keith, Peng, et al,
I found your comments more thought-provoking than provocative – and in any case, it is only by sharing ideas and then analysing and discussing disagreements that true progress can be made… 😉
A few quick comments:
1) On PIDs as metadata attributes: Well, you are right in that PIDs are assigned to a DO, and therefore in some sense the PID (string) becomes part of the metadata about that object. I also agree that F1 relates to actions by the data curator (which could be a repository and/or the data producer), which is then responsible for uploading correct & relevant PID kernel information.
But on the other hand, that particular, rather limited part of the metadata (at least the URI pointing to the location of the DO) is stored in a location – at the PID registry – that may or may not be (or remain) under the direct control of the curator of the data and the other metadata. In fact, “anyone” – not just the data curator – can in principle assign a persistent identifier to any DO, after which “they” are in control of the PID-related metadata. This raises interesting questions about who has the rights and/or possibility to update the PID registry kernel information, and how to ensure these capabilities are sustainable.
2) Metadata as data: Well, one man’s metadata is another man’s data – so the context has to decide whether a piece of information is ancillary or primary. If we are following F1 strictly, then also – as prescribed by F3 – the block of information (key-value pairs, triple assertions or the like) that is considered to collectively form the rich metadata associated with a DO should be assigned a PID of its own – probably a “collection PID” which links to the PIDs of the individual source(s) and/or search queries that need to be executed to retrieve the said metadata information on the DO.
This seems to indicate that we (i.e. the data producers and the repositories) need to become much more liberal with assigning PIDs than is currently the case – which means that there is an urgent need for domain- and/or community-specific recommendations on e.g. what types of (meta)data objects should be PID:ed and at what granularity. Without such at least domain-related standards (or best practices) for metadata, it is very difficult to define suitable (and fair) test criteria and scoring rules for e.g. F1 and F3.
I also note that, at least from an end user perspective, the R principles are reliant on the ability to efficiently link together a DO with all its different types of metadata. The old traditional concept of a single metadata catalogue, holding all relevant metadata (in whatever form) and hosted by the repository/archive institution that also holds the data, is becoming obsolete very fast – which raises concerns on the accuracy of the “rich metadata”, and stresses the need for some kind of objective assessment of not only the degree of “completeness” of metadata (e.g. looking at how many of the fields of an ISO standard are filled in) but also at what’s written in there. Should there be an R-related metric for this as well?
3) Examples of non-traditional metadata sources include stores for provenance information in separate central registries (to make it faster to search), the RDA recommendation to extend the metadata stored in the PID registry to support “pre-filtering” of DOs depending on types already at the PID resolution step, the trend to use “external” catalogues such as DataCite to store more and more non-administrative metadata, and finally initiatives to set up independent “annotation stores” for storing e.g. end user comments on data usability. All these represent a variety of “metadata sources”, with various degrees of trustability and sustainability associated with them.
As long as care is taken to maintain an authoritative “master copy” of all the metadata that is originating from the original data producers, then this plurality is a good thing that can give data-intensive science a big boost. But at the same time, I can see many issues, especially with smaller-scale research projects and groups – these often lack adequate data management support or resources, and therefore resort to storing the only existing copies of both their data and metadata using the cheapest possible means available to them. That could give them a high FAIRness compliance score for their DMP evaluation, but it won’t be sustainable in the long run.
OK, sorry for this long outpouring of ideas – I’ll go back to lurking mode now!
Cheers,
Maggie
——————
Associate Professor Margareta Hellström
ICOS Carbon Portal staff member
***@***.***
Lund University
Department of Physical Geography and Ecosystem Science
Sölvegatan 12, SE-22362 Lund, Sweden
Phone: +46-(0)46-2229683
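The “completeness” assessment mentioned in the message above (counting how many fields of a metadata standard are filled in) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a proposed indicator: the field list is hypothetical and is not taken from any ISO standard, and the record is assumed to be a plain dictionary.

```python
# Hypothetical list of expected fields; a real assessment would take these
# from the metadata standard chosen by the community.
EXPECTED_FIELDS = ["title", "creator", "abstract", "licence", "provenance"]


def is_filled(value):
    """Treat None, blank strings and empty containers as not filled in."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def completeness(record, expected_fields=EXPECTED_FIELDS):
    """Return the fraction of expected fields that are filled in."""
    if not expected_fields:
        raise ValueError("expected_fields must not be empty")
    filled = sum(1 for name in expected_fields if is_filled(record.get(name)))
    return filled / len(expected_fields)


if __name__ == "__main__":
    record = {
        "title": "Flux measurements",
        "creator": "   ",     # blank, so not counted
        "abstract": "n/a",    # counted, although it says nothing useful
        "provenance": [],     # empty, so not counted
    }
    print(completeness(record))
```

The `"n/a"` abstract counts as filled, which illustrates the concern raised above: a field count says nothing about the accuracy of what is written in the fields.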
________________________________
From: keith.jeffery=***@***.***-groups.org on behalf of ***@***.***
Sent: Monday, April 22, 2019 15:08
To: ***@***.***-groups.org
Subject: Re: [fair_maturity] Workshop #2 Report
Peng –
Many thanks for responding to my email.
I agree quality (and relevance) is important and I believe the group has to answer the question on what FAIRness metrics apply. The paradox I was raising is that in my experience there is no absolute quality (or relevance) but only quality relative to the requirements of the (re-)using user – hence a generic FAIRness metric may be unreachable.
I agree that FAIR somehow implies openness; however, a DO (with its metadata) can be FAIR even if, for example, it is embargoed (for prior publication) but available after some defined (and documented) time. The only FAIR principle to mention openness is A1.1, which concerns the protocol used to retrieve by identifier, yet A1.2 allows for authentication (of the user or agent) and authorisation (permission to access). I suspect the ‘spirit’ of FAIR was open and free but practical realities required these wordings of the principles.
Best
Keith
From: Ge Peng – NOAA Affiliate
Sent: 22 April 2019 13:30
To: Keith Jeffery
Cc: ***@***.***-groups.org; Ge Peng
Subject: Re: [fair_maturity] Workshop #2 Report
I agree with Keith that we need to be clear on exactly what is being measured before defining the metrics to assess FAIRness.
My two cents on quality and accessibility:
The quality information about metadata and data may be covered under “rich metadata”. Unfortunately, I believe that “rich metadata” falls short in explicitly addressing data and information quality. Without good data and information quality, the data may not be useful, and it then makes no sense to address its re-use. The question is: should this WG explicitly address quality of data and metadata in the metrics?
It has been mentioned that FAIR does not mean OPEN. It is extremely hard for me to see how one can satisfy “FAIR”, especially the accessibility, interoperability and reusability of data, without having the data available and obtainable. Am I missing something? If so, should the FAIR principles be amended?
Regards,
— Peng
On Mon, Apr 22, 2019 at 6:20 AM ***@***.*** wrote:
Makx –
I have now had time to re-read all the material from the last RDA Plenary session (unfortunately I had a clash with another meeting). I would like to make a few comments.
1. Beyond FAIR. I believe the items here are all covered by the existing FAIR principles, however not necessarily one for one. (This raises another question about the intersections of concepts in the FAIR principles – see below). To take the Beyond FAIR list from the GitHub:
* Data Repository: a dataset may reside in a repository but not necessarily. Consider streamed data from sensors or satellites. Some colleagues get excited about repository quality or ‘trusted repositories’; from my experience the only real criteria for FAIR use by a user concern the relevance and quality of the dataset for their purpose at that time (assuming no limitations on (re-)use) – and this can only be judged from the rich metadata describing the dataset.
* Curation and maintenance: the DCC lifecycle (and associated DMP) is useful here. Curation involves difficult decisions based on the value of the dataset and the cost of curation. The real problem is the lack of economic models that deal with (potentially) infinite time. However, the FAIR requirement for an eternally persistent ID for the dataset implies curation: if a decision is taken to assign (manually or automatically) a persistent identifier, an implicit decision is taken to preserve the dataset. Curation implies a DMP, repository (possibly more than one), appropriate metadata, management responsibility, ownership responsibility, licensing, usage rules/constraints… all largely covered by what I understand by rich metadata.
* Open Data: this is a very difficult term to define. Many conflate open and free and confuse them with open access. Open data strictly concerns government data made available to citizens, although the term is used more widely and is generally used meaning open data, open access, free access (possibly subject to licence conditions such as acknowledgement). I believe the open data concept is covered by the FAIR principles.
* Data Quality: as I suggest under (a) in my experience quality is determined by the end-user in the sense of appropriate quality for the current purpose. I suggest there is no absolute measure but only relative (to the purpose). Data quality is then determined by the intersection of the user requirement and the dataset quality as described by rich metadata. This is likely to include properties/attributes such as precision and accuracy although in some sciences the reputation of the experimental team is sufficient. The equipment used for data collection and the data collection/correction/summarisation method may also be relevant.
* Others: this is of course not yet defined. I would hope that all could be accommodated by the existing FAIR principles because they are relatively abstract; as always the ‘devil is in the detail’ and this will be specified by interpretation towards concreteness of the FAIR principles.
I am hopeful that as other potential ‘beyond FAIR’ concepts arise they will increase our understanding of the several more concrete interpretations of each of the FAIR principles.
2. Intersections in FAIR principles:
* I believe there is some confusion in the FAIR principles concerning data and metadata. Many of the principles start with (meta)data. While I subscribe to the principle that metadata is also data (library catalog cards are metadata to a researcher finding a particular paper but data to a librarian counting papers on the human genome), I fear the FAIR principles are not clear on what should be a property of the thing referred to (data) and what should be a property of the referring thing (metadata). The obvious example is the persistent identifier (I prefer UUPID): while both a dataset and the metadata describing it should have a UUPID, A1 is relevant for data (where the UUPID is an attribute in the metadata, as specified in F3) but not really for metadata, where retrieval is usually by user-supplied values for metadata attributes.
* I believe F2 and R1 are – for metadata – really the same principle, and although different sets of attributes may be used for F and R there is likely to be a large intersection. For example, R1.2 provenance metadata may well be highly relevant for a user finding appropriate (relevance, quality) datasets.
* It seems to me that I2 and I3 concern aspects of I1 and could equally be I1.1 (a formal language should have its semantics defined) and I1.2 (a formal language should support qualified references; this is, for example, the advantage of RDF over XML).
I raise these concerns now because – as we define progressively the metrics for assessing FAIRness – we have to be clear on exactly what is being measured.
Thanks for your patience!
Best
Keith
April 23, 2019 at 8:02 am #130595

Keith Jeffery
Member
April 23, 2019 at 9:48 am #130593

Barend Mons
Member

I think a lot of what we discuss here is ‘how’ rather than ‘what’.
So maybe we should start a separate thread, because the discussion as such is very valuable.
I also address the UPRI/Handle issue a bit in http://www.data-intelligence-journal.org/p/10/1/
I believe a prefix registry is scalable, and with automatically generated (hash, for instance) suffixes, we should be fine.
B
Prof. Barend Mons
Leiden University Medical Center
President of CODATA
GO FAIR International Support and Coordination Office
Visiting address:
Poortgebouw N-01
Rijnsburgerweg 10
2333 AA Leiden
The Netherlands
E-mail: ***@***.***-fair.org
Mobile: +31 6 24879779
Skype: dnerab
http://www.go-fair.org
ORCID: 0000-0003-3934-0072
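The idea above of combining a registered prefix with an automatically generated hash suffix can be illustrated with a short Python sketch. It is an assumption-laden illustration only: the prefix `99.EXAMPLE`, the function name and the suffix length are hypothetical, and the sketch does not describe how the Handle system or any actual registry mints identifiers, nor the approach taken in the linked article.

```python
import hashlib


def mint_identifier(prefix, content, suffix_length=16):
    """Build 'prefix/suffix', where the suffix is a truncated SHA-256 hex digest.

    `content` must be bytes. Truncating the digest shortens the identifier
    but raises the probability of collisions.
    """
    if not 1 <= suffix_length <= 64:
        raise ValueError("suffix_length must be between 1 and 64")
    digest = hashlib.sha256(content).hexdigest()
    return f"{prefix}/{digest[:suffix_length]}"


if __name__ == "__main__":
    # Hypothetical prefix; a real one would be assigned by a prefix registry.
    print(mint_identifier("99.EXAMPLE", b"station=A;year=2019;variable=co2"))
```

Because the suffix is derived from the content, the same bytes always yield the same identifier and any change to the bytes yields a different one. Whether that behaviour is wanted for a given kind of object is a policy question outside this sketch.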
April 23, 2019 at 11:59 am #130592

RDA Admin
Organizer

Excellent point Barend.
I would like to suggest possible ways to handle these threads.
First, I am hoping that members are willing to contribute to the collaborative document at [1] by proposing specific indicators and maturity levels for the FAIR principles. Furthermore, on specific points, like what is ‘persistence’ and what is ‘rich metadata’, such indicators could include whether or not data is identified with DOIs or other commonly used identifier schemes, and whether metadata is provided conformant to DataCite kernel or some other standard set – it would be good to point to common or best practice in order to make sure that we’re not re-inventing any wheels.
Second, for any further discussion, issues could be raised on the mailing list. In that case, I would suggest that an e-mail (a) is about one issue and (b) has a sensible subject line, so that it is easier to follow the thread. Maybe people could also cut off some of the long tail of messages in the replies.
It is also possible to raise issues on GitHub at [2]. GitHub makes it easier to follow individual discussions and see how discussions lead to consensus, but if people are more comfortable with e-mail that is OK too. The editorial team will keep an eye on both e-mail and GitHub and will try to summarise every now and again how the discussions are progressing.
Many thanks, Makx.
[1] https://docs.google.com/spreadsheets/d/1gvMfbw46oV1idztsr586aG6-teSn2cPW…
[2] https://github.com/RDA-FAIR/FAIR-data-maturity-model-WG/issues
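As an illustration of the kind of indicator suggested above (whether data is identified with DOIs or other commonly used identifier schemes), here is a minimal Python sketch. The function names and the regular expressions are assumptions made for this example; the patterns are rough syntactic heuristics, not the official grammar of either scheme, and the check does not test whether an identifier actually resolves.

```python
import re

# Heuristic patterns, for illustration only (assumptions of this sketch).
SCHEME_PATTERNS = {
    "doi": re.compile(
        r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE
    ),
    "handle": re.compile(
        r"^(?:https?://hdl\.handle\.net/|hdl:)[^/\s]+/\S+$", re.IGNORECASE
    ),
}


def identifier_scheme(identifier):
    """Return the name of the first scheme whose pattern matches, else None."""
    candidate = identifier.strip()
    for scheme, pattern in SCHEME_PATTERNS.items():
        if pattern.match(candidate):
            return scheme
    return None


def uses_common_scheme(identifier):
    """Binary indicator: does the identifier look like a known scheme?"""
    return identifier_scheme(identifier) is not None


if __name__ == "__main__":
    for value in ["10.1234/abc.5678", "hdl:11111/xyz", "my-local-id-42"]:
        print(value, identifier_scheme(value))
```

A syntactic check like this could at most feed a low maturity level; persistence and resolvability would need separate evidence.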