Health Research Performing Organisations (HRPOs) FAIR Guidelines
Raising FAIRness in Health Data and health research performing organisations (HRPOs) WG
Group co-chairs: Shanmugasundaram Venkataraman, Celia Alvarez-Romero, Kristan Kang, Anupama Gururaj
Recommendation title: Health Research Performing Organisations (HRPOs) FAIR Guidelines
Authors: Celia Alvarez-Romero, Anupama Gururaj, Kristan Kang, Shanmugasundaram Venkataraman
Siloed datasets in many regions of the world currently hinder the full exploitation of health research data for societal benefit. Although many datasets have been made open, openness does not necessarily make data easy to discover, share or reuse, whether because of poor data management or because of socioeconomic, political and/or legal barriers. Legal and ethical barriers to implementing FAIR policies in health research performing organisations (HRPOs) limit the reproducibility of research, the variety and volume of data available to make research more robust, and the scope for research and scientific discovery through data sharing and reuse.
The societal implications are manifold and can have far-reaching consequences for health emergencies that develop rapidly and extensively across borders. One obvious example is the Covid crisis, where international cooperation on an unprecedented scale allowed rapid sequencing and epidemiological studies to be carried out, resulting in a body of published work that grew faster than for any other single disease in history. There are of course several other diseases that could be targeted for concerted action using this model, and the volumes of data that each produces would need to be managed carefully. The FAIR principles are still in their relative infancy, but they will provide an answer to the challenges of proper research data management (RDM) for big data that will have significant impacts on society. As an indicator, the increase in secondary use of data, in relation to the publication and sharing of FAIR data, could be measured once FAIR policies have been implemented and compared with practice prior to their implementation.
In terms of the UN’s Sustainable Development Goals, and especially the third SDG of Good Health and Well-being, adopting the FAIR principles would move society towards greater equality, especially for low- and middle-income countries (LMICs), through better data access and sharing. At present, there is a large imbalance between high-income countries (HICs) and LMICs in the level of healthcare available. This partly contributes to the disparities seen in levels of health research, which also stem from a lack of funding and infrastructure. However, in today’s digital world, LMICs could also benefit from shared resources, and this can be achieved by applying the same FAIR principles that are being widely adopted in HICs.
The vast amounts of money spent every year on health research, in both the public and private sectors, make an economic case for managing the resulting data in a way that allows their reuse. The case is perhaps even stronger for publicly funded research. Several billion dollars of taxpayer-funded research is conducted every year globally, and this necessitates proper accountability. Indeed, there is a previously described data deluge [ref] that is growing exponentially, and publicly funded research is a large contributor to it. The FAIR principles provide a framework in which this accountability can be fulfilled to a large extent.
A key reference here is the report issued by the European Union on the cost of not having FAIR data. Its main conclusions are that: i) the cost of not having FAIR data is approximately €10.2bn per year for the EU; ii) in addition, evidence from the open data economy suggests that the impact of FAIR on innovation could add another €16bn to this minimum estimate; and iii) together these make a total of at least €26.2bn per year.
Unnecessary duplication of research is another cost that could be avoided by implementing the FAIR principles and improving the reproducibility of research studies. Allowing data to be reused and shared, and making them easier to discover, will reduce the need for and likelihood of repeating previous research, and thus reduce the associated economic outlay. Scientific advances could also be made more quickly, as the effort saved becomes available for new work. The impact will perhaps be greatest in LMICs, where reuse of data could yield solutions to local health problems.
A practical example of the effort and resources an HRPO spends with or without the FAIR principles relates to data extraction and collection. Extracting and collecting data from electronic health records (EHRs) and other kinds of healthcare sources is not trivial and requires considerable conceptual and technical effort, mainly because of: (i) the complexity of the raw data (source EHRs are typically very complex, with information spread across several tables in the source databases), (ii) the free text used in some fields of the raw source data, and (iii) differences in the nature of the raw data between sources. To address this complexity, researchers with different profiles are often involved in these tasks, including experts in Natural Language Processing (NLP) techniques to handle the free-text fields of the EHRs and data scientists to analyse each raw source dataset in depth.
As an indicator of the effort saved, the increase in data reuse could therefore be measured once FAIR policies have been implemented and compared with practice prior to their implementation.
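The before-and-after comparison described here could be computed along the following lines. This is only a minimal sketch under stated assumptions: the guidelines do not define a specific metric, so the choice of "recorded reuse events per published dataset" as the indicator, the function names, and all the counts are hypothetical and chosen purely for illustration.

```python
def reuse_rate(reuse_events: int, datasets_published: int) -> float:
    """Average number of recorded reuse events per published dataset."""
    if datasets_published <= 0:
        raise ValueError("datasets_published must be positive")
    return reuse_events / datasets_published


def relative_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a fraction."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before


# Hypothetical counts for one HRPO, for illustration only (not real data).
baseline = reuse_rate(reuse_events=12, datasets_published=60)  # before FAIR policy
followup = reuse_rate(reuse_events=45, datasets_published=90)  # after FAIR policy
change = relative_increase(baseline, followup)
print(f"Reuse per dataset changed by {change:.0%}")
```

Normalising by the number of published datasets, rather than comparing raw reuse counts, is one possible design choice: it keeps the indicator from rising simply because an organisation published more data after adopting a FAIR policy.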
Adoption and application of the FAIR principles to research data has grown significantly in the few years that they have existed. They have been embraced in many quarters and have paved the way to creating a level playing field for data reuse. Nevertheless, they also pose challenges in some areas of research where reuse may not be an automatic right due to issues of confidentiality, privacy, commercial interests and sensitivities in general.
One sector of research where this is particularly evident is health research. This proposed working group aims to address global disparities in uptake of the FAIR principles in health research and within health research performing organisations (HRPOs). The work will build upon existing and ongoing work at the European level (please see below the attached document "D2.3 Guidelines for implementing FAIR open data policy in health research"), which has itself identified differences in policies within that region. The aim is to expand on this work to create a global analysis of policies, and then to draw on their commonalities to propose a set of guidelines (RDA outputs or recommendations) that HRPOs can use in their local contexts to address FAIRification of their research data.
Author: Francis P. Crawley
Date: 21 Nov, 2022
These guidelines are nicely developed. They are clear and light, which should make them easy for an organisation to use. I have some comments on the title: 'Health Research Performing Organisations (HRPOs) FAIR Guidelines'. The title is not well considered.
1. These guidelines are not all about FAIR. Even the majority of what is written is not about FAIR. Related to this, the term 'FAIR data policy' under Principles: 1. is either misleading or incorrect or both. There exist 'FAIR Data Principles' and it is in this framework that FAIR is best considered and discussed. You are presenting something of a policy document that includes implementing these FAIR Data Principles, but also includes much more (that is equally of importance). (You indeed say very little about how to implement the FAIR Data Principles into the data stewardship, but no need: there are other more extensive guidances for this).
2. 'Health research' is usually thought of as something like 'public health research'. Your title would be clearer, and have broader applicability, if you changed it to 'Health-related Research'.
Thus, the title I would suggest for your guidelines is 'Developing Data Policy for Health-related Research Organisations'. You will then need to rewrite your guidance slightly, but it will work better and be clearer.
Date: 21 Mar, 2023
Thank you very much for your feedback, and apologies for this delayed response. Please see our revised version of the guidelines, in particular the title and the first principle, which we hope address your first comment. We agree with your analysis of what we are trying to achieve but want to emphasise that raising FAIR levels in the health research domain is the main objective.
Re. the term “health research”: we hope the corrections noted above resolve this issue. Thank you.