Health Data IG Charter

12 Jan 2016

RDA Health Data Interest Group Charter - Revised Charter taking into account the TAB recommendations


Name of Proposed Interest Group: Health Data Interest Group


Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

Here follows a Proposal for an Interest Group on “Health Data” (HD-IG), as a long-term initiative in the framework of RDA. It follows a rather successful BoF Session during the 6th RDA Plenary Meeting in Paris, which was attended by over 35 researchers and professionals from diverse backgrounds, who discussed several relevant issues, expressed significant interest in forming the proposed Interest Group, and helped shape its focus as presented here. The Interest Group will fill a gap in the RDA subject map formed by its current WGs and IGs, as at the moment, there is no RDA group focusing on the intricacies of Health Data, especially as it relates to privacy and security issues in Healthcare. Establishment of this IG will also enrich the set of communities involved in and contributing to RDA, as there are several professions as well as research disciplines that revolve around Health Data.

This proposal to form the HD-IG is rooted in a long series of European, international, and national projects in the area of biomedical informatics, in which the proposers have been involved in the past decade. These include the projects Health-e-Child, Sim-e-Child, MD-Paedigree, p-medicine, Cardioproof, Avicenna and others.
Different techniques of de-identification (pseudonymisation and anonymisation) were adopted and ad-hoc privacy guidelines were developed during these projects, not only to meet the requirements of the legislation in force but also to face future challenges in the possible exploitation of the projects. The scientific and practitioner community built up during these projects is quite extensive, and several of its members are expected to join HD-IG.
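To make the distinction between the two de-identification techniques concrete, the following is a minimal sketch in Python and not the method used in any of the projects named above. It assumes a hypothetical record layout (the fields name, patient_id and birth_date are illustrative only). Pseudonymisation is shown as a keyed hash, so that whoever holds the key can still link records of the same patient; anonymisation is shown as removal of direct identifiers plus coarsening of the birth date. Dropping direct identifiers alone does not guarantee anonymity, since combinations of remaining fields may still identify a person.

```python
import hashlib
import hmac

# Hypothetical field names; a real project would define its own schema.
DIRECT_IDENTIFIERS = {"name", "patient_id"}


def pseudonymise(record, secret_key):
    """Replace direct identifiers with a keyed hash of patient_id.

    The same patient_id and key always give the same pseudonym, so records
    stay linkable for whoever holds the key.
    """
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(secret_key, record["patient_id"].encode("utf-8"), hashlib.sha256)
    out["pseudonym"] = digest.hexdigest()
    return out


def anonymise(record):
    """Drop direct identifiers and coarsen the birth date (YYYY-MM-DD) to a year."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "birth_date" in out:
        out["birth_year"] = out.pop("birth_date")[:4]
    return out


if __name__ == "__main__":
    example = {
        "name": "Jane Doe",
        "patient_id": "P-0001",
        "birth_date": "1980-05-17",
        "diagnosis": "I10",
    }
    print(pseudonymise(example, b"illustrative-key-only"))
    print(anonymise(example))
```

In this sketch the pseudonymised record can be re-linked by the key holder, while the anonymised record carries no link back to the patient identifier.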

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

Data-based Healthcare characterizes a fundamental shift in the way biomedical data are collected and processed, as well as how biomedical research is performed. The application of data techniques in Healthcare will allow us to capitalise on growing patient and health system data availability and generate healthcare innovation. However, to bring about this revolution in healthcare there are legal, technical and cultural/societal barriers that must be overcome.
The proposed “Health Data” Interest Group (HD-IG) seeks to bring together stakeholders from all relevant sides and provide a forum for discussion on the specific issues that arise when using advanced data management and analytics techniques in a Healthcare setting, particularly (although not exclusively) focusing on the impact of privacy and security concerns.

Bottom-up (evidence-oriented) analysis, seeking to extract useful knowledge by mining the streaming data of daily clinical routine, is of fundamental interest in model-guided personalized medicine. In this context, advanced techniques are applied to identify latent factors (disease signatures) that can explain and predict variability in drug therapies and disease evolution, to reveal similarities among patients and so stratify patient groups, and to build patient-specific simulation and prediction models. Such an approach goes beyond classical flat-file data analysis, batch learning procedures, and simple data analysis techniques that commonly focus on only a few variables of interest and a well-specified dataset from a specific clinical trial.
By contrast, Knowledge Discovery and Data Mining (KDD) platforms in this area should be able to handle massive volumes of uncertain, streaming, heterogeneous biomedical data, to curate, validate and analyze them in an incremental/on-line fashion from multiple points of view and under different assumptions, as well as to include or exclude dimensions, combine different modalities and incorporate existing knowledge and previous beliefs, all while preserving the privacy of the patients whose data are being analysed.
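As a small illustration of what analysing data "in an incremental/on-line fashion" means, the sketch below (Python) keeps a running mean and variance using Welford's algorithm, updating the statistics one measurement at a time without storing the stream. It is an assumption-laden toy example, not a description of any KDD platform; the class name RunningStats and the heart-rate values are hypothetical.

```python
class RunningStats:
    """Running mean and sample variance of a stream (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; defined as 0.0 until two values have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for heart_rate in (60.0, 62.0, 64.0):  # hypothetical streaming measurements
        stats.update(heart_rate)
    print(stats.n, stats.mean, stats.variance)
```

Each update takes constant time and memory, which is the property that distinguishes on-line processing from the batch procedures mentioned above.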
The still growing potential of modern data management and analysis is today fully acknowledged, but it may remain partly undeveloped, or lead to undesirable outcomes or misuses of data, if it is not pursued in parallel with a deeper understanding of the regulatory and legal challenges it poses to patients’ privacy and data protection. At the same time, harming innovation and putting restrictions on research should be avoided. Indeed, as debates and proposals held in different countries show (such as on a “Magna Charta for Big Data”)[1], there is a societal need for a more adequate legislative framework for ethical leveraging of data applications, balancing the needs and rights of data providers and owners.

It must be remarked that privacy and regulatory issues related to the process of “data-intensive scientific discovery” have become a matter of special attention for the EU, in particular after the approval of the General Data Protection Regulation, which establishes an updated legal framework that still needs to be carefully analysed, with the aim of strengthening individuals’ trust and confidence in the digital environment and enhancing legal certainty.
The general debate on health data policies revolves around three core themes: the need to ensure that citizens’ data are adequately protected; the need for Open Access to data for research purposes; and the need to allow a Data Value industry to play a growing role in Health as well. As a consequence, it is necessary to strike the appropriate balance between individual privacy concerns in the healthcare setting and research and innovation, which can greatly benefit patients.

Given the lack of international legal harmonisation and the different national implementations of data protection, different approaches and protocols will be adopted (many of them in accordance with HIPAA, which is still the most extensive expression of de-identification constraints). Comparing and discussing these approaches is fundamental to improving data technology in Healthcare.
The HD-IG will provide its members with a forum to discuss and highlight the legal, technological, ethical and societal challenges to the adoption of advanced data management and analysis techniques in Healthcare, to exchange opinions and compare experiences, and form Working Groups to address these challenges.

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place.   Articulate how this group is different from other current activities inside or outside of RDA.):


The initial focus of the HD-IG includes the following areas:

  • Privacy and Security in Health Data
    • sharing best practice on pseudonymisation, anonymisation, differential privacy, homomorphic encryption and dedicated blockchain applications
    • developing models for dynamic consent that protect patients while enabling research
    • providing a forum for discussing, explaining and responding to data protection regulatory issues
  • Data-based Healthcare for Personalised Medicine
    • analytics applied to highly sensitive health data
    • disease signatures identification and stratification of patient groups
    • patient-specific simulation and prediction
    • exploring the potential of health data usage in in silico drug development and clinical trials
  • Health Data Organisations environment
    • defining a list of organisations dealing with Health Data to cluster liaisons with
    • disseminating HDIG results within other relevant Health Data organisations
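
Of the privacy techniques named in the first focus area above, differential privacy lends itself to a compact illustration. The sketch below (Python) releases a patient count with Laplace noise; it assumes a simple counting query, whose sensitivity is 1, so the noise scale is 1/epsilon. The function names, the record fields and the epsilon value are hypothetical, and the sketch is meant only to show the idea, not to serve as a recommended implementation.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching predicate.

    Adding or removing one record changes the true count by at most 1,
    so Laplace noise with scale 1/epsilon is used (assumed setting).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical records: only the field used by the query is shown.
    patients = [{"diagnosis": "I10"}, {"diagnosis": "E11"}, {"diagnosis": "I10"}]
    noisy = private_count(patients, lambda r: r["diagnosis"] == "I10", epsilon=0.5)
    print(noisy)
```

A smaller epsilon gives stronger privacy and a noisier answer; choosing it is a policy decision of the kind this focus area is meant to discuss.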

Regarding other related WG/IG efforts in the area, the following points can be observed:
The discussions during the P6 BoF Session shifted the original focus from “Big Health Data” to simply “Health Data” and reinforced the already high importance of privacy and security aspects in the field. This proposal reflects the conclusions of those discussions and concentrates on a rich set of issues that are critical for Health Data and are not covered by any currently active RDA IGs.
Other related WGs/IGs were invited to participate in the P7 BoF, to give short presentations on their focus and to discuss possible areas of overlap to be aware of, as well as possible ways of joining efforts and actions on common issues.
In addition to the connections established at the P6 BoF Session in Paris with the following groups: Active Data Management Plans, Big Data, ELIXIR Bridging Force, Ethics and Social Aspects of Data, Long tail of research data, RDA/CODATA Legal Interoperability, Structural Biology, connections were also established in Tokyo with two newly proposed Working Groups: the Data Security and Trust Working Group (WGDST) and the RDA/NISO Privacy Implications of Research Data Sets WG.
Furthermore, the Tokyo BoF included contributions on: GeoHealth, i.e. the geospatial dimension in health data and related analyses; the population database project HIVE, i.e. building a repository for the Gold Coast region in Queensland, starting from hospital data and linking to citizen data; human stress management and monitoring (MindFlow); and clinical data publication guidelines (Nature, Scientific Data).
This IG is the only one dealing mainly with the vertical of Health Data, while other groups dealing with privacy, security and trust are horizontal with potential use cases from several areas.
Nevertheless, HD-IG will seek to collaborate with those IGs that have an affinity with the aspects it will address, as well as with external organisations, such as the VPH Institute (Virtual Physiological Human), the National Association of Health Data Organizations (NAHDO), PerMed (Personalised Medicine), HIMSS (Healthcare Information and Management Systems Society) and IMI (Innovative Medicines Initiative).

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities.  Also address how this group proposes to coordinate its activity with relevant related groups.):

Participation in HD-IG is open to all RDA members. In particular, but not exclusively, HD-IG welcomes the active participation of individuals with the following expertise:

·      Clinicians wanting to use data technology to improve practice

·      Biomedical researchers using data-heavy analytical techniques

·      Healthcare data analysts with data mining, machine learning, physiological modelling and image processing expertise

·      HPC and distributed computing experts

·      Policy-makers for Healthcare

·      Health bioinformatics legal experts

·      Healthcare administrators and Health Maintenance Organisations

·      Pharmaceutical industry researchers and manufacturers

·      Medical equipment researchers and manufacturers

·      In silico modelling, testing and clinical trial experts


A quick survey during the P6 and P7 BoF Sessions identified participants with all but the last expertise in the above list, an indication of both the diversity of relevant stakeholders and the strength of current interest in the focus areas of HD-IG.
Nevertheless, the HD-IG will endeavour to reach out to a large number of participants worldwide, focusing especially on those moderately involved in biomedical issues.
In view of P8, an ad-hoc communication strategy will be put in place to engage experts and others involved in other Health Data projects or in the external organisations listed above.

Outcomes (Discuss what the IG intends to accomplish.  Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

Outcomes are expected in the areas identified as part of the Objectives, namely:

  • Privacy and Security in Health Data
    • Best practices on pseudonymisation, anonymisation, differential privacy, homomorphic encryption and dedicated blockchain applications
    • Models for dynamic consent that protect patients while enabling research
    • Recommendations and standards on data protection regulatory issues
  • Data-based Healthcare for Personalised Medicine
    • Analytics applied to highly sensitive health data
    • Disease signatures identification and stratification of patient groups
    • Patient-specific simulation and prediction
    • Exploring the potential of health data usage in in silico drug development and clinical trials
  • Health Data Organisations environment
    • Identify a list of organisations dealing with Health Data to establish liaisons with
    • Address an HDIG statement to relevant organisations, inviting them to join forces in further developing the IG.

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

The HDIG will meet at the next Plenary (P8) in Denver, Colorado.
At P9, HDIG will organise joint meetings with those related WGs/IGs that have shown interest in this possibility at P8.
Between Plenaries there will be at least one meeting every two months, via teleconference.

Timeline (Describe draft milestones and goals for the first 12 months):

We look forward to having the IG established before the 8th RDA Plenary Meeting (September 2016, Denver, Colorado), so that we can spread the news widely and invite relevant experts to attend, with a view to holding the first official meeting of the group then.
At P9, HDIG will hold joint meetings with related WGs/IGs, depending on the interest of other groups in collaborating.
The first outcomes will be presented after 12 months at the latest, taking into account the prioritisation chosen among the different areas.

