Data Fabric P10 Montréal meeting

The Data Fabric IG met at P10 and held a successful session with two contributions from new members, who brought fresh perspectives, followed by a structured discussion.

The presentations were given by Mike Kearney (CCSDS) and Pascal Lesage (BONSAI).

Mike presented an effort that is roughly a continuation of OAIS and aims at a reference architecture for preservation, which we identified, in Data Fabric terms, as a preservation Data Fabric component. The work has progressed far enough that a clearly specified interface for this component is available, along with a good embedding into the overall architecture that depicts roles and relevant policies. One conclusion from the ensuing discussion was that this would be of interest not only to the DF but also to the disciplinary interoperability group, as the preservation component needs further input from individual disciplines.

Pascal presented BONSAI, an organization in industrial ecology that aims to open up currently closed silos of global product footprinting data. The technical challenges are being addressed step by step through prototyping and iterative development. One big challenge remains: how to give current data providers the right incentives to open up their silos of valuable data, or how to disrupt the established market in a way that leads to much more open data sharing and, as a result, makes assessment of product footprinting possible. The overall effort is still in its early phases, however, and as such it could benefit from adopting the high-level architecture the Data Fabric proposes.

From the general Data Fabric discussion, we identified a few highlights that are relevant to the work of the group until P11:

1. Does the white paper complete the work of the group? Does it indicate a need for other papers?

This was worth a bit of discussion, reaching back to the beginnings of the group. Back then, there were only first conceptualizations of how to bind multiple RDA outputs together and how to place them in a possible conceptual framework. The supporting output is definitely a milestone reached: it explains the matured perspective and gives a fair description of the work that needs to be done. However, the work of establishing a full data fabric is not yet complete, as the test bed activities clearly illustrate.

2. What other currently emerging RDA activities are of relevance to the future of the Data Fabric group's efforts?

The disciplinary collaboration framework (DCF) efforts were identified as providing highly relevant synergy points for the Data Fabric. The aggregated space of use cases put forward there should further drive the discussions within the Data Fabric group and ground them in practical cases. With the conceptual framework established, this is a logical next step, and it will also help us fully understand the feasibility and practical complexity of a component-based data fabric approach. There was good consensus among the group participants that we should seek a joint session with the DCF group for P11.

3. What could be a possible focus of such a session?

Given what we see happening with the usage of type registries in particular, but also with the PID Kernel Information, a very welcome contribution would shed more light on the curation of the content of such registries and on metadata fabric concerns in general. How do we avoid an explosion of types? This is an important problem that is not yet solved and may apply more generally. Possible involvement could also extend to Wo Chang's IEEE group. Governance issues would be a new aspect of Data Fabric work, but they are decidedly worth discussing in a possible P11 joint session.