• Primary Domain: Natural Sciences
  • Group Focus: Data Management, Disseminate, Link, and Find
  • Group Technology Focus: Data (Output) Management Planning, Depositing Research Outputs, Archiving, Search & Discovery, Re-Use
  • RDA Pathways: Semantics, Ontology, Standardisation, Data Lifecycles - Versioning, Provenance, Citation, and Reward, Discipline Focused Data Issues
  • Group Description

    Creating a Multi-omics Metadata Schema Standard Reporting Matrix WG

    Note: The WG is actively seeking co-chairs and members from different countries and continents who cover various Omics domains, including expertise in mass spectrometry and sequencing. A dedicated communications campaign via the RDA’s communications and social media channels will take place to recruit co-chairs and members from geographically diverse regions.

    Overview 

    Multi-omics data integration merges multiple Omics data types (e.g., genomics, proteomics, metabolomics, phenomics), leveraging a wide range of high-throughput technologies as a holistic approach to the quantification and characterisation of large, complex pools of biological molecules. Multi-omics data integration and analysis provide a closer look at the complex structural and functional interactions at the molecular level, improving our understanding of the biological dynamics of living organisms across the life science landscape.

    High-throughput Omics technologies (sequencing, mass spectrometry, imaging, etc.) continue to advance significantly, bridging a variety of expert methodologies and applications. As a result, the volume of multi-omics data generated and stored has increased at an unprecedented rate over the past decade. However, many challenges remain in managing and sharing complex Omics datasets for reuse (Figure 1).

    Figure 1. Challenges of effectively managing and integrating multi-omics data identified during the RDA-OfR WG brainstorming workshop in September 2023.

    A persistent challenge is the vast array of fragmented experimental metadata and data formats generated across the different Omics domains, making it difficult to manage and integrate multi-omics data prior to downstream analyses.

    Deliverables

    This RDA working group (WG), supported by Oracle for Research (OfR), aims to address a few of these challenges by creating a matrix of identified reporting guidelines and standards essential for integration of multiple Omics metadata elements across the different domain technologies (Figure 2).

    1. Landscape review and collection of existing Omics community standards (Deliverable 1a, 1b)

    The WG will research and consult existing work in the area by undertaking an in-depth landscape review (Deliverable 1a) to identify current Omics domain data standards (common data formats, controlled vocabularies and ontologies, metadata reporting guidelines, and identifier schemas) outlined by community-accepted data management and sharing best practices within and across the different Omics domains. This work will leverage and build upon existing resource records at FAIRsharing (an RDA WG and a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies). The WG will evaluate existing community Omics standard records and contribute them to an iterative and open Omics Domain Collection at FAIRsharing (Deliverable 1b) as part of the landscape review recommendation output. Using FAIRsharing educational materials to guide its curation activities, the group will collect standalone Omics domain community standards and reporting guidelines to inform downstream crosswalk activities. The Omics landscape review and analysis collection is intended to encourage continuous (machine-actionable) community-level standard curation beyond the lifecycle of the WG, benefiting existing and future research community stakeholder groups focused on data standards in the life sciences.

    2. Omics community standard and reporting guideline crosswalk (Deliverable)

    There are currently many well-established and well-developed Omics standards, but it is not immediately obvious which of them are complementary across domains. Based on the results of the Omics landscape review, this WG will provide a crosswalk that identifies common data standards and metadata reporting guidelines implemented across Omics domains in order to link complementary integration points. This activity aims to support the common metadata elements outlined in the resulting standard reporting matrix, as well as downstream data interpretation endeavours. The crosswalk will highlight domain metadata reporting gaps, as well as areas where current implementations are not aligned across the various Omics standards and where supplementation may be of use.
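
    As an illustration only, the crosswalk idea described above can be sketched in a few lines of Python. The standard names and metadata elements below are hypothetical placeholders, not taken from any real specification or from the WG's outputs; the sketch simply shows how shared elements (candidate integration points) and reporting gaps could be derived once each standard's elements are listed.

```python
from collections import defaultdict

# ASSUMPTION: placeholder standards and element names, invented for illustration.
# They do not describe any real Omics standard.
STANDARD_ELEMENTS = {
    "genomics_standard": {"sample_id", "organism", "collection_date", "sequencing_platform"},
    "proteomics_standard": {"sample_id", "organism", "instrument_model"},
    "metabolomics_standard": {"sample_id", "collection_date", "instrument_model"},
}


def build_crosswalk(standards):
    """Map each metadata element to the sorted list of standards that report it."""
    crosswalk = defaultdict(set)
    for standard, elements in standards.items():
        for element in elements:
            crosswalk[element].add(standard)
    return {element: sorted(names) for element, names in crosswalk.items()}


def integration_points(crosswalk, standards):
    """Return the elements reported by every standard."""
    total = len(standards)
    return sorted(e for e, names in crosswalk.items() if len(names) == total)


def reporting_gaps(crosswalk, standards):
    """For each element not reported everywhere, list the standards that lack it."""
    all_names = set(standards)
    return {
        element: sorted(all_names - set(names))
        for element, names in crosswalk.items()
        if set(names) != all_names
    }


crosswalk = build_crosswalk(STANDARD_ELEMENTS)
shared = integration_points(crosswalk, STANDARD_ELEMENTS)
gaps = reporting_gaps(crosswalk, STANDARD_ELEMENTS)

print("Shared elements:", shared)
for element, missing in sorted(gaps.items()):
    print(f"{element}: not reported by {', '.join(missing)}")
```

    A real crosswalk would also need to reconcile elements that are named differently but mean the same thing, which this sketch does not attempt.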

    3. Multi-omics metadata schema standards and reporting matrix plus use-case collection (Deliverable)

    The multi-omics metadata schema standard reporting matrix will detail the essential domain standards metadata elements required to accommodate multi-omics integration in the areas of genomics, transcriptomics, proteomics, and metabolomics (including imaging mass spectrometry). This guideline will be supported by multi-omics community example use cases and a curated Multi-omics Database Collection at FAIRsharing capturing use case records, if applicable/available. Documented use cases will provide diverse subject domain community examples of multi-omics standard integrations from existing cross-disciplinary group activities (societies, alliances, standard consortiums, and research projects focused on data standards harmonisation). The use case database collection will outline specific developments where multiple Omics standards have been successfully integrated to support multi-omics data platforms and/or metadata frameworks (e.g., National Microbiome Data Collaborative (NMDC), Multi-Omics Research Factory (MORF)). Multi-omics data integration use cases that have a database record at FAIRsharing and were identified during the landscape review will be included in a new collection (as mentioned above). This collection will consist of use-case multi-omics data repositories and/or knowledge bases implementing at least two different domain Omics data standards (reporting guidelines, controlled vocabularies and ontologies, etc.), reflecting key points of integration captured in the standard reporting matrix.
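
    The inclusion criterion above (a repository or knowledge base implementing standards from at least two different Omics domains) can be expressed as a simple filter. The following is a minimal sketch under stated assumptions: the record structure, resource names, and standard names are all hypothetical and do not reflect FAIRsharing's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StandardRef:
    """A standard implemented by a resource (hypothetical structure)."""
    name: str
    domain: str  # e.g. "genomics", "proteomics", "metabolomics"


@dataclass
class ResourceRecord:
    """A repository or knowledge base record (hypothetical structure)."""
    name: str
    standards: list = field(default_factory=list)


def omics_domains(record):
    """Return the set of distinct Omics domains covered by a record's standards."""
    return {standard.domain for standard in record.standards}


def qualifies(record, minimum_domains=2):
    """True if the record implements standards from at least `minimum_domains` domains."""
    return len(omics_domains(record)) >= minimum_domains


# ASSUMPTION: invented example records, for illustration only.
records = [
    ResourceRecord("resource_a", [
        StandardRef("genomics_guideline", "genomics"),
        StandardRef("proteomics_guideline", "proteomics"),
    ]),
    ResourceRecord("resource_b", [
        StandardRef("genomics_guideline", "genomics"),
        StandardRef("genomics_vocabulary", "genomics"),
    ]),
    ResourceRecord("resource_c", []),
]

collection = [record.name for record in records if qualifies(record)]
print(collection)
```

    Counting distinct domains rather than distinct standards is a deliberate choice here: two standards from the same domain (as in `resource_b`) would not satisfy the "different domain" wording of the criterion.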

    Figure 2. Omics domain technologies and corresponding core Multi-omics working group deliverables outline.

    Meetings and Deliverables (see WG meeting folder)

    Date (2023-2024)  Topic Documentation and Deliverables 
    7 September 2023 Brainstorming workshops: Definition of WG scope
    29 November 2023 1st WG meeting: Kick-off
    – Member consultation, overview of Omics resources and related training materials.
    13 December 2023 2nd WG meeting: Landscape review and collection of existing Omics community standards (Deliverable 1a)
    25 January 2024 3rd WG meeting: Landscape review and collection of existing Omics community standards (Deliverable 1a continued)
    21 February 2024 4th WG meeting: Landscape review and collection of existing Omics community standards (Deliverable 1a continued)
    – Assignment of research data lifecycle stage(s) to Omics standards.
    – Discussion about selection criteria for standards to be added to the Omics collection.
    20 March 2024 5th WG meeting: Landscape review and collection of existing Omics community standards (Deliverable 1a continued)
    – Update and completion of all existing Landscape review columns/fields.
    – Reflection on machine-actionable format of Deliverable 1a for dissemination.
    17 April 2024 6th WG meeting: Introduction to ‘Defining Selection Criteria for Multi-omics Standards’ (Deliverable 1b)
    May 2024 Research Data Alliance’s 22nd Virtual Plenary meeting (VP22)
    – Tuesday, 14 May 2024 – VP22 WG Breakout Session 2 (14:00 – 15:30 UTC)
    – Wednesday, 22 May 2024 – VP22 WG Breakout Session 15 (22:30 – 00:00 UTC)
    12 June 2024 7th WG meeting: RDA VP22 Recap & Next Steps on Deliverables 1a (Landscape Review) and 1b (FAIRsharing Multi-Omics Standards Collection)
  • Rationale Summary

  • TAB Liaison(s)

    Isabelle PERSEIL
  • Secretariat Liaison(s)

    Bridget Walker
  • Group Visibility

    Public
  • Group Creation Date

  • Endorsement Date

  • Estimated End Date

  • Actual End Date

  • Group Email

    rda-ofr-multiomics-metadata-schema@rda-groups.org
  • Group Type: Working Group
  • Group Status: recognised-and-endorsed
  • Co-Chair(s): Lindsey Anderson, David Molik, Tim Van Den Bossche
