LIMS, LLMs, Schemas, Ontologies & Terminology Services: Current Materials & Engineering Sciences Efforts


08 Jun 2023
Meeting objectives: 
  1. Continue discussions about Laboratory Information Management Systems (LIMS) and Materials Ontologies begun at P20.
  2. Introduce LIMS Common Schema, Terminology Services and Large Language Models (LLMs).

  3. Discussion with audience about starting a new Ontologies Working Group.
Meeting agenda: 
  1. 0-5 Mins: Introduction

  2. 5-15 Mins: LIMS Updates: Laboratory Information Management Systems (LIMS), such as NIST NexusLIMS, provide a framework for managing data from the outset of the research life cycle, delivering the foundation platform for integrating existing and newly emerging capabilities for machine learning, data analysis, collaboration, and dissemination.  

  3. 15-25 Mins: Ontologies Updates: Multiple regional activities for development of ontologies used in materials science and chemistry will be presented.

  4. 25-55 Mins: Common LIMS Schema & Terminology Services: NexusLIMS is developing a common schema that uses structured terminology to communicate broader concepts and capabilities used throughout materials and engineering sciences. The Terminology Service is a web-based infrastructure component applied in data and knowledge management tasks to make ontologies visible, searchable and usable for standardization activities and for integration in automated processes.

  5. 55-65 Mins: LLMs for schema/ontological uses: Large Language Models (LLMs), such as ChatGPT, use neural network architectures designed to perform a variety of Natural Language Processing (NLP) tasks. Models are trained on massive natural language corpora and require high-performance GPUs (graphics processing units) for computation.

  6. 65-80 Mins: Discussion of the four updated areas.

  7. 80-90 Mins: Discussion about new Materials Ontology WG 

Panel Discussions: Moderators – Laura Bartolo (Northwestern University) and Jim Warren (NIST), Materials Data IG Co-Chairs; Gretchen Greene and Daniela Hausen, RDMES IG Co-Chairs

    • Introduction, 5 minutes: Connections across LIMS, Schemas, Ontologies, Terminology Services & LLMs: Benjamin Long (NIST USA)
    • Presentations & Discussion, 75 minutes: materials & engineering sciences 1) LIMS, 2) Ontologies, 3) LIMS Common Schema & Terminology Services and 4) LLMs
      • Panel 1 LIMS Updates: Henk Birkholz (IWT, Germany), Gretchen Greene (NIST, USA), June Lau (NIST, USA), David Elbert (JHU, USA)
      • Panel 2 Ontologies Updates: Yann Le Franck (eScienceFactory, France), Emanuele Ghedini (Univ Bologna, Italy), Marek Cebecauer (Heyrovsky Institute, CZ), KR Lee (KIST, Korea; VIRTUAL)
      • Panel 3 Common LIMS Schema & Terminology Services: Joshua Taillon (NIST, USA), Felix Engel (TIB, Germany), Matthias Grönewald (TU Darmstadt, Germany)
      • Panel 4 LLMs for schema/ontological uses: Ben Blaiszik, KJ Schmidt (University of Chicago/Argonne National Lab, USA; VIRTUAL)
    • Ontology panelists & audience to discuss possible WG, 10 minutes  
      • Gerhard Goldbeck (GCL, UK): Overview and update of new WG
Target Audience: 

Researchers in materials science, engineering, cognate domains and data science

Group chair serving as contact person: 
Brief introduction describing the activities and scope of the group: 

The Materials Data, Infrastructure, & Interoperability (MDII) IG was founded in 2013 and aims to foster the exchange of computational & experimental materials data through shared online repositories, standardized formats/terminologies & open programming interfaces.

The Research Data Management in Engineering Sciences

 

Short Group Status: 

The session at Plenary 20 introduced participants to the current status of semantic data documentation in materials and cognate disciplines, ranging from terminologies, metadata schemas and vocabulary services to ontologies. In ten brief presentations, participants learned about the latest updates from pertinent projects and initiatives in Europe, the US and Korea. This was followed by a discussion about the different approaches and the need to set up a Working Group related to the Materials Data IG, aimed at harmonisation and guidelines geared towards the needs of the involved domains.

The “Research Data Management in Engineering” Interest Group (IG RDMinEng) seeks to bring together scientific and industrial stakeholders from all relevant sectors. The IG RDMinEng gives its scientific and industrial members the opportunity to discuss and address the legal and technological challenges to the adoption of FAIR data and software management in Engineering, to share knowledge, opinions and experiences, and to form or participate in Working Groups that address these challenges.

At P12 in Gaborone, Botswana, a first session was organised to foster discussion about data sharing and research data management challenges in engineering. With more than 40 people attending, the meeting room was crowded, demonstrating strong interest in the topic. Since P13 the IG has organised a session at each plenary. At P14 the idea of setting up focus groups on “Engineering-specific DMP”, “Metadata for Engineering”, “Data Annotation” and “Engineering and Open Science” was formed. The Joint Session at P15 with the Small Unmanned Aircraft Systems Data IG moved the IG forward on the topics of metadata and annotation. The IG ran an online seminar series on “Metadata for Engineering” and “Data Annotation” from July to October 2020, which should be continued in spring 2021. The focus group on Engineering-specific DMP was integrated into the WG Discipline-specific Guidance for DMP. The IG RDMinEng held a working session on Data Provenance and Research Software for Engineers and a Joint Meeting with the Metadata IG at P16.

At VP17 the IG organised a session on Engineering RDM initiatives, hosting several engineering research areas from across the RDA community. A FAIR approach was presented for the Energy and Additive Manufacturing work areas, and interest was expressed in metrics and data quality as well as in planning research data infrastructure for engineering sciences. VP18 took these topics further through the session on Data Quality and FAIR metrics in Engineering.

At P20 the IG held a Joint Session with the RDA/CODATA Materials Data, Infrastructure and Interoperability IG with a focus on Automation of LIMS for Data Analysis and Reuse (https://www.rd-alliance.org/automation-lims-data-analysis-and-reuse-joint-meeting). These efforts are continuing at P21 and will have touch points with the LIMS schema presented in this joint session.

Type of Meeting: 
Informative meeting
Avoid conflict with the following group (1): 
Meeting presenters: 
1) LIMS: Birkholz, IWT, DE; Long, Lau, Greene, NIST; Elbert, JHU, USA 2) LLMs: Blaiszik, Schmidt, UC, USA 3) Ontologies: Goldbeck, GCL, UK; Ghedini, Univ Bologna, IT; Cebecauer, Heyrovsky Inst, CZ; Y Le Franck, eScienceFactory; KR Lee, KIST; Ishii, Japan
Contact for group (email):