Photoemission spectroscopy (PES) is presented as a use case for pioneering future research data concepts. We will show how FAIR research data can be organized and how we intend to create benefits for the participating scientists. We will present an extensive and elaborated standard (NXmpes) for harmonizing PES data using NeXus. This standard is developed in collaboration with the PES community...
Low-temperature Scanning Tunneling Microscopy (STM) and Spectroscopy (STS) provide important insights into material properties, which should be captured and stored following the FAIR principles. We have developed a data format proposal for the NeXus community standard which provides a rich vocabulary for representing all important experimental data and metadata. An end-to-end solution embedded...
In modern materials science, the amount of generated experimental data is rapidly increasing, while analysis methods still require many hours of manual work. This is especially the case for X-ray photoelectron spectroscopy (XPS), where quantification is a complex task and, in many cases, can be done properly only by experts. However, these problems could be overcome by the use of a neural...
Achieving an interoperable representation of knowledge for experiments and computer simulations [1-4] is the key motivation behind the implementation of tools for FAIR research data management in the condensed-matter physics and materials engineering communities. Electron microscopy and atom probe tomography are two key materials characterization techniques used globally and across...
Within the expansive domain of Metal-Organic Frameworks (MOFs), navigating the vast datasets for impactful research has posed significant challenges. Addressing this, our study introduces a groundbreaking methodology through MOFGalaxyNet, employing Social Network Analysis (SNA) to illuminate the structure and dynamics of MOF interactions. The core of our strategy, the Black Hole approach,...
The FAIR principles (Findable, Accessible, Interoperable, Reusable) serve as a reference for assessing the quality of data storage and publication [1]. NOMAD [nomad-lab.eu] [2, 3] is an open-source data infrastructure for materials science data that is built upon these principles. In this contribution, we will demonstrate the interplay between high-quality data and knowledge using the...
Recent discoveries in astroparticle physics, including cosmic accelerators, gravitational waves from black-hole mergers, and astronomical neutrino sources, underscore the importance of a multi-messenger approach. The transient and rare nature of these astrophysical phenomena necessitates interdisciplinary work with diverse modern and historical data, emphasizing the need for FAIR (Findable,...
Optical spectroscopy covers experimental techniques such as ellipsometry, Raman spectroscopy, or photoluminescence spectroscopy. In the upcoming transformation process of the research environment towards FAIR data structures, these techniques will play a crucial role as they govern various fundamental and easily accessible material properties such as reflectivity, light absorption, bandgap, or...
The emergence of big data in science underscores the need for FAIR (Findable, Accessible, Interoperable, Reusable) [1] data management. NOMAD [nomad-lab.eu] [2, 3] is an open-source data infrastructure that meets this demand in materials science, enabling cross-disciplinary data sharing and annotation for both computational and experimental users. In this contribution, we will present our...
The PATOF project builds on work at the MAMI particle physics experiment A4. For many years, A4 produced a stream of valuable data that has already yielded high-quality scientific output and still provides a solid basis for future publications. The A4 data set consists of 100 TB and 300 million files of different types (Vague context because of hierarchical folder structure and file format with...
There has been a distinct lack of FAIR data principles in the field of photoemission spectroscopy (PES). Within the FAIRmat consortium, we have been developing an end-to-end workflow for data management in PES experiments using NOMAD and NeXus, a community-driven data-modeling framework for experiments [1]. We will present an extensive and elaborated standard (NXmpes) for harmonizing PES data...
To achieve interoperability for data of different origins, FAIRmat is contributing to the materials science data management platform NOMAD. It features flexible yet structured data modeling and allows custom data ingestion, while providing efficient search capabilities and online visualization of datasets. Several standard data formats are supported by NOMAD including the NeXus format...
New materials are conventionally developed via trial and error in laboratory experiments. This process is in general slow and involves significant resources and research efforts. Furthermore, it can overlook potential candidates, properties, or business-case criteria related to their use. Computational simulation methods can help solve these problems by accelerating the screening process...
While rapid exploration and optimisation of solution-processable materials in self-driving laboratories (SDLs) is advanced, adapting these approaches for inorganic materials using physical vapour deposition (PVD) presents challenges due to increased experimental complexity and higher time and energy demands for sample production. It is thus critical that the SDL’s underlying algorithms learn...
State-of-the-art Bayesian optimization algorithms have the shortcoming of relying on a rather fixed experimental workflow. The possibility of making on-the-fly decisions about changes in the planned sequence of experiments is usually excluded and the models often do not take advantage of known structure in the problem or of information given by intermediate proxy measurements [1-3]. We...
Infrared Spectroscopy (IR) is crucial in heterogeneous catalysis for identifying active sites, yet existing simulations lack comprehensive peak broadening output. We propose an application to generate complete spectra from Density Functional Theory (DFT) data, facilitating comparison with experimental results. Built on CaRMeN, it manages data in an SQL database, ensuring efficiency and...
Most current explainable AI methods are post-hoc methods that analyze trained models and only generate importance annotations, which often leads to an accuracy-explainability tradeoff and limits interpretability. Here, we propose a self-explaining multi-explanation graph attention network (MEGAN) [1]. Unlike existing graph explainability methods, our network can produce node and edge...
NOMAD [nomad-lab.eu] [1, 2] is an open-source data infrastructure for materials science data. NOMAD already supports an array of computational codes and techniques, with over 60 parsers that automatically extract essential (meta)data from the raw output of standard calculations. Traditionally, the NOMAD repository has focused on contributions from DFT calculations, accumulating over 12.5...
Advancements in materials science are significantly dependent on the detailed characterization of samples, which in turn generates complex measurement data. This poses challenges in data management, notably in metadata preservation and the need for extensive manual processing, often exceeding the expertise of researchers. The FAIR principles offer a pathway towards resolving these issues...
We introduce NOMAD CAMELS [1] (Configurable Application for Measurements, Experiments, and Laboratory Systems), an innovative open-source measurement software designed to capture FAIR data that is fully self-describing in NeXus format. This enables native integration of CAMELS’ data into research data management tools such as NOMAD or eLabFTW. CAMELS empowers users to define measurement...
The rise of digitalization has significantly reshaped scientific practices, positioning research data as a valuable asset. New research paradigms have emerged that extend the use of these data beyond their original research purposes. As a result, proper data preservation in line with the FAIR principles [1], as well as the legal aspects relevant to the preservation and reuse of these data, have...
A key challenge in experimental high-resolution microscopy is the real-time interpretation of the observed images in conjunction with the parameters adjusted by the experimenter during data acquisition, e.g. to obtain a certain contrast. The parameter space of candidate structures, experimental parameters, and resulting image contrast can be vast and complex, often requiring a scientist who is...
Electronic Laboratory Notebooks (ELNs) are crucial for moving research data from paper to digital formats, streamlining lab workflows and digitizing data. This study examines integrating ELNs into Research Data Management (RDM) platforms like NOMAD, focusing on challenges like user acceptance and data structuring.
ELNs need to be user-friendly and structure data effectively for integration...
Recently, funding agencies have begun to require sections on research data management in grant applications and the submission of a detailed Data Management Plan (DMP) during the initial phase of a funded research project. These DMPs are set as milestones to be achieved for a successful research project. Scientists often view DMPs as a burden and additional work that distracts them from active...