Jun 16 – 19, 2026
Center for the Science of Materials Berlin (CSMB)
Europe/Berlin timezone

AI-Assisted Scientific Data Extraction

Jun 17, 2026, 11:35 AM
25m
2.049 (Center for the Science of Materials Berlin (CSMB))

Zum Großen Windkanal 2 12489 Ber

Speakers

Sherjeel Shabih, Yaru Wang

Description

Scientific progress increasingly depends on the ability to transform unstructured information into accessible, structured data. However, the rapid growth of scientific literature has made manual data extraction and curation a major bottleneck across disciplines. Recent advances in artificial intelligence offer new opportunities to automate this process and unlock knowledge at unprecedented scale.

This session will explore emerging AI-assisted approaches for scientific data extraction, focusing on multimodal workflows that convert diverse information sources into structured, machine-readable datasets. We will present recent developments in NOMAD that leverage large language models and domain-specific validation to extract scientific information from research publications, enabling the creation of continuously updated knowledge resources. In addition, we will showcase new capabilities that extend data extraction beyond traditional documents, including the use of speech and audio inputs as alternative pathways for capturing and structuring scientific knowledge.

Through examples from materials science and photovoltaics, we will discuss the opportunities, challenges, and limitations of AI-driven extraction systems, including issues of accuracy, validation, reproducibility, and integration with existing scientific infrastructures. The session aims to provide researchers with an overview of how AI can accelerate the transformation of scientific content into reusable data and support data-driven discovery in an era of rapidly expanding scientific output.
