November 27, 2024
Europe/Berlin timezone

The FAIRification of PV research data

Nov 27, 2024, 9:25 AM
25m

Speaker

Eva Unger

Description

The rapid expansion of research in many domains presents a growing problem: results and information are spread, and hence fragmented, across many scientific publications. This runs directly counter to the FAIR (Findable, Accessible, Interoperable, Reusable) data principles and, as a consequence, research data is underutilized despite growing technical opportunities to acquire huge datasets of high-quality information. In addition, peer-reviewed publishing is often biased toward positive results, and a lot of solid research data goes unpublished in favor of exciting scientific narratives that enable chasing high-impact publications.
Out of desperation, we launched a collaborative initiative in 2019 to build a unified database of perovskite solar cell data. It now holds information from over 46,000 individual solar cells, making it among the most expansive and comprehensive datasets in the field of PV. We are currently working to automate data ingestion through LLM-based data mining (led by Kevin Jablonka). This initial effort taught us many important lessons about the challenge and importance of collecting cohesive datasets based on a unified data model. To improve accessibility and adherence to FAIR principles, we migrated this dataset to the NOMAD data infrastructure and are now building a research data management platform for PV metadata as well as measurement data in collaboration with Helmholtz PV scientists. The goal is to create a basis for applying advanced machine learning methods to make better use of collectively generated research outcomes. This talk will outline our progress in data standardization, technical implementation, and the tailored adaptations developed with Helmholtz-Zentrum Berlin to meet the specific needs of photovoltaic research. We invite collaboration with AI and data management experts to optimize this resource for the research community, promoting a sustainable and globally interconnected data ecosystem.
