

AI Safety Poland Talks #15
Welcome to AI Safety Poland Talks!
A biweekly series where researchers, professionals, and enthusiasts from Poland or connected to the Polish AI community share their work on AI Safety.
💁 Topic: Mathematical discovery in the age of AI
📣 Speaker: Bartosz Naskręcki
🇬🇧 Language: English
🗓️ Date: 28.05.2026, 18:00
📍 Location: Online
Speaker Bio
Dr Bartosz Naskręcki is a graduate of Adam Mickiewicz University, specialising in number theory and algebraic geometry. He received the Polish Mathematical Society's young mathematician award (2013) and the Marcinkiewicz student paper prize (2010). After obtaining his PhD in 2014, he held research positions in Bayreuth, Bristol, Bavaria and the Dioscuri TDA Centre in Warsaw.
Bartosz lectures at the intersection of CS and mathematics: cryptography, Python-based mathematical computing, and the use of LLMs in maths and programming. He is also an active science communicator, especially on the Polish contribution to breaking Enigma. In 2025, he co-created FrontierMath, the hardest mathematical benchmark for LLMs. His problem was included in Tier 4, the benchmark's toughest tier, and until very recently resisted being solved by AI models.
Abstract
The rapid development of large language models over the past few years has led us to experiment with their reasoning and to recognize their potential usefulness in science. In the last two years, models have emerged that are gradually overcoming successive barriers in advanced deduction and can, to a certain extent, emulate the work of scientists. In this lecture, I will discuss the latest benchmarks that verify these achievements. I will consider whether successful mathematical education is possible when AI is combined with the “flipped classroom” approach and modern tools like Jupyter Notebooks. I will discuss whether we are actually close to AI surpassing mathematicians and what this would mean for the practice of science. We will see how various new techniques, including reasoning token models and multi-agent models, achieve high effectiveness in some types of reasoning but still fail completely in others. I will try to explain why this is the case and whether the end of scientific institutions is upon us.