

Using AutoResearch for Science
Abhinav Kumar is a Researcher at Lossfunk, where he has been extending Karpathy's auto-research framework to test whether neuroscience priors can improve performance on tasks like BabyLM. He has previously worked on building an AI Research Mentor and probing sycophancy vectors in LLMs.
In this session, he will walk through Karpathy's auto-research, a setup in which an LLM agent autonomously iterates on a problem against an editable markdown file, running for hours (or weeks) on a single task.
We will look at how the community has been using auto-research across CS/ML for program discovery, fine-tuning, architecture discovery, and other use cases across domains. We will also walk through Lossfunk's experiments and discuss the failure modes, along with best practices for using the system.
The session will end with a discussion of the CAISc verifiable problems, which attendees can attempt on their own, as well as recommendations for open-ended research problems suited to auto-research.
Karpathy's Auto-Research repo - https://github.com/karpathy/autoresearch
CAISc Verifiable Problems - https://caisc2026.github.io/verifiable-problems/
Part of the CAISc 2026 pre-conference series. Learn more at caisc2026.github.io.