

VAM! AI Reading Group - Paper DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation
1) Paper: DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation https://arxiv.org/abs/2601.09688
3) Paper 3-line Summary: DeepResearchEval is an automated framework designed to evaluate AI systems that perform complex, multi-step web research tasks. It generates realistic research tasks using persona-driven profiles and filters to ensure they require multi-source evidence synthesis, eliminating the need for manual annotation. The framework evaluates outputs through adaptive, task-specific quality criteria and actively fact-checks claims via web search, even when citations are absent.
Recommended Action Items:
Please message me if you would like to present an AI paper.
To maximize engagement during the reading group, please try to read the paper in advance.
Want to present?
The list of papers will be available here: https://docs.google.com/spreadsheets/d/1HET5sjnHjwiF3IaCTipR_ZWspfgglqwdBFRWAfKBhp8/edit?usp=sharing
To connect with the group, join the Discord: https://discord.gg/teJvEejs94
Timeline:
7:00 PM - Arrival & Networking
7:10 PM ~ 7:55 PM - Paper Presentation & Discussions
About the Facilitator
Issam Laradji is a Research Scientist at ServiceNow and an Adjunct Professor at the University of British Columbia. He holds a PhD in Computer Science from the University of British Columbia, and his research interests include natural language processing, computer vision, and large-scale optimization.
Looking forward to discussing the latest AI papers!