
Testing Self-Evaluation Bias of LLMs

Zoom
Past Event
About Event

When building and testing AI agents, one practical question that arises is whether to use the same model for both the agent’s reasoning and the evaluation of its outputs. Keeping the model consistent may simplify the setup and reduce costs, but it also raises concerns about bias, over-familiarity, and inflated scores.

To better understand these trade-offs, we ran an experiment comparing how evaluation scores differ when a model judges its own outputs versus when a separate model acts as the evaluator.
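
To make the setup concrete, here is a minimal Python sketch of the two configurations, using the OpenAI Python client. The model names, judge prompt, and 1-5 scale are illustrative assumptions, not the experiment's actual configuration.

from openai import OpenAI

client = OpenAI()

# Assumed model names, for illustration only.
AGENT_MODEL = "gpt-4o"       # model doing the agent's reasoning
JUDGE_MODEL = "gpt-4o-mini"  # a different model acting as evaluator

def run_agent(task: str) -> str:
    # Generate the agent's answer to a task.
    resp = client.chat.completions.create(
        model=AGENT_MODEL,
        messages=[{"role": "user", "content": task}],
    )
    return resp.choices[0].message.content

def evaluate(task: str, answer: str, judge_model: str) -> str:
    # Ask a judge model to score the answer; the judge may or may not
    # be the same model that produced the answer.
    prompt = (
        "Rate the following answer on a 1-5 scale and briefly explain.\n"
        f"Task: {task}\nAnswer: {answer}"
    )
    resp = client.chat.completions.create(
        model=judge_model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

task = "Explain the trade-offs of reusing one model as both agent and evaluator."
answer = run_agent(task)
self_eval = evaluate(task, answer, AGENT_MODEL)   # same-model (self) evaluation
cross_eval = evaluate(task, answer, JUDGE_MODEL)  # different-model evaluation

Comparing the score distributions from the two judges across many tasks is what surfaces any systematic self-preference.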

Join us to see the results and our take on the implications.

Presented by
Arize AI
Generative AI-focused workshops, hackathons, and more. Come build with us!
148 Went