

AvataRL: Adaptive Data Spaces and Sample-Efficient Language Modelling
How can we keep training AI models when high-quality pretraining data starts to run out?
What we can learn from AvataRL: using a smaller mentor model to guide what the main model should learn next.
Training a model that not only learns from data but also learns how to learn by steering its own practice sessions.
Why training on uniformly sampled data is wasteful, and how adaptive data selection makes every batch count.
How the mentor chooses examples with the most learning signal and avoids wasting steps on what the model already knows.
Balancing batches to stay diverse and avoid repetitive data, leading to more robust learning outcomes (a minimal sketch of this selection loop follows below).
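To make the idea concrete before the talk, here is a minimal Python sketch of mentor-guided batch selection. It is not the AvataRL implementation: using the student's per-example loss as the learning-signal proxy, the "bucket" grouping, and the max_per_bucket cap are illustrative assumptions, chosen only to show the score-then-rebalance loop the points above describe.

import random
from collections import Counter

def select_batch(examples, student_loss, batch_size, max_per_bucket=2):
    """Illustrative mentor-guided selection (not the AvataRL code).

    examples     : list of (example_id, bucket) pairs; `bucket` is any coarse
                   grouping used to keep the batch diverse (e.g. topic/domain).
    student_loss : dict example_id -> current per-example loss, used here as
                   a stand-in for the mentor's learning-signal score.
    """
    # Rank candidates by how much the student still struggles with them,
    # so steps are not spent on what the model already knows.
    ranked = sorted(examples, key=lambda ex: student_loss[ex[0]], reverse=True)

    batch, per_bucket = [], Counter()
    for ex_id, bucket in ranked:
        if per_bucket[bucket] >= max_per_bucket:
            continue  # this slice of data is already well represented
        batch.append(ex_id)
        per_bucket[bucket] += 1
        if len(batch) == batch_size:
            break
    return batch

if __name__ == "__main__":
    random.seed(0)
    # Toy pool: 20 examples spread across 4 topical buckets, with fake losses.
    pool = [(f"ex{i}", f"topic{i % 4}") for i in range(20)]
    losses = {ex_id: random.random() for ex_id, _ in pool}
    print(select_batch(pool, losses, batch_size=6))

In the approach the talk covers, the scoring would come from the mentor model rather than a precomputed loss table; the sketch only shows the overall shape of selecting by learning signal and then rebalancing for diversity.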
About the speaker: Abhishek Mishra (aka tokenbender), Researcher at eXperiment Labs. https://x.com/tokenbender || https://tokenbender.com
Preread: https://tokenbender.com/post.html?id=avatarl
To attend online, please use the link below:
https://meet.google.com/awd-hfrj-xur?hs=122&authuser=0