

Redwood Research Red Teaming Hackathon
⚡ Who can build the best hacks and defenses for LLMs? ⚡
What if LLMs had hidden motives? Models deployed in the U.S. government and the world's most important companies might secretly be looking for opportunities to sneak backdoors into code, or otherwise pursue nefarious objectives. Could we catch these model hacks? This weekend, we're playing a capture-the-flag game with two teams:
The red team: The red team will create realistic examples of models that hide secret objectives (e.g. the model secretly tries to share data with attackers).
The blue team: The blue team will try to detect these objectives.
We'll bring the compute and some examples to build on. You bring the research hustle.
More details here: http://bit.ly/3JlJ3Q2
Hosts
Redwood Research (redwoodresearch.org)
Constellation (constellation.org)
🗓️ Schedule (Subject to Change):
Sept 13th 2025
9:30 AM: 🚪 Doors Open
9:45 AM: Opening Remarks
11:00 AM: 💻 Start Coding!
12:00 PM: Lunch Provided
6:00 PM: Dinner Provided
Sept 14th 2025
9:30 AM: 🚪 Doors Open
12:00 PM: Lunch Provided
2:00 PM: Project Submission Deadline
5:30 PM: 👩‍💻 Awards Ceremony
➡️ Next steps ➡️
Join the Discord: https://discord.gg/7qEDgCWAwm
#teamsearch is where you can look for teams.
Register on Devpost: https://redwood-af.devpost.com/
Notion: http://bit.ly/3JlJ3Q2