Adversarial Defenses for LLMs
Presented by
Trajectory Labs
39 Going
About Event

In his talk, Samuel Simko from ETH Zurich will present his recent work on adversarial defenses for LLMs, developed with the Jinesis Lab (University of Toronto). The talk will cover a series of approaches, ranging from triplet-based contrastive learning defenses to honeypot-style defenses designed to avoid worst-case behavior. He will also discuss patterns observed in contest-winning manual jailbreaking prompts, ideas for tamper-resistant safeguards, and the current limits of attacks, defenses, and evaluation methodologies.

Location
30 Adelaide St E
Toronto, ON M5C 3G8, Canada
Enter the main lobby of the building and let the security staff know you are here for the AI event. You may need to show your RSVP on your phone. You will be directed to the 12th floor where the meetup is held. If you have trouble getting in, give Georgia a call at 519-981-0360.