The Lytic Threshold
Presented by
Trajectory Labs
About Event

In this talk, Sheikh Abdur Raheem Ali:

1) Discusses rare cases in which deployed LLMs have been observed engaging in self-directed behavior that would have led to catastrophic outcomes had the agent escaping containment been equipped with stronger capabilities.

2) Introduces arbitrium, a peptide-based communication system that certain bacteriophages use to coordinate population-level behavior: they release small signaling molecules (autoinducers) and rely on a quorum-sensing mechanism to choose between the lytic and lysogenic pathways.

3) Explores landmark results from alignment science which inform our current understanding of LLM biology (<250 malicious documents required to poison training datasets vs <100 viral particles required to produce infection in humans).

4) Analyzes early findings from experiments on scalable improvements to defenses that monitor the internal activations of production transformer models (such as probes). Demonstrates how these defenses enable targeted interventions on known distributions that are, in certain cases (such as long context windows or latent reasoning models), more difficult to achieve with input/output-only methods (such as prompted classifiers).

Event Schedule
6:00 to 6:30 - Food and introductions
6:30 to 7:30 - Presentation and Q&A
7:30 to 9:00 - Open Discussions

If you can't attend in person, join our live stream starting at 6:30 pm via this link.

This is part of our weekly AI Safety Thursdays series. Join us in examining questions like:

  • How do we ensure AI systems are aligned with human interests?

  • How do we measure and mitigate potential risks from advanced AI systems?

  • What does safer AI development look like?

Location
30 Adelaide St E
Toronto, ON M5C 3G8, Canada
Enter the main lobby of the building and let the security staff know you are here for the AI event. You may need to show your RSVP on your phone. You will be directed to the 12th floor where the meetup is held. If you have trouble getting in, give Georgia a call at 519-981-0360.