Presented by

We're a mission-driven R&D institute dedicated to advancing fundamental discoveries and carrying them to real-world impact.

Frontiers: Breaking the Fragile Architecture of Computer Use Agents

Manifold Research

Zoom

About Event

Welcome to Frontiers - a series where we bring top researchers, engineers, designers, and leaders working at the cutting edge of various fields to go deep on their work with the Manifold Community.

For this talk, our speaker will be Yangyue Wang. Yangyue is a Research Engineer at Metarch and an OS Research Fellow at Manifold Research, where he works on software control agents and multimodal action models. His research focuses on advancing the reliability of computer use agents through improved evaluation frameworks and the development of robust methods that bridge vision, language, and action.

Abstract

While AI systems demonstrate remarkable ability to interpret and manipulate computer interfaces, Computer Use Agents (CUAs) continue to encounter significant obstacles that prevent them from achieving reliable, general-purpose automation. In this talk, I will examine the core limitations that hinder CUA performance, exploring the critical failure modes that create what we term the "fragile architecture" of current AI assistants. I will then present our analysis framework that categorizes these breakdowns: visual recognition failures stemming from ambiguous interface elements and uncommon UI components, cognitive reasoning errors including fabricated planning sequences and goal drift, and operational challenges in unpredictable digital landscapes. I will discuss how leading-edge models such as Seed1.5VL and UI-TARS-1.5 and multi-agent frameworks like CoACT and Agent S2 attempt to overcome these barriers, highlighting the disparity between laboratory evaluations and practical implementation. Finally, I will outline promising avenues for advancing CUA capabilities, covering integrated learning methodologies, enhanced evaluation protocols incorporating real-world complexity, and adaptive systems that leverage failure experiences to develop more resilient digital companions.

We’re growing our core team and pursuing new projects. If you’re interested in working together, see our website for active initiatives and open positions, join the conversation on Discord, and check out our GitHub.

To see more of our updates as we explore and advance the field of Intelligent Systems, follow us on Twitter and LinkedIn!
