InstaFormer: Holistic Order Prediction in Natural Scenes

Hosted by Daniel Kang
Past Event
About Event

Hosted by: AER LABS
Date: Saturday, December 13, 2025
Time: 12:00 PM - 1:00 PM SG
Location: Network School Library


🌿 About the Event

Understanding instance-wise geometries in controlled settings is already a challenging task for many visual models. In the wild, this becomes even harder.

While specialized systems exist, state-of-the-art approaches often rely on expensive inputs (such as category labels or binary segmentation masks) and incur high inference costs, requiring a number of forward passes that grows quadratically with the number of instances in the scene.
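To make the quadratic cost concrete, here is a minimal sketch that counts forward passes under the assumption that a pairwise method scores one instance pair per pass. The function names are hypothetical and only illustrate the arithmetic; they are not taken from InstaFormer or any prior method.

```python
def pairwise_forward_passes(num_instances: int, ordered: bool = True) -> int:
    """Passes needed if each pass handles exactly one pair (an assumption).

    ordered=True counts (i, j) and (j, i) separately; ordered=False counts
    each unordered pair once.
    """
    n = num_instances
    return n * (n - 1) if ordered else n * (n - 1) // 2


def holistic_forward_passes(num_instances: int) -> int:
    """A holistic predictor handles all instances in one pass."""
    return 1


if __name__ == "__main__":
    for n in (2, 10, 50):
        print(n, pairwise_forward_passes(n, ordered=False), holistic_forward_passes(n))
```

With 50 instances, the unordered pairwise count is already 1,225 passes against a single pass for a holistic predictor.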

In this talk, Pierre Musacchio (Ph.D. Student, SNU) will present InstaFormer, a novel network designed to mitigate these limitations through holistic order prediction.

Key Takeaways:

  • Single Forward Pass: Unlike previous methods, InstaFormer returns full occlusion and depth orderings for all instances in a scene from an RGB image alone.

  • Novel Architecture: Learn how the network leverages interactions between object queries and latent mask descriptors to represent objects semantically while carrying complementary geometric information.

  • Efficiency: Discover how this approach achieves superior results with a favorable inference-memory trade-off.

  • Future Applications: Discussion on how InstaFormer opens the door to zero-shot geometric understanding using Vision-Language Models (VLMs).
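For readers new to order prediction, the sketch below shows one way a full depth ordering over all instances could be represented and read out. The N×N boolean matrix `closer` and the function `depth_ranking` are assumptions made for illustration; the announcement does not describe InstaFormer's actual output format.

```python
def depth_ranking(closer):
    """Return instance indices sorted front to back.

    closer[i][j] is True when instance i is judged closer to the camera
    than instance j (hypothetical format). Instances are ranked by how many
    others they are in front of, which recovers the order whenever the
    pairwise judgments are mutually consistent.
    """
    n = len(closer)
    wins = [sum(1 for j in range(n) if j != i and closer[i][j]) for i in range(n)]
    return sorted(range(n), key=lambda i: -wins[i])


if __name__ == "__main__":
    # Three instances: 2 is in front of 0 and 1, and 0 is in front of 1.
    closer = [
        [False, True, False],
        [False, False, False],
        [True, True, False],
    ]
    print(depth_ranking(closer))
```

An occlusion ordering could be stored the same way, with the difference that occlusion is defined only for pairs that overlap in the image, so many entries would be empty.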


🎤 About the Speaker

Pierre Musacchio
Ph.D. Student, Computer Science & Engineering, Seoul National University

Pierre is a French Ph.D. student at the VGI Lab (Visual & Geometric Intelligence Lab) at Seoul National University. Previously, he was a member of the CV Lab at POSTECH.

His research focuses on instance-wise understanding and fundamental challenges in computer vision. He is passionate about geometric deep learning and is actively seeking collaboration with researchers interested in pushing the boundaries of scene understanding.
