Cloud Native x AI: The next wave of innovation
About Event

Join us for an opportunity to connect with Cloud Native and AI application developers in the Bay Area! Gain real-world insights from experts at Nutanix, Canonical and SigNoz as they dive into scaling GenAI, building AI-native infrastructure, and optimizing Cloud Native workflows. Don’t miss this great event packed with ideas, innovation, and meaningful conversations.

Location: 1740 Technology Drive, San Jose, CA 95110 (Nutanix office - Room 120 Cranium)

Agenda

  • 5:30 - 6:00 pm: Check-in/networking (food & drinks)

  • 6:00 - 6:25 pm: Talk #1 - Deploying Semantic Code Search on Cloud-Native Open-Source LLMs

  • 6:25 - 6:50 pm: Talk #2 - Build your AI cloud with open source technologies

  • 6:50 - 7:15 pm: Talk #3 - Extending Kubernetes Clusters with CAREN and ClusterClass

  • 7:15 - 7:40 pm: Talk #4 - Taming Agentic AI with Observability: Tips, Gotchas & Live Debugging

  • 7:40 - 8:00 pm: Wrap-up/networking

Talk #1: Deploying Semantic Code Search on Cloud-Native Open-Source LLMs

Speakers: Aryan Singhal and Vaishnavi Bhargava (Nutanix)

Abstract:

Deploying cloud-native applications at production scale poses significant challenges, especially when running memory-bound, inference-time decoding for large language models (LLMs). These challenges intensify in agentic architectures with retrieval-augmented generation (RAG), which require managing both non-parametric memory and large parametric models.

For code search agents, an additional hurdle is bridging the representational gap between code artifacts and natural language queries. To address this, we developed a patent-pending, zero-shot LLM agent that converts source code into natural language descriptions, first presented at the ICLR 2025 Coding Workshop.

In this talk, we share how this research innovation was transformed into a production-scale, internal developer productivity application. We cover:

  • Navigating legal, security, and privacy compliance

  • Building a batch inference API for large-scale processing

  • Overcoming GPU infrastructure constraints

  • Designing cloud-native, stateful application workflows

  • Ensuring contextual quality in generated outputs

Our goal is to provide practical insights into the deployment and operational management of cloud-native GenAI applications at production scale, informed by our real-world implementation experience.
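The retrieval side of this approach can be illustrated with a minimal sketch: code artifacts are indexed by their LLM-generated natural-language descriptions, and a query is matched against those descriptions in embedding space. Everything below is illustrative, not the speakers' implementation; the toy 3-d vectors stand in for a real embedding model's output.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical index: each code artifact is keyed by an LLM-generated
# natural-language description, paired with that description's embedding.
index = [
    ("parse_config()", "reads a YAML file and returns settings", [0.9, 0.1, 0.2]),
    ("retry_request()", "retries an HTTP call with backoff",     [0.1, 0.8, 0.3]),
]

def search(query_embedding, k=1):
    # Rank code artifacts by similarity between the query embedding
    # and the embeddings of their natural-language descriptions.
    ranked = sorted(index, key=lambda e: cosine(query_embedding, e[2]),
                    reverse=True)
    return [name for name, _, _ in ranked[:k]]

print(search([0.0, 0.9, 0.2]))  # → ['retry_request()']
```

The key idea is that matching happens between two pieces of natural language (query vs. generated description), sidestepping the code-vs-prose representational gap the abstract describes.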

Talk #2: Build your AI cloud with open source technologies

Speakers: Bruno Hildenbrand and Gustavo Sanchez (Canonical)

Abstract:

Building a powerful and scalable AI infrastructure often comes with the dual challenges of vendor lock-in and prohibitive costs. For developers dedicated to the principles of open source, navigating a landscape dominated by proprietary solutions can be a significant hurdle. This talk presents a powerful alternative: an open, flexible, and cost-effective AI cloud built on the synergy of open source technologies.

Join us to explore how we can leverage fully open source tools to create a dedicated AI environment that gives you full control. We will cover the orchestration of key AI frameworks like Kubeflow, the integration of GPU resources, and vector databases such as OpenSearch. We will also explore how organizations can save time and avoid costly reskilling by using open source software for GenAI apps, including key considerations, benefits, and common challenges.

By the end of this session, you will understand how to bypass the limitations of proprietary clouds and deploy a high-performance, future-proof AI infrastructure that is both resilient and fully customizable. Learn to build your own AI cloud and take control of your innovation.

Talk #3: Extending Kubernetes Clusters with CAREN and ClusterClass 

Speaker: Deepak Goel (Nutanix)

Abstract: 

As Kubernetes adoption grows, platform teams are challenged to balance standardization with flexibility across diverse environments. Cluster API’s ClusterClass offers a declarative, reusable blueprint for Kubernetes clusters, simplifying lifecycle management across infrastructure providers. However, real-world deployments often require environment-specific tweaks, policy enforcement, and add-on integration that go beyond static templates.

This is where CAREN (Cluster API Runtime Extensions – Nutanix) comes in. CAREN introduces a framework for runtime customization through mutation hooks, enabling operators to dynamically modify cluster definitions without altering the underlying ClusterClass. This separation of concerns promotes maintainability, accelerates testing, and makes customizations portable across providers.

In this session, we’ll explore:

  • The fundamentals of ClusterClass and how it streamlines cluster provisioning

  • How CAREN enables dynamic, provider-agnostic runtime modifications

  • Real-world use cases like setting audit policies, configuring proxies, and integrating CNIs

  • Patterns for building and testing runtime extensions effectively

By the end, you’ll understand how to use CAREN and ClusterClass together to deliver standardized yet adaptable Kubernetes clusters that meet your organization’s operational needs.
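The shape of a mutation hook can be sketched as follows. This is a conceptual illustration of the runtime-customization idea (user-supplied variables mapped to JSON patches applied to templated objects, leaving the shared ClusterClass untouched); the variable names and patch path are assumptions, not CAREN's actual schema.

```python
def generate_patches(cluster_variables):
    """Map user-supplied cluster variables to JSON patches that a runtime
    extension would apply when a cluster is templated, without modifying
    the underlying ClusterClass. Illustrative only."""
    patches = []
    proxy = cluster_variables.get("httpProxy")  # hypothetical variable name
    if proxy:
        # Inject a proxy config file into the control-plane template.
        patches.append({
            "op": "add",
            "path": "/spec/template/spec/kubeadmConfigSpec/files/-",
            "value": {"path": "/etc/proxy.conf",
                      "content": f"HTTP_PROXY={proxy}"},
        })
    return patches

print(generate_patches({"httpProxy": "http://proxy.example:3128"}))
```

Because the patch logic lives in the hook rather than the blueprint, the same ClusterClass can serve clusters with and without a proxy, which is the separation of concerns the abstract highlights.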

Talk #4: Taming Agentic AI with Observability: Tips, Gotchas & Live Debugging 

Speakers: Pranay Prateek and Goutham Karthi (SigNoz)

Abstract:

Agentic AI is cool—until it breaks in production. Running across multiple cloud-native services, agent workflows can quickly turn opaque: tool-call chains loop, LLMs hit rate limits, and latency spikes without clear root cause.

In this talk, we’ll show how to bring cloud-native observability practices to agentic AI using OpenTelemetry. You’ll learn how to:

  • Trace LLM calls, tool-call chains, and inter-service hops across distributed systems.

  • Pinpoint real bottlenecks—whether it’s an LLM, an MCP server, or an agent backend.

  • Set up alert patterns to catch infinite tool loops before they spiral out of control.

We’ll wrap with a live debugging demo, showing how observability-first design turns agentic AI from a “black box” into a manageable, cloud-native production service.
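The loop-alerting idea above can be sketched in a few lines: count tool-call spans per trace and flag any tool that fires more than a threshold number of times in one trace. In a real setup these spans would come from OpenTelemetry; here they are plain dicts, and the threshold is an assumed cutoff to be tuned per workflow.

```python
from collections import Counter

LOOP_THRESHOLD = 5  # assumed cutoff; tune per agent workflow

def detect_tool_loops(spans, threshold=LOOP_THRESHOLD):
    """spans: list of {'trace_id': ..., 'tool': ...} dicts.
    Returns the (trace_id, tool) pairs that look like runaway loops."""
    counts = Counter((s["trace_id"], s["tool"]) for s in spans)
    return {key for key, n in counts.items() if n > threshold}

# A trace where the 'search' tool fired 7 times and 'summarize' once.
spans = [{"trace_id": "t1", "tool": "search"} for _ in range(7)]
spans += [{"trace_id": "t1", "tool": "summarize"}]

print(detect_tool_loops(spans))  # → {('t1', 'search')}
```

Wired to an alerting rule, a check like this catches an infinite tool loop after a handful of iterations instead of after the bill arrives.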
