ML Pub Club #29: Building Jarvis - CROZ Self-Hosted AI Platform at Scale
This one goes beyond “we plugged in an API and it works.” 🍻
At the next ML Pub Club, we’re breaking down what it actually looks like to build and run a self-hosted AI platform in a real company environment, with all the constraints that come with it.
Jarvis is CROZ’s internal AI platform built on OpenShift AI, vLLM and open-source models, designed for full control over data, models and access. No black boxes, no outsourcing core intelligence.
Today, it serves 400 employees and powers a growing set of agentic workflows across both technical and non-technical processes. We’ll go through how it started, why they decided to build it in-house, and what it takes to keep it running as usage scales.
On the performance side, we’re getting into the details: kernel-level optimizations like FlashInfer for attention and MoE experts, DeepGEMM for dense layers, plus batching strategies that pushed throughput close to 2× while cutting tail latency.
🎤 This session is led by Petar Zrinscak, an AI consultant and engineering leader currently focused on building self-hosted AI platforms and large language model infrastructure for enterprise customers. With a background in development, integration, API management, and AI inference, he designs and deploys AI platforms that emphasize control, security, and observability across the full lifecycle of model development and deployment.
As CEO of RIWARE Development, he also leads a team of 25 engineers delivering complex software projects for clients in Croatia and beyond. He actively shares practical insights on AI infrastructure, vLLM, and enterprise AI automation via the Enterprise AI Substack.
🗓️ Tuesday, May 5th. 🕕 18:00.
📍 CroAI HQ (Zavrtnica 17, 4th floor, Zagreb).
🔗 Registrations via Luma.
See you there? 👀
