

Build Night: Web Research Agents That Don’t Break in Prod x HackerSquad
In 3 hours, build a deep‑research agent that returns structured, citation-backed JSON then challenge it with the problems that kill agents in production (JS pages, blocks, rate limits) and run a small scale burst. Starter repo + credits included.
What we’re building
Project: Deep Research → Structured JSON (with citations)
Input: a company product / market question
Output: schema‑validated JSON + citations pulled from multiple public web sources
Then: flip on scale mode (concurrency + bounded retries + basic metrics) and see if it holds up
This is a build night, not a talk. You’ll ship something you can fork and deploy.
The real problems we’ll solve (hands-on)
If you’ve built agents/RAG before, you already know the hard part isn’t the model—it’s the web + ops.
We’ll work through the failure modes that make “it works on my laptop” die in production:
JS-heavy pages where HTML fetch returns placeholders
Inconsistent results across session/geo (locale, consent, interstitials, A/B variants)
Blocks + friction that appear only at volume
Rate limits + burstiness (works at 1 rps, fails at 20)
Bad retry behavior (duplicate work, runaway costs, thundering herd)
Silent extraction breakage (you get wrong structured data that “looks” right)
No observability (can’t explain failures or cost per task)
You will see how the same agent goes from single-run demo → repeatable pipeline with measurable success rate.
What you’ll walk away with
A deployable repo (Node/Python starter) that includes:
Multi-source retrieval + citation capture
Schema-first extraction (Zod/Pydantic) + validation checks
“Fast path → browser fallback” strategy
Session/geo controls (when needed for consistency)
Concurrency + queue pattern (safe parallelism)
Retry/backoff that’s bounded and measurable
Basic metrics output (success rate / latency + failure reasons)
Plus: credits to keep running it after the event.
Who should come
Advanced devs building agents, research workflows, monitoring/enrichment, competitive intel
AI engineers, full-stack devs, founders shipping agentic features
If you’ve said “it worked yesterday” about a web pipeline, you’ll fit right in
Not intended for “Scraping 101.”