VOL. III · 2026 Šimon prague + london · est. 2012
GenAI Meetup (Prague, Czech Republic)

The Eval Flywheel: From "Works on My Laptop" to Systematic Quality

Most teams shipping GenAI products have no evaluation system. They have vibes, a few saved prompts, and hope. This talk starts with observability: if you can't see what's happening in production, nothing else matters. From there, we build the eval flywheel — a practical pattern where production observability feeds error analysis, error analysis generates eval cases, and eval cases prevent recurrence.
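As a rough illustration of the loop described above, here is a minimal Python sketch, not the implementation from the talk. Everything in it is an assumption made for the example: the `Trace` and `EvalCase` schemas, the `flagged` field standing in for whatever signal marks a bad production response, and the substring check used as the pass/fail rule. Real error analysis is usually a manual or semi-automated review rather than a simple filter.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Trace:
    """One logged production interaction (hypothetical schema)."""
    prompt: str
    output: str
    flagged: bool  # assumed signal, e.g. a thumbs-down or a failed check


@dataclass(frozen=True)
class EvalCase:
    """A regression case derived from an observed failure."""
    prompt: str
    must_not_contain: str


def analyze_errors(traces: list[Trace]) -> list[Trace]:
    """Error analysis in its simplest form: keep only flagged traces."""
    return [t for t in traces if t.flagged]


def to_eval_cases(failures: list[Trace]) -> list[EvalCase]:
    """Turn each failure into a case asserting the bad output does not recur."""
    return [EvalCase(prompt=t.prompt, must_not_contain=t.output) for t in failures]


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> list[EvalCase]:
    """Run every case against the model and return the ones it still fails."""
    return [c for c in cases if c.must_not_contain in model(c.prompt)]
```

In use, production traces would pass through `analyze_errors` and `to_eval_cases` to grow the case list, and `run_evals` would run against each candidate model or prompt change before release. An empty result means none of the recorded failures has come back.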

Slides: The Eval Flywheel: From "Works on My Laptop" to Systematic Quality
