When (& How) to Start Writing Evals
Most teams approach LLM evaluation like test-driven development: write tests first, then build. But LLMs have an effectively unbounded failure surface, so you can't predict in advance what will break. This talk argues for a different approach: deploy first, observe failures, then build evals for the patterns you've actually discovered.