If you're not following eval-driven development, you're flying blind. Sohan, our founding engineer, reveals how we use lightweight evals to ship fast improvements to our AI product with confidence.
It's the battle of the AI coding agents. We put Tusk, Cursor, and Claude Code to the test to evaluate their unit test generation quality and ability to detect latent bugs on a TypeScript codebase.