Prompt Agent Evals
This suite validates the Prompt Intelligence Agent's output before the agent calls a model to generate code.
Fixtures
The fixtures live in src/lib/generation/evals/prompt-agent-cases.json and cover:
- ambiguous briefs
- conflicting style requests
- long production prompts
- reference links
- revision requests
- Chinese creative direction
- dashboard interfaces
- marketing pages
- strict brand constraints
- missing design-system input
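As a rough illustration, a coverage check over the fixture categories listed above might look like the sketch below. The `PromptAgentCase` shape, the category slugs, and `missingCategories` are all hypothetical names chosen for this example; the actual schema of prompt-agent-cases.json may differ.

```typescript
// Hypothetical category slugs, one per coverage area listed above.
type FixtureCategory =
  | "ambiguous-brief"
  | "conflicting-style"
  | "long-production-prompt"
  | "reference-links"
  | "revision-request"
  | "chinese-creative-direction"
  | "dashboard-interface"
  | "marketing-page"
  | "strict-brand-constraints"
  | "missing-design-system";

// Hypothetical fixture shape; the real JSON schema is not documented here.
interface PromptAgentCase {
  id: string;
  category: FixtureCategory;
  prompt: string;
}

const ALL_CATEGORIES: FixtureCategory[] = [
  "ambiguous-brief",
  "conflicting-style",
  "long-production-prompt",
  "reference-links",
  "revision-request",
  "chinese-creative-direction",
  "dashboard-interface",
  "marketing-page",
  "strict-brand-constraints",
  "missing-design-system",
];

// Returns the categories that no fixture in `cases` covers.
function missingCategories(cases: PromptAgentCase[]): FixtureCategory[] {
  const seen = new Set(cases.map((c) => c.category));
  return ALL_CATEGORIES.filter((category) => !seen.has(category));
}
```

A check like this could fail the run whenever `missingCategories` returns a non-empty list, so that removing a fixture does not silently drop a coverage area.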
Scoring
Each fixture checks the structure of the agent's output rather than visual taste:
- intent profile is present
- decision board captures tradeoffs and conflicts
- at least 6 design-system panels are expected
- prompt template payload is present
- prompt trace is present when evidence matters
- Prompt QA can block code generation
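The checks above can be sketched as a single structural scorer. This is a minimal sketch under stated assumptions, not the suite's actual implementation: the `AgentOutput` and `CaseExpectations` field names and the `scoreCase` function are hypothetical, and only the threshold of 6 panels comes from the list above.

```typescript
// Hypothetical output shape; the agent's real field names may differ.
interface AgentOutput {
  intentProfile?: unknown;
  decisionBoard?: { tradeoffs?: string[]; conflicts?: string[] };
  designSystemPanels?: unknown[];
  promptTemplatePayload?: unknown;
  promptTrace?: unknown;
  promptQa?: { blocksCodeGeneration?: boolean };
}

// Hypothetical per-fixture expectations.
interface CaseExpectations {
  minPanels: number;       // 6 according to the scoring list
  requiresTrace: boolean;  // true when evidence matters for the fixture
  expectQaBlock?: boolean; // set when the fixture asserts a specific QA verdict
}

// Returns human-readable failures; an empty array means the case passes.
function scoreCase(output: AgentOutput, expected: CaseExpectations): string[] {
  const failures: string[] = [];

  if (output.intentProfile == null) failures.push("intent profile is missing");

  const board = output.decisionBoard;
  if (!board || !Array.isArray(board.tradeoffs) || !Array.isArray(board.conflicts)) {
    failures.push("decision board does not capture tradeoffs and conflicts");
  }

  const panelCount = output.designSystemPanels?.length ?? 0;
  if (panelCount < expected.minPanels) {
    failures.push(`expected at least ${expected.minPanels} design-system panels, got ${panelCount}`);
  }

  if (output.promptTemplatePayload == null) failures.push("prompt template payload is missing");

  if (expected.requiresTrace && output.promptTrace == null) {
    failures.push("prompt trace is missing");
  }

  // Prompt QA must expose a verdict that is able to block code generation.
  const verdict = output.promptQa?.blocksCodeGeneration;
  if (typeof verdict !== "boolean") {
    failures.push("Prompt QA verdict is missing");
  } else if (expected.expectQaBlock !== undefined && verdict !== expected.expectQaBlock) {
    failures.push(`expected Prompt QA block=${expected.expectQaBlock}, got ${verdict}`);
  }

  return failures;
}
```

Returning a list of failures instead of a single boolean keeps the report readable when one fixture fails several checks at once.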
Run
npm run eval:prompt-agent
The --live flag is reserved for a later model-backed runner. The default runner is fixture-only and does not call an LLM.