Automatic quality scoring
Every task completion gets a quality observation within 1-10 seconds. View the summary and the 0-100 score in the tracing UI.
- LLM-as-a-judge on every completion
- Score from 0-100 with detailed observations
- Paragraph summary of completion quality
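
As a rough illustration of what one such observation might hold, the sketch below models a score, a summary, and detailed notes, and parses them from a judge model's JSON reply. The names `QualityObservation` and `parse_judge_reply`, the JSON field names, and the clamping behavior are all assumptions made for this example; they are not the product's actual schema or API, and the judge reply is a hard-coded stand-in rather than a real LLM call.

```python
import json
from dataclasses import dataclass, field


@dataclass
class QualityObservation:
    """Hypothetical record for one judged completion (not the product's schema)."""
    score: int                                   # 0-100, higher is better
    summary: str                                 # paragraph summary of quality
    observations: list[str] = field(default_factory=list)  # detailed notes


def parse_judge_reply(raw: str) -> QualityObservation:
    """Parse a judge model's JSON reply, clamping the score to 0-100."""
    data = json.loads(raw)
    # Round to an integer and clamp, in case the judge drifts out of range.
    score = max(0, min(100, int(round(float(data["score"])))))
    return QualityObservation(
        score=score,
        summary=str(data.get("summary", "")).strip(),
        observations=[str(o) for o in data.get("observations", [])],
    )


# Stand-in for a real judge reply (assumed format, for illustration only).
example_reply = json.dumps({
    "score": 87.4,
    "summary": "The completion answers the task accurately but omits edge cases.",
    "observations": ["Correct final answer", "No handling of empty input"],
})

obs = parse_judge_reply(example_reply)
print(obs.score, "-", obs.summary)
```

Clamping the score keeps every stored value inside the advertised 0-100 range even if the judge returns something slightly off; whether the real system clamps, rejects, or retries in that case is not stated here.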
