- Grok & DeepSeek provider support
- Run individual test cases
- Multi-select and run specific tests
- Drag-to-reorder test cases
- Suspend/resume tests without deleting them
- LLM Judge now works with all providers

Your prompts deserve better QA than "looks good to me."