- Create test suites for your prompts and endpoints (for example, your AI assistant)
- Automated static validations (regex, JSON schema, response time)
- AI validation rules using LLM-as-judge for quality scoring
- Compare prompt execution results across GPT-4, Claude, and Gemini side by side, and pick the model that fits best
- Track regression history over time
- Self-hosted, your API keys stay with you
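To illustrate the kind of checks the static-validation bullet describes (a generic sketch in plain Python, not this tool's actual API), a response can be tested against a regex rule, a JSON shape, and a latency budget; the function name and thresholds here are hypothetical:

```python
import json
import re

def validate_response(text: str, latency_ms: float) -> list[str]:
    """Run static checks on an LLM response; return a list of failure messages."""
    failures = []

    # Regex rule: flag a leaked boilerplate phrase in the output.
    if re.search(r"(?i)as an ai language model", text):
        failures.append("regex: response contains forbidden boilerplate")

    # JSON shape rule: response must parse and contain a string field 'answer'.
    try:
        payload = json.loads(text)
        if not isinstance(payload.get("answer"), str):
            failures.append("schema: missing string field 'answer'")
    except json.JSONDecodeError:
        failures.append("schema: response is not valid JSON")

    # Response-time rule: fail if latency exceeds a 2000 ms budget.
    if latency_ms > 2000:
        failures.append(f"latency: {latency_ms} ms exceeds 2000 ms budget")

    return failures

# A well-formed, fast response passes all three checks.
print(validate_response('{"answer": "42"}', 150.0))  # → []
```

A real test suite would typically combine many such rules per endpoint and record pass/fail results per run, which is what makes regression tracking over time possible.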
It integrates with GitLab CI and provides webhooks out of the box.
If you're building with LLMs and want confidence that your prompts actually work, check it out.