Probefish v1.0.0 "Marlin": Production-Ready LLM Testing

We're excited to announce Probefish v1.0.0, codenamed Marlin. This release marks our transition to a production-ready platform with SaaS offerings while keeping self-hosted deployments free forever.

What's New

SaaS Launch with Self-Hosted Freedom

Probefish is now available as a managed cloud service with flexible plans. But here's the important part: self-hosted remains free forever. No feature gates, no artificial limitations. Deploy on your own infrastructure and get the full platform.

Why both options?

  • Cloud: Zero setup, automatic updates, we handle the infrastructure
  • Self-hosted: Full control, data stays on your servers, no ongoing costs

Per-Test-Case Validation Rules

Previously, validation rules applied to an entire test suite. Now you can set different validation rules for each test case.

Why it matters:

Different inputs need different validation. A test case for "Write a haiku" needs length constraints. A test case for "Explain quantum computing" needs keyword checks. Now each can have its own rules.

Example:

Test Case              Validation Rules
Generate haiku         Max length: 100, Contains line breaks
Explain REST APIs      Contains "HTTP", Contains "endpoint"
Translate to French    Regex: French characters present

No more one-size-fits-all validation.
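To make the idea concrete, here is a minimal Python sketch of per-test-case rules that mirrors the example table. It is an illustration only, not Probefish's internal model: the names PromptCase, max_length, contains, and matches are hypothetical, and the French-character regex is just one plausible pattern.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# A rule takes the model output and returns True when the output passes.
Rule = Callable[[str], bool]


def max_length(limit: int) -> Rule:
    return lambda output: len(output) <= limit


def contains(text: str) -> Rule:
    return lambda output: text in output


def matches(pattern: str) -> Rule:
    compiled = re.compile(pattern)
    return lambda output: compiled.search(output) is not None


@dataclass
class PromptCase:
    """Hypothetical test case that carries its own validation rules."""
    name: str
    rules: List[Rule] = field(default_factory=list)

    def validate(self, output: str) -> bool:
        # The case passes only when every one of its own rules passes.
        return all(rule(output) for rule in self.rules)


haiku = PromptCase("Generate haiku", [max_length(100), contains("\n")])
rest = PromptCase("Explain REST APIs", [contains("HTTP"), contains("endpoint")])
french = PromptCase("Translate to French", [matches(r"[àâçéèêëîïôùûüÿœ]")])
```

Each case is checked against its own list, so a long REST explanation is never rejected by the haiku's length limit.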


Real-Time Streaming Results (SSE)

Waiting for all tests to complete before seeing results? Not anymore.

Probefish now uses Server-Sent Events (SSE) to stream results as each test case completes. Watch your test run progress in real time:

Test 1/10: customer_greeting .......... PASSED (1.2s)
Test 2/10: product_query .............. PASSED (0.8s)
Test 3/10: refund_request ............. FAILED (validation)
Test 4/10: ... [running]

Benefits:

  • See failures immediately - don't wait for the full suite
  • Monitor long-running test suites without timeouts
  • Heartbeat keeps connections alive for extended runs
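If you want to consume a stream like this from your own tooling, the sketch below shows how a client can parse the standard SSE wire format in Python. It follows the general SSE rules (events end at a blank line, lines starting with ":" are comments that servers commonly use as heartbeats). The JSON payload shape in the usage example is an assumption for illustration, not Probefish's documented event format.

```python
import json
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in a Server-Sent Events stream."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            # Comment line, typically a heartbeat; it carries no data.
            continue
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


# Hypothetical stream contents; the field names are assumptions.
sample = [
    ": heartbeat\n",
    'data: {"test": "customer_greeting", "status": "PASSED"}\n',
    "\n",
    'data: {"test": "refund_request", "status": "FAILED"}\n',
    "\n",
]

for payload in iter_sse_data(sample):
    result = json.loads(payload)
    print(result["test"], result["status"])
```

The same generator works on any iterable of lines, so it can be fed from a streaming HTTP response as well as from a list.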

Smart JSON Detection

The "Edit Test Case" modal now automatically detects JSON in input fields and formats it accordingly.

Paste this:

{"user_id": "123", "query": "help"}

The editor recognizes it's JSON and:

  • Applies syntax highlighting
  • Validates JSON structure
  • Preserves formatting on save

No more switching between "text" and "JSON" modes manually.
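For readers curious how this kind of detection can work, here is a small Python sketch. It is one plausible heuristic, not the editor's actual implementation: the function names are hypothetical, and treating only objects and arrays as JSON (so a bare number stays plain text) is an assumption.

```python
import json


def looks_like_json(text: str) -> bool:
    """Return True when the text parses as a JSON object or array."""
    stripped = text.strip()
    # Assumption: bare scalars such as 123 or "hi" are treated as plain text.
    if not stripped.startswith(("{", "[")):
        return False
    try:
        json.loads(stripped)
    except json.JSONDecodeError:
        return False
    return True


def pretty(text: str) -> str:
    """Reformat valid JSON with indentation; return other input unchanged."""
    if not looks_like_json(text):
        return text
    return json.dumps(json.loads(text), indent=2)
```

Checking the first character before parsing keeps ordinary prompts from being parsed at all, and the parse step rejects text that merely starts with a brace.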


URL-Based Tab Persistence

Ever refreshed the page and lost your place? Fixed.

Tab selections in test suites and projects now persist in the URL hash:

/projects/my-project/test-suites/auth-tests#results

Benefits:

  • Share links to specific tabs with teammates
  • Browser back/forward works naturally
  • Refresh doesn't reset your view
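The mechanism is simply the URL fragment. As a rough sketch of the logic (written in Python for consistency, although the real behavior lives in the browser), the tab can be read from and written to the fragment as shown below. The tab names other than "results" and the fallback to a default tab are assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical tab names; only "results" appears in the example URL above.
KNOWN_TABS = ("overview", "results")
DEFAULT_TAB = "overview"


def active_tab(url: str) -> str:
    """Read the selected tab from the URL fragment, falling back to a default."""
    fragment = urlsplit(url).fragment
    return fragment if fragment in KNOWN_TABS else DEFAULT_TAB


def with_tab(url: str, tab: str) -> str:
    """Return the same URL with the fragment set to the given tab."""
    return urlsplit(url)._replace(fragment=tab).geturl()
```

Because the fragment is part of the URL, the selected tab survives a refresh and travels with any link you share.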

Duplicate Validation Rules

Found a validation rule you want to reuse? Click duplicate instead of recreating it from scratch.

Works for:

  • Contains / Excludes rules
  • Length constraints
  • Regex patterns
  • Response time limits

Small feature, big time saver when building comprehensive test suites.


Full SMTP Support + Magic Links

Email infrastructure got a complete overhaul:

SMTP Support: Connect any SMTP server - Gmail, SendGrid, AWS SES, your corporate mail server. No vendor lock-in.

Magic Links: Passwordless authentication is now fully supported. Users click a link in their email to sign in - no password required.

Configuration:

SMTP_HOST=smtp.example.com
SMTP_PORT=587
SMTP_USER=your-user
SMTP_PASS=your-password
SMTP_FROM=noreply@example.com

We removed the Resend dependency - you control your email delivery.
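Before pointing Probefish at a new mail server, it can help to confirm that the credentials work. The sketch below is a standalone check using Python's standard library and the same variable names as the configuration above. It is not part of Probefish, and it assumes the server accepts STARTTLS on the configured port, which is the usual convention for port 587.

```python
import os
import smtplib
from email.message import EmailMessage


def build_message(to_addr: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text message using SMTP_FROM as the sender."""
    msg = EmailMessage()
    msg["From"] = os.environ.get("SMTP_FROM", "noreply@example.com")
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage) -> None:
    """Send the message through the server described by the SMTP_* variables."""
    host = os.environ["SMTP_HOST"]
    port = int(os.environ.get("SMTP_PORT", "587"))
    with smtplib.SMTP(host, port, timeout=10) as server:
        server.starttls()  # assumption: the server supports STARTTLS
        server.login(os.environ["SMTP_USER"], os.environ["SMTP_PASS"])
        server.send_message(msg)
```

Calling send(build_message("you@example.com", "SMTP check", "It works.")) with the variables exported should deliver a test email, or raise an smtplib exception that points at the failing step.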


Bug Fixes

  • Test case checkbox fix: Previously, newly added test cases didn't show selection checkboxes until the suite was saved and the page reloaded. Now they appear immediately.
  • Various UI polish and minor improvements

What's Next

Version 1.1.0 is already in development with:

  • Multi-turn conversation testing - Test chat flows, not just single prompts
  • Human-readable identifiers - Use slugs instead of MongoDB IDs in the CLI
  • UI skeleton loaders - Better perceived performance

Thank You

Marlin represents months of work toward a production-ready platform. Whether you're using Probefish Cloud or self-hosting, we're committed to building the best LLM testing tool available.

Questions? Issues? Open a GitHub issue or reach out directly.

Happy testing!