Prompt Regression Testing Tutorial

Create suites, run prompt regressions, schedule recurring checks, and export CSV results.

13 min · Updated 2026-02-15

Quick Start

Start from your current production prompt or request.
Run the full tutorial flow once, step by step.
Measure the impact in Usage before rollout.
Promote a change only when its quality, cost, and reliability metrics meet your targets.

What this feature covers

Prebuilt prompt templates
Custom suite creation
Manual and scheduled test runs
CSV export for historical tracking

Workflow

Templates -> suite -> run -> schedule

Define

GET /optimization/test-templates
POST /optimization/test-suites
Set models and cases

Execute

POST /optimization/test-suites/{suite_id}/run
Collect scores and latency
Store run artifacts

Automate

POST /optimization/regression-schedules
POST /optimization/regression-schedules/{schedule_id}/run
GET /optimization/test-runs/{run_id}/csv
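Once a run's CSV has been downloaded, it can be loaded for historical tracking. The sketch below shows one way to do that with Python's standard library. The column names (`model`, `score`, `latency_ms`) and the sample rows are hypothetical; this tutorial does not document the export's schema, so check the header row of a real export before relying on them.

```python
import csv
import io


def load_run_csv(csv_text: str) -> list[dict]:
    """Parse exported CSV text into a list of row dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def mean_by(rows: list[dict], group_col: str, value_col: str) -> dict:
    """Average a numeric column per group, for example the mean score per model."""
    totals: dict = {}
    for row in rows:
        key = row[group_col]
        total, count = totals.get(key, (0.0, 0))
        totals[key] = (total + float(row[value_col]), count + 1)
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical sample; a real export may use different columns.
SAMPLE_CSV = """model,score,latency_ms
gpt-5.2,0.90,1200
gpt-5.2,0.80,1000
deepseek-v3,0.70,900
"""

rows = load_run_csv(SAMPLE_CSV)
mean_scores = mean_by(rows, "model", "score")
mean_latency = mean_by(rows, "model", "latency_ms")
```

Keeping the parser independent of specific column names means only the `mean_by` calls need to change if the real header differs.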

Core endpoints

Method	Path	Purpose
GET	/api/v1/optimization/test-templates	List prebuilt templates
POST	/api/v1/optimization/test-suites	Create suite
PUT	/api/v1/optimization/test-suites/{suite_id}	Update suite
POST	/api/v1/optimization/test-suites/{suite_id}/run	Run suite now
GET	/api/v1/optimization/test-runs	List run history
GET	/api/v1/optimization/test-runs/{run_id}/csv	Download run CSV
POST	/api/v1/optimization/regression-schedules	Create schedule
PUT	/api/v1/optimization/regression-schedules/{schedule_id}	Update schedule
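The table above can be kept in client code as a small lookup, so path templates live in one place. This is a minimal sketch: the methods and paths are copied from the table, while the key names (`run_suite`, `run_csv`, and so on) and the `resolve` helper are illustrative and not part of the API.

```python
# Methods and path templates copied from the endpoint table.
# The dictionary keys are illustrative names, not API identifiers.
ENDPOINTS = {
    "list_templates": ("GET", "/api/v1/optimization/test-templates"),
    "create_suite": ("POST", "/api/v1/optimization/test-suites"),
    "update_suite": ("PUT", "/api/v1/optimization/test-suites/{suite_id}"),
    "run_suite": ("POST", "/api/v1/optimization/test-suites/{suite_id}/run"),
    "list_runs": ("GET", "/api/v1/optimization/test-runs"),
    "run_csv": ("GET", "/api/v1/optimization/test-runs/{run_id}/csv"),
    "create_schedule": ("POST", "/api/v1/optimization/regression-schedules"),
    "update_schedule": ("PUT", "/api/v1/optimization/regression-schedules/{schedule_id}"),
}


def resolve(name: str, **params: str) -> tuple[str, str]:
    """Return (method, path) with any {placeholders} filled from params.

    Raises KeyError if the endpoint name is unknown or a placeholder is missing.
    """
    method, template = ENDPOINTS[name]
    return method, template.format(**params)


method, path = resolve("run_suite", suite_id="SUITE_UUID")
```

Here `resolve("run_suite", suite_id="SUITE_UUID")` yields the `POST` method with the suite ID substituted into the path.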

Example suite payload

{
  "name": "Code quality regression",
  "description": "Weekly check for code prompts",
  "models": ["gpt-5.2", "claude-sonnet-4.5", "deepseek-v3"],
  "template_ids": ["code-review", "summarization"],
  "temperature": 0.2,
  "max_tokens": 800,
  "is_active": true
}
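A payload like the one above could be sent with Python's standard library, as sketched below. The base URL is a placeholder and the bearer-token header is an assumed auth scheme; neither is specified in this tutorial, so substitute the values from your account. The function only builds the request, so it can be inspected before anything is sent.

```python
import json
import urllib.request

# Assumption: placeholder host. Replace with your actual API base URL.
BASE_URL = "https://api.example.com/api/v1"


def build_create_suite_request(api_key: str, suite: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a test suite."""
    return urllib.request.Request(
        url=f"{BASE_URL}/optimization/test-suites",
        data=json.dumps(suite).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; confirm the scheme for your account.
            "Authorization": f"Bearer {api_key}",
        },
    )


suite = {
    "name": "Code quality regression",
    "description": "Weekly check for code prompts",
    "models": ["gpt-5.2", "claude-sonnet-4.5", "deepseek-v3"],
    "template_ids": ["code-review", "summarization"],
    "temperature": 0.2,
    "max_tokens": 800,
    "is_active": True,
}

request = build_create_suite_request("YOUR_API_KEY", suite)
# To send it: response = urllib.request.urlopen(request)
```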

Example schedule payload

{
  "name": "Weekly code regression",
  "suite_id": "SUITE_UUID",
  "cadence_minutes": 10080,
  "enabled": true
}
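The `cadence_minutes` value of 10080 is one week expressed in minutes (7 × 24 × 60). To avoid hand-computing such values, the interval can be derived from a `timedelta`, as in this sketch. The helper names are illustrative, and the whole-minute check is a defensive assumption about what the field accepts, not a documented rule.

```python
from datetime import timedelta


def cadence_minutes(interval: timedelta) -> int:
    """Convert an interval to whole minutes for the cadence_minutes field."""
    total_seconds = interval.total_seconds()
    if total_seconds <= 0 or total_seconds % 60 != 0:
        raise ValueError("interval must be a positive whole number of minutes")
    return int(total_seconds // 60)


def build_schedule_payload(name: str, suite_id: str, interval: timedelta,
                           enabled: bool = True) -> dict:
    """Assemble a schedule payload using the fields from the example above."""
    return {
        "name": name,
        "suite_id": suite_id,
        "cadence_minutes": cadence_minutes(interval),
        "enabled": enabled,
    }


weekly = build_schedule_payload("Weekly code regression", "SUITE_UUID", timedelta(weeks=1))
daily = build_schedule_payload("Daily code regression", "SUITE_UUID", timedelta(days=1))
```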

Practical baseline

Start with one suite for your top 10 production prompts and run weekly. Expand only after your scoring rubric is stable.
