Prompt Regression Testing Tutorial
Create suites, run prompt regressions, schedule recurring checks, and export CSV results.
13 min read · Updated 2026-02-15
Quick Start
- Start from the prompt or request you currently run in production.
- Run the tutorial flow end to end once, step by step, before customizing it.
- Check the impact in Usage before rollout.
- Promote a change only when quality, cost, and reliability metrics meet your targets.
What this feature covers
- Prebuilt prompt templates
- Custom suite creation
- Manual and scheduled test runs
- CSV export for historical tracking
Workflow
Templates -> suite -> run -> schedule
Define
- GET /optimization/test-templates
- POST /optimization/test-suites
- Set models and cases
Execute
- POST /optimization/test-suites/{suite_id}/run
- Collect scores and latency
- Store run artifacts
Automate
- POST /optimization/regression-schedules
- POST /optimization/regression-schedules/{schedule_id}/run
- GET /optimization/test-runs/{run_id}/csv
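The scores collected in each run become useful for regression tracking once two CSV exports are compared. The following is a minimal sketch, assuming the exported CSV has a numeric `score` column. The column names, the sample rows, and the 0.05 tolerance are illustrative assumptions, not documented values, so adjust them to match a real export.

```python
import csv
import io
from statistics import mean


def mean_score(csv_text: str, score_column: str = "score") -> float:
    """Average a numeric column from an exported run CSV.

    `score_column` is a hypothetical column name; check a real export first.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return mean(float(row[score_column]) for row in rows)


def has_regressed(baseline_csv: str, current_csv: str, tolerance: float = 0.05) -> bool:
    """Return True when the current mean score falls more than `tolerance` below baseline."""
    return mean_score(baseline_csv) - mean_score(current_csv) > tolerance


# Made-up sample data standing in for two downloaded run CSVs.
baseline = "case_id,score\na,0.90\nb,0.80\n"
current = "case_id,score\na,0.70\nb,0.80\n"

print(has_regressed(baseline, current))
```

Comparing mean scores is only one possible signal. Per-case comparisons or latency checks may suit some suites better.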
Core endpoints
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/optimization/test-templates | List prebuilt templates |
| POST | /api/v1/optimization/test-suites | Create suite |
| PUT | /api/v1/optimization/test-suites/{suite_id} | Update suite |
| POST | /api/v1/optimization/test-suites/{suite_id}/run | Run suite now |
| GET | /api/v1/optimization/test-runs | List run history |
| GET | /api/v1/optimization/test-runs/{run_id}/csv | Download run CSV |
| POST | /api/v1/optimization/regression-schedules | Create schedule |
| PUT | /api/v1/optimization/regression-schedules/{schedule_id} | Update schedule |
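The paths in the table can be turned into requests with a small helper. This sketch only builds the request object and does not send it. The host `https://api.example.com` and the bearer-token `Authorization` header are placeholders assumed for illustration, so substitute the base URL and authentication scheme from your own account.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"


def build_request(method, path, api_key, body=None, **path_params):
    """Build (but do not send) a request for one of the endpoints in the table."""
    # Fill placeholders such as {suite_id} or {run_id} in the path template.
    url = BASE_URL + path.format(**path_params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Assumption: bearer-token auth; the actual scheme is not stated in this tutorial.
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


run_now = build_request(
    "POST",
    "/api/v1/optimization/test-suites/{suite_id}/run",
    api_key="YOUR_API_KEY",
    suite_id="SUITE_UUID",
)
print(run_now.get_method(), run_now.full_url)
# Sending is left to the caller, for example urllib.request.urlopen(run_now).
```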
Example suite payload
{
  "name": "Code quality regression",
  "description": "Weekly check for code prompts",
  "models": ["gpt-5.2", "claude-sonnet-4.5", "deepseek-v3"],
  "template_ids": ["code-review", "summarization"],
  "temperature": 0.2,
  "max_tokens": 800,
  "is_active": true
}
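When suites are created from scripts, it can help to assemble this payload in code rather than by hand. The sketch below mirrors the fields of the example payload. The `SuitePayload` class and its client-side checks are hypothetical conveniences, and the server's actual validation rules are not documented here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SuitePayload:
    """Hypothetical helper mirroring the fields of the example suite payload."""

    name: str
    models: list
    template_ids: list
    description: str = ""
    temperature: float = 0.2
    max_tokens: int = 800
    is_active: bool = True

    def to_json(self) -> str:
        # Client-side sanity checks only (assumed, not taken from the API docs).
        if not self.models:
            raise ValueError("at least one model is required")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        return json.dumps(asdict(self), indent=2)


suite = SuitePayload(
    name="Code quality regression",
    description="Weekly check for code prompts",
    models=["gpt-5.2", "claude-sonnet-4.5", "deepseek-v3"],
    template_ids=["code-review", "summarization"],
)
print(suite.to_json())
```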
Example schedule payload
{
  "name": "Weekly code regression",
  "suite_id": "SUITE_UUID",
  "cadence_minutes": 10080,
  "enabled": true
}
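The `cadence_minutes` value of 10080 in the example is one week expressed in minutes (7 × 24 × 60). A small sketch for deriving other cadences follows. The `cadence_minutes` helper function is hypothetical, and the restriction to whole, positive minutes is an assumption based on the field name.

```python
from datetime import timedelta


def cadence_minutes(interval: timedelta) -> int:
    """Convert a time interval into whole minutes for the schedule payload."""
    total_seconds = interval.total_seconds()
    # Assumption: the API expects a positive whole number of minutes.
    if total_seconds <= 0 or total_seconds % 60:
        raise ValueError("interval must be a positive whole number of minutes")
    return int(total_seconds // 60)


print(cadence_minutes(timedelta(weeks=1)))  # 10080, matching the example payload
print(cadence_minutes(timedelta(days=1)))   # 1440
```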
Practical baseline
Start with one suite for your top 10 production prompts and run weekly. Expand only after your scoring rubric is stable.