Mesh Mode Tutorial (Failover Routing)
Build resilient fallback chains that keep requests alive through rate limits and provider instability.
10 min · Updated 2026-02-15
Quick Start
- Start from your current production prompt/request.
- Run the exact tutorial flow step-by-step once.
- Measure impact in Usage before rollout.
- Promote only when quality, cost, and reliability metrics meet your targets.
When to use Mesh
Use Mesh mode for reliability-sensitive traffic where a single provider failure is not acceptable.
- Frequent 429 bursts
- Provider latency spikes
- High-value requests that must complete
Mesh failover model
Primary -> fallback chain
1. Primary: try the first model.
2. Failure signal: 429, timeout, or recoverable provider error.
3. Fallback: switch to the backup model list.
4. Response: return the result and settle usage.
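The four steps above can be sketched client-side as a simple loop. This is a minimal illustration of the failover idea, not Mesh's actual implementation: `ProviderError`, `run_with_failover`, and the `call_model` callback are hypothetical names introduced here, and the status-code set is the retryable list given in the Circuit breaker section of this tutorial.

```python
from typing import Callable, Iterable, Tuple

# Status codes this tutorial lists as retryable.
RETRYABLE_STATUS = {408, 409, 425, 429, 500, 502, 503, 504}


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-style status code."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"provider returned {status}")
        self.status = status


def run_with_failover(
    primary: str,
    fallbacks: Iterable[str],
    call_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the primary model, then each fallback in order.

    Returns (model_that_answered, response). Timeouts and retryable
    status codes move on to the next model; anything else is re-raised.
    """
    last_error = None
    for model in [primary, *fallbacks]:
        try:
            return model, call_model(model)
        except TimeoutError as exc:
            last_error = exc  # failure signal: timeout
        except ProviderError as exc:
            if exc.status not in RETRYABLE_STATUS:
                raise  # not recoverable, so do not fall back
            last_error = exc  # failure signal: 429 or similar
    raise RuntimeError("all models in the chain failed") from last_error


if __name__ == "__main__":
    def fake_call(model: str) -> str:
        # Stand-in for a real provider call: the primary is rate limited.
        if model == "gpt-5.2":
            raise ProviderError(429)
        return f"summary produced by {model}"

    print(run_with_failover("gpt-5.2", ["claude-sonnet-4.5"], fake_call))
```

Passing the provider call in as a callback keeps the loop testable without network access. "Settle usage" in step 4 is left out because this page does not describe how usage is recorded.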
Mesh request example
{
  "model": "gpt-5.2",
  "routing": {
    "strategy": "rate-limit",
    "fallback": ["claude-sonnet-4.5", "gemini-3-flash", "deepseek-v3"]
  },
  "messages": [
    {"role": "user", "content": "Summarize this outage in 5 bullet points."}
  ],
  "stream": true
}
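If you assemble this body in application code, a small helper can catch typos in the strategy name before the request leaves your service. The sketch below only builds and validates the JSON shown above. `build_mesh_request` is a hypothetical helper name, and the endpoint URL and authentication are deliberately omitted because this page does not specify them.

```python
import json
from typing import Dict, List

# Strategy names taken from the "Routing strategies" table in this tutorial.
KNOWN_STRATEGIES = {"rate-limit", "latency", "cost", "round-robin"}


def build_mesh_request(
    primary: str,
    fallbacks: List[str],
    prompt: str,
    strategy: str = "rate-limit",
    stream: bool = True,
) -> Dict:
    """Return a request body shaped like the Mesh example above."""
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError(f"unknown routing strategy: {strategy!r}")
    if not fallbacks:
        raise ValueError("a Mesh request needs at least one fallback model")
    return {
        "model": primary,
        "routing": {"strategy": strategy, "fallback": list(fallbacks)},
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


if __name__ == "__main__":
    body = build_mesh_request(
        "gpt-5.2",
        ["claude-sonnet-4.5", "gemini-3-flash", "deepseek-v3"],
        "Summarize this outage in 5 bullet points.",
    )
    print(json.dumps(body, indent=2))
```

Rejecting an empty fallback list is a choice made for this sketch, not a documented API rule.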
Routing strategies
| Strategy | Best for | Behavior |
|---|---|---|
| rate-limit | 429 resilience | Prefer fallback when quota/rate limits are hit |
| latency | P95 reduction | Prefer lower-latency fallback path |
| cost | Budget control | Prefer lower-cost fallback path |
| round-robin | Traffic distribution | Cycle across configured fallbacks |
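The table does not say how Mesh ranks fallbacks internally, so the following is only one plausible reading of each row, written to make the differences concrete. `ModelStats`, `order_fallbacks`, and `request_index` are hypothetical names, and the numbers in `EXAMPLE_STATS` are made-up placeholders, not measurements of any real model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelStats:
    p95_latency_ms: float      # placeholder metric for the "latency" strategy
    cost_per_1k_tokens: float  # placeholder metric for the "cost" strategy


def order_fallbacks(
    strategy: str,
    fallbacks: List[str],
    stats: Dict[str, ModelStats],
    request_index: int = 0,
) -> List[str]:
    """Return the fallback list in the order this sketch would try it."""
    if strategy == "rate-limit":
        # Assumption: keep the configured order and move on when limits hit.
        return list(fallbacks)
    if strategy == "latency":
        return sorted(fallbacks, key=lambda m: stats[m].p95_latency_ms)
    if strategy == "cost":
        return sorted(fallbacks, key=lambda m: stats[m].cost_per_1k_tokens)
    if strategy == "round-robin":
        if not fallbacks:
            return []
        # Rotate the starting point so successive requests spread out.
        k = request_index % len(fallbacks)
        return list(fallbacks[k:]) + list(fallbacks[:k])
    raise ValueError(f"unknown routing strategy: {strategy!r}")


# Made-up placeholder figures, for illustration only.
EXAMPLE_STATS = {
    "claude-sonnet-4.5": ModelStats(p95_latency_ms=900.0, cost_per_1k_tokens=3.0),
    "gemini-3-flash": ModelStats(p95_latency_ms=400.0, cost_per_1k_tokens=0.3),
    "deepseek-v3": ModelStats(p95_latency_ms=700.0, cost_per_1k_tokens=0.2),
}

if __name__ == "__main__":
    chain = ["claude-sonnet-4.5", "gemini-3-flash", "deepseek-v3"]
    for name in ("rate-limit", "latency", "cost", "round-robin"):
        print(name, order_fallbacks(name, chain, EXAMPLE_STATS, request_index=1))
```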
Circuit breaker
Mesh uses an in-memory circuit breaker per model:
- 3 consecutive failures → circuit opens for 30 seconds
- During open state, the model is automatically skipped
- After 30 seconds, the circuit enters half-open state: one test request is allowed
- Success closes the circuit; failure reopens it for another 30 seconds
Retryable status codes: 408, 409, 425, 429, 500, 502, 503, 504.
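To make the state transitions concrete, here is a minimal in-memory breaker that follows the rules listed above: 3 consecutive failures open the circuit, it stays open for 30 seconds, and a single probe is allowed in the half-open state. It is a sketch of the general pattern, not LLMWise's code. The `CircuitBreaker` class and its method names are hypothetical, and the injectable `clock` exists only so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Per-model breaker: closed -> open -> half-open -> closed or open."""

    def __init__(
        self,
        failure_threshold: int = 3,
        open_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None
        self._probe_in_flight = False

    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.open_seconds:
            return "half-open"
        return "open"

    def allow_request(self) -> bool:
        """Return True if the model may be tried right now."""
        current = self.state()
        if current == "closed":
            return True
        if current == "half-open" and not self._probe_in_flight:
            self._probe_in_flight = True  # only one test request
            return True
        return False  # open, or a probe is already running: skip the model

    def record_success(self) -> None:
        # Any success resets the consecutive-failure count and closes the circuit.
        self._failures = 0
        self._opened_at = None
        self._probe_in_flight = False

    def record_failure(self) -> None:
        if self._opened_at is not None:
            # Failed probe: reopen for another full window.
            self._opened_at = self._clock()
            self._probe_in_flight = False
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A router would keep one breaker per model name, call `allow_request()` before trying a model, and call `record_failure()` only for the retryable status codes listed above.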
Rollout checklist
- Start with a two-model fallback chain.
- Replay recent traffic before production cutover.
- Track fallback frequency and quality deltas.
- Expand chain only after behavior is stable.
Migration tip
Use Mesh first on reliability-critical endpoints, not every endpoint. This keeps complexity and spend under control while you collect failure data.