Mesh Mode Tutorial (Failover Routing)

Build resilient fallback chains that keep requests alive through rate limits and provider instability.

10 min · Updated 2026-02-15

Quick Start
  1. Start from your current production prompt/request.
  2. Run the exact tutorial flow step-by-step once.
  3. Measure impact in Usage before rollout.
  4. Promote only when quality, cost, and reliability metrics meet your targets.

When to use Mesh

Use Mesh mode for reliability-sensitive traffic where a single provider failure is not acceptable.

  • Frequent 429 bursts
  • Provider latency spikes
  • High-value requests that must complete

Mesh failover model

Primary -> fallback chain

  1. Primary: try the first model.
  2. Failure signal: 429, timeout, or recoverable provider error.
  3. Fallback: switch to the backup model list.
  4. Response: return the result and settle usage.
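
The four steps above can be sketched as a simple loop. This is an illustrative client-side model of the flow, not Mesh's actual implementation: `ProviderError`, `call_with_fallback`, and the `call_model` callback are hypothetical names, and the retryable set is copied from the status codes listed later in this tutorial.

```python
from typing import Callable, Optional, Sequence, Tuple

# Status codes this tutorial lists as retryable.
RETRYABLE = {408, 409, 425, 429, 500, 502, 503, 504}


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""

    def __init__(self, status: int):
        super().__init__(f"provider returned {status}")
        self.status = status


def call_with_fallback(
    primary: str,
    fallbacks: Sequence[str],
    call_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the primary model, then each fallback in order.

    Returns (model_used, result). Non-retryable errors are raised at once.
    """
    last_error: Optional[Exception] = None
    for model in [primary, *fallbacks]:
        try:
            return model, call_model(model)
        except ProviderError as err:
            if err.status not in RETRYABLE:
                raise  # not a recoverable failure signal
            last_error = err  # recoverable: move on to the next model
        except TimeoutError as err:
            last_error = err  # timeouts also trigger fallback
    raise RuntimeError("all models in the chain failed") from last_error
```

Passing a `call_model` that raises `ProviderError(429)` for the primary and succeeds for the first fallback returns the fallback's name together with its result.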

Mesh request example

{
  "model": "gpt-5.2",
  "routing": {
    "strategy": "rate-limit",
    "fallback": ["claude-sonnet-4.5", "gemini-3-flash", "deepseek-v3"]
  },
  "messages": [
    {"role": "user", "content": "Summarize this outage in 5 bullet points."}
  ],
  "stream": true
}
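
To build the same body programmatically, a minimal Python sketch could look like the following. `build_mesh_request` is a hypothetical helper, not part of any LLMWise SDK. The field names are taken from the JSON above and the strategy names from the Routing strategies section. The endpoint and authentication are not covered in this tutorial, so the sketch stops at producing the JSON string.

```python
import json
from typing import Any, Dict, List

# Strategy names as listed in the Routing strategies section.
STRATEGIES = {"rate-limit", "latency", "cost", "round-robin"}


def build_mesh_request(
    prompt: str,
    primary: str,
    fallbacks: List[str],
    strategy: str = "rate-limit",
    stream: bool = True,
) -> Dict[str, Any]:
    """Assemble a request body shaped like the example above."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown routing strategy: {strategy!r}")
    if not fallbacks:
        raise ValueError("a fallback chain needs at least one model")
    return {
        "model": primary,
        "routing": {"strategy": strategy, "fallback": list(fallbacks)},
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


body = build_mesh_request(
    "Summarize this outage in 5 bullet points.",
    primary="gpt-5.2",
    fallbacks=["claude-sonnet-4.5", "gemini-3-flash", "deepseek-v3"],
)
payload = json.dumps(body, indent=2)  # ready to send as the request body
```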

Routing strategies

Strategy     Best for              Behavior
rate-limit   429 resilience        Prefer fallback when quota/rate limits are hit
latency      P95 reduction         Prefer lower-latency fallback path
cost         Budget control        Prefer lower-cost fallback path
round-robin  Traffic distribution  Cycle across configured fallbacks
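
One way to picture how the strategies differ is as different orderings of the same fallback list. The sketch below is an assumption made for illustration, not a description of how Mesh selects models internally. `order_fallbacks`, the `stats` mapping, and its `p95_ms` and `cost` keys are hypothetical, and treating `rate-limit` as "keep the configured order" is also an assumption.

```python
from typing import Dict, List


def order_fallbacks(
    strategy: str,
    fallbacks: List[str],
    stats: Dict[str, Dict[str, float]],
    request_index: int = 0,
) -> List[str]:
    """Return the order in which fallbacks would be tried for one request."""
    if strategy == "rate-limit":
        # Assumption: configured order is kept; fallback happens on 429s.
        return list(fallbacks)
    if strategy == "latency":
        # Lowest observed P95 latency first.
        return sorted(fallbacks, key=lambda m: stats[m]["p95_ms"])
    if strategy == "cost":
        # Cheapest model first.
        return sorted(fallbacks, key=lambda m: stats[m]["cost"])
    if strategy == "round-robin":
        if not fallbacks:
            return []
        # Rotate the starting point by one position per request.
        k = request_index % len(fallbacks)
        return list(fallbacks[k:]) + list(fallbacks[:k])
    raise ValueError(f"unknown routing strategy: {strategy!r}")
```

The `stats` values would come from your own measurements; `request_index` is any counter that increases by one per request.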

Circuit breaker

Mesh uses an in-memory circuit breaker per model:

  • 3 consecutive failures → circuit opens for 30 seconds
  • During open state, the model is automatically skipped
  • After 30 seconds, the circuit enters half-open state: one test request is allowed
  • Success closes the circuit; failure reopens it for another 30 seconds

Retryable status codes: 408, 409, 425, 429, 500, 502, 503, 504.
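
The state machine described above can be sketched in a few lines of Python. This is a minimal single-threaded model built from the numbers in this section (3 failures, 30 seconds), not Mesh's own code. The `CircuitBreaker` class and its method names are hypothetical, and the injectable `clock` parameter exists only to make the sketch testable.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Per-model breaker: closed -> open -> half-open -> closed or open."""

    def __init__(
        self,
        failure_threshold: int = 3,
        open_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self._clock = clock
        self._failures = 0                       # consecutive failures
        self._opened_at: Optional[float] = None  # None means closed
        self._probe_in_flight = False            # half-open test request

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        if self._clock() - self._opened_at < self.open_seconds:
            return False  # open: skip this model
        if self._probe_in_flight:
            return False  # half-open: only one test request at a time
        self._probe_in_flight = True
        return True

    def record_success(self) -> None:
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        self._probe_in_flight = False

    def record_failure(self) -> None:
        if self._probe_in_flight:
            # The half-open test failed: reopen for another full window.
            self._opened_at = self._clock()
            self._probe_in_flight = False
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A router would keep one breaker per model name, call `allow_request()` before trying a model, and report the outcome with `record_success()` or `record_failure()`.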

Rollout checklist

  1. Start with a two-model fallback chain.
  2. Replay recent traffic before production cutover.
  3. Track fallback frequency and quality deltas.
  4. Expand the chain only after behavior is stable.

Migration tip

Use Mesh first on reliability-critical endpoints, not every endpoint. This keeps complexity and spend under control while you collect failure data.
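
Collecting failure data, as step 3 of the rollout checklist suggests, can start with a fallback-rate calculation over your own request logs. The sketch below assumes each log record is a `(requested_model, served_model)` pair. That record shape and the `fallback_stats` helper are hypothetical, not something the Usage page is documented to export.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple


def fallback_stats(records: Iterable[Tuple[str, str]]) -> Dict[str, Any]:
    """Summarize how often a request was served by a model other than the primary."""
    total = 0
    fallbacks = 0
    served_by: Counter = Counter()
    for requested, served in records:
        total += 1
        served_by[served] += 1
        if served != requested:
            fallbacks += 1  # a fallback model answered this request
    return {
        "total": total,
        "fallback_rate": fallbacks / total if total else 0.0,
        "served_by": dict(served_by),
    }
```

Comparing this rate before and after adding a model to the chain gives a simple signal for whether the chain is stable enough to expand.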
