# LLMWise

> Multi-model LLM API orchestration platform. One API key to access GPT, Claude, Gemini, DeepSeek, Llama, Mistral, Grok, and 25+ more models. Compare outputs side-by-side, blend the best parts, let AI judge, and auto-failover with circuit breakers. OpenAI-style messages, credit-based pay-per-use, no subscription.

Base URL: https://llmwise.ai
API base: https://llmwise.ai/api/v1
Auth: Bearer token (API key with `mm_sk_` prefix) or Clerk JWT
Streaming: Server-Sent Events (SSE)
Compatibility: OpenAI-style messages (role + content)

## API Endpoints

- `POST /api/v1/chat` — Single-model chat (1 credit). OpenAI-style messages + SSE streaming. Supports `model="auto"` for heuristic routing.
- `POST /api/v1/compare` — Multi-model comparison (3 credits). Sends the same prompt to 2-9 models simultaneously, streamed side-by-side.
- `POST /api/v1/blend` — Multi-model synthesis (4 credits). Six strategies: consensus, council, best_of, chain, moa (multi-layer), self_moa (single-model diversity).
- `POST /api/v1/judge` — AI evaluation (5 credits). 2-4 contestants compete; the judge model scores each 0-10 and ranks them.
- `POST /api/v1/chat` with `routing` — Mesh failover (1 credit). Circuit breaker with auto-failover on 429/500/timeout.

## Blend Strategies

- **consensus** (default): Combine the strongest points, resolve contradictions. 2-6 models, 1 layer.
- **council**: Structured deliberation — agreements, disagreements, follow-ups. 2-6 models, 1 layer.
- **best_of**: Pick the best response, enhance it with the others. 2-6 models, 1 layer.
- **chain**: Iterative sequential integration. 2-6 models, 1 layer.
- **moa**: Mixture-of-Agents multi-layer refinement. 2-6 models, 1-3 layers; models see the previous layer's answers.
- **self_moa**: Single model producing 2-8 diverse candidates via temperature variation + agent prompts. 1 model, 1 layer.

## SSE Streaming Format

All endpoints stream via SSE. Each line: `data: {JSON}`. Terminator: `data: [DONE]`.
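As an illustration, the framing above can be consumed with a few lines of Python. This is a sketch, not official client code: the `data: ` prefix and `[DONE]` sentinel come from the format description, the `delta` field from the chunk schema, and `parse_sse_line`/`accumulate` are hypothetical helper names. Feeding the parser lines from an HTTP response is left to the client.

```python
import json

DONE = object()  # sentinel representing the `data: [DONE]` terminator


def parse_sse_line(line: str):
    """Parse one line of the SSE stream.

    Returns the decoded chunk dict for `data: {JSON}` lines, DONE for the
    `data: [DONE]` terminator, and None for blank or non-data lines
    (e.g. comment/keep-alive lines).
    """
    line = line.strip()
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return DONE
    return json.loads(payload)


def accumulate(lines):
    """Join `delta` fragments from a sequence of SSE lines into full text."""
    parts = []
    for line in lines:
        chunk = parse_sse_line(line)
        if chunk is DONE:
            break
        if chunk and chunk.get("delta"):
            parts.append(chunk["delta"])
    return "".join(parts)
```

For example, two chunks carrying `"Hel"` and `"lo"` followed by the terminator accumulate to `"Hello"`.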
Standard chunk: `{model, delta, done, latency_ms, content_length}`
Final chunk adds: `{ttft_ms, prompt_tokens, completion_tokens, tokens_per_second, cost, finish_reason, full_content}`
Mesh events: `route` (trying/failed/skipped), `chunk` (content), `trace` (summary with final_model, attempts, total_ms)

## Error Codes

- 400: Bad Request (invalid body, unknown model)
- 401: Unauthorized (missing/invalid auth)
- 402: Payment Required (insufficient credits — response: `{error, credits, required}`)
- 429: Too Many Requests (rate limited — check the Retry-After header)
- 502: Bad Gateway (upstream provider error)

## Rate Limits

Token bucket, 60s window. Headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After.
Buckets (requests per 60s): chat=90, compare=45, blend=30, judge=30, uploads=30, default=180.
Multipliers: paid=1.5x, free=0.6x. IP limits: free=120/60s, paid=360/60s.

## Docs

- [Getting Started](https://llmwise.ai/docs/getting-started): Quickstart guide — API key setup, first request, streaming
- [Dashboard User Guide](https://llmwise.ai/docs/dashboard-user-guide): Web UI walkthrough — chat, compare, settings
- [Authentication & API Keys](https://llmwise.ai/docs/authentication-and-api-keys): Auth methods, API key management, BYOK
- [Chat API Reference](https://llmwise.ai/docs/chat-api-reference): POST /api/v1/chat — models, parameters, streaming format
- [Compare, Blend & Judge Reference](https://llmwise.ai/docs/compare-blend-judge-reference): Multi-model endpoints — compare, blend, judge
- [Blend Strategies & Algorithms](https://llmwise.ai/docs/blend-strategies-and-algorithms): Deep dive into all six blend strategies, MoA, circuit breaker, auto-router, optimization scoring
- [Mesh Mode Tutorial](https://llmwise.ai/docs/mesh-mode-tutorial): Failover routing, circuit breakers, strategies
- [Billing & Credits](https://llmwise.ai/docs/billing-and-credits): Credit system, pricing, auto top-up, settlement
- [Rate Limits & Reliability](https://llmwise.ai/docs/rate-limits-and-reliability): Rate limiting, concurrency, error handling
- [Privacy, Security & Data Controls](https://llmwise.ai/docs/privacy-security-and-data-controls): Zero-retention mode, data policies, BYOK encryption

## Guides

- [Replay Lab Tutorial](https://llmwise.ai/docs/replay-lab-tutorial): Test model switches against historical requests
- [Regression Testing](https://llmwise.ai/docs/regression-testing-tutorial): Automated quality checks across model versions
- [Semantic Memory API](https://llmwise.ai/docs/semantic-memory-api): Per-user conversation context via embeddings
- [Webhooks & Sync](https://llmwise.ai/docs/webhooks-and-sync): Clerk and Stripe webhook integration
- [API Explorer](https://llmwise.ai/docs/api-explorer-playground): Interactive API playground

## Optional

- [Full compiled docs](https://llmwise.ai/llms-full.txt): Complete platform documentation in plain text
- [Machine-readable view](https://llmwise.ai/ai): Structured HTML overview for AI agents with full API schemas
- [Landing page machine mode](https://llmwise.ai): Toggle the "Machine" button for a structured API reference
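Putting the request shape described above together (Bearer `mm_sk_` key, OpenAI-style `messages`, SSE responses, Retry-After on 429), a minimal Python sketch might look like the following. These are illustrative assumptions, not an official SDK: `build_chat_request` and `retry_after_seconds` are hypothetical helpers, and the body is limited to the `model` and `messages` fields; consult the Chat API Reference for the full parameter set.

```python
import json

API_BASE = "https://llmwise.ai/api/v1"


def build_chat_request(api_key: str, model: str, messages: list) -> tuple:
    """Assemble URL, headers, and JSON body for POST /api/v1/chat.

    Messages follow the OpenAI-style convention (role + content).
    """
    url = f"{API_BASE}/chat"
    headers = {
        "Authorization": f"Bearer {api_key}",  # mm_sk_-prefixed API key
        "Content-Type": "application/json",
        "Accept": "text/event-stream",         # responses stream via SSE
    }
    body = json.dumps({"model": model, "messages": messages}).encode()
    return url, headers, body


def retry_after_seconds(headers: dict, default: float = 1.0) -> float:
    """On a 429 response, honor the Retry-After header before retrying."""
    try:
        return float(headers.get("Retry-After", default))
    except (TypeError, ValueError):
        return default
```

A client would send the built request with any HTTP library, read the response line by line as SSE, and back off by `retry_after_seconds(response_headers)` whenever a 429 is returned.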