
Use Multiple LLM APIs in Python with One SDK

Use the official LLMWise Python SDK (or REST) to chat with multiple models and run multi-model workflows through one API key.

You only pay credits per request. No monthly subscription. Paid credits never expire.

Replace multiple AI subscriptions with one wallet that includes routing, failover, and optimization.

Why teams start here first

No monthly subscription: pay-as-you-go credits. Start with trial credits, then buy only what you consume.
Failover safety: production-ready routing. Automatic fallback across providers when latency, quality, or reliability changes.
Data control: your policy, your choice. BYOK and zero-retention mode keep training and storage scope explicit.
Single API experience: one key, multi-provider access. Use Chat/Compare/Blend/Judge/Failover from one dashboard.
Quick start
pip install llmwise

Full example

Python
# pip install llmwise
# Repository: https://github.com/LLMWise-AI/llmwise-python-sdk
import os
from llmwise import LLMWise

client = LLMWise(os.environ["LLMWISE_API_KEY"])

# Chat (non-stream)
resp = client.chat(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
    max_tokens=512,
)
print(resp["content"])

# Streaming chat (SSE JSON events)
for ev in client.chat_stream(
    model="claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Write a Python quicksort implementation."}],
):
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
    if ev.get("event") == "done":
        print(f"\n\ncharged={ev.get('credits_charged')} remaining={ev.get('credits_remaining')}")
        break
Evidence snapshot

Python integration overview

Everything you need to integrate LLMWise's multi-model API into your Python project.

Setup steps: 6 to first API call
Features: 8 capabilities included
Models available: 9 via single endpoint
Starter credits: 40 (7-day trial) · paid credits never expire

What you get

+ Official LLMWise Python SDK (httpx-based, intentionally small)
+ OpenAI-style messages format (role + content)
+ Chat with any model by changing one string (or use model="auto")
+ Mesh failover routing (primary + fallback chain) for reliability
+ Compare / Blend / Judge multi-model orchestration modes (see the sketch after this list)
+ Streaming via Server-Sent Events (SSE) with token deltas
+ Async support via AsyncLLMWise
+ Per-request usage metadata (tokens, latency, credits) for observability
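
The Compare mode listed above runs one prompt across several models at once. A minimal sketch, assuming a synchronous client.compare() helper mirroring the async one mentioned in the FAQ; the models parameter and the shape of the returned candidates are assumptions, so check the SDK repository for the exact signature.

# Hypothetical: client.compare() and the "candidates" response shape are
# assumptions based on this page, not confirmed SDK API.
results = client.compare(
    models=["gpt-5.2", "claude-sonnet-4.5", "gemini-3-flash"],
    messages=[{"role": "user", "content": "Name three risks of schema-less databases."}],
)
for candidate in results.get("candidates", []):
    print(candidate.get("model"), "->", (candidate.get("content") or "")[:80])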

Step-by-step integration

1. Install the LLMWise Python SDK

Install the official llmwise package. Repo: https://github.com/LLMWise-AI/llmwise-python-sdk

pip install llmwise
2. Set your LLMWise API key

Sign up at llmwise.ai to get your API key. Store it as an environment variable so it stays out of your source code.

export LLMWISE_API_KEY="your_api_key_here"
3. Create a client

Instantiate the LLMWise client (base URL defaults to https://llmwise.ai/api/v1).

import os
from llmwise import LLMWise

client = LLMWise(os.environ["LLMWISE_API_KEY"])
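
If you target a staging or self-hosted deployment, the base URL can likely be overridden at construction time. A hedged sketch, assuming the constructor takes a base_url keyword (the parameter name is an assumption; the value shown is the documented default):

# Assumption: base_url is the constructor keyword; verify in the SDK repo.
client = LLMWise(
    os.environ["LLMWISE_API_KEY"],
    base_url="https://llmwise.ai/api/v1",  # the documented default endpoint
)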
4. Send a basic chat request

Pass a model ID (or model="auto"). Messages are OpenAI-style role/content objects.

resp = client.chat(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "Summarize the key ideas of REST API design."}],
)
print(resp["content"])
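
The per-request usage metadata listed under "What you get" should ride along on this same response. A hedged sketch, assuming the response dict exposes a usage entry; every key name below is illustrative rather than confirmed:

# Assumed shape: the SDK advertises tokens, latency, and credits per
# request, but these key names are guesses; inspect resp to confirm.
usage = resp.get("usage", {})
print("tokens:", usage.get("total_tokens"))
print("latency_ms:", usage.get("latency_ms"))
print("credits:", usage.get("credits_charged"))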
5. Stream tokens for real-time output

Use chat_stream() to receive SSE JSON events. Render each event's "delta" field and stop when the "done" event arrives.

for ev in client.chat_stream(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain gradient descent step by step."}],
):
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
    if ev.get("event") == "done":
        break
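
If you also want the complete answer after streaming (for logging or caching), collect the deltas while rendering them. This uses only the event fields shown above:

# Accumulate token deltas into the full response text.
chunks = []
for ev in client.chat_stream(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain gradient descent step by step."}],
):
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
        chunks.append(ev["delta"])
    if ev.get("event") == "done":
        break
full_text = "".join(chunks)  # complete answer, assembled from deltas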
6. Add failover routing (Mesh mode)

Provide a fallback chain to automatically retry on 429/5xx/timeouts. Route + trace events are emitted when failover triggers.

for ev in client.chat_stream(
    model="gpt-5.2",
    routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5", "gemini-3-flash"]},
    messages=[{"role": "user", "content": "Summarize this support thread."}],
):
    if ev.get("event") in {"route", "trace"}:
        continue
    if ev.get("event") == "done":
        break
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
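
When a fallback triggers, the route events can reveal which model actually served the request, which is useful for dashboards and postmortems. A hedged sketch that records the last route event instead of skipping it; the model field name inside the event payload is an assumption:

# Assumption: route events carry the serving model's ID; print one event
# raw (print(ev)) to confirm the actual payload keys.
served_by = None
for ev in client.chat_stream(
    model="gpt-5.2",
    routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5", "gemini-3-flash"]},
    messages=[{"role": "user", "content": "Summarize this support thread."}],
):
    if ev.get("event") == "route":
        served_by = ev.get("model")
        continue
    if ev.get("event") == "trace":
        continue
    if ev.get("event") == "done":
        break
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
print(f"\nserved_by={served_by}")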

Common questions

Can I use async Python with LLMWise?
Yes. Use AsyncLLMWise from the llmwise package. It provides async chat/compare/blend/judge helpers plus async streaming generators for SSE.
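
A minimal async sketch, assuming AsyncLLMWise mirrors the synchronous client's constructor and chat() signature (the helpers are stated above; the exact method shape is an assumption):

import asyncio
import os

from llmwise import AsyncLLMWise

async def main() -> None:
    # Assumption: same constructor and chat() signature as the sync client.
    client = AsyncLLMWise(os.environ["LLMWISE_API_KEY"])
    resp = await client.chat(
        model="auto",
        messages=[{"role": "user", "content": "One-line summary of the CAP theorem."}],
    )
    print(resp["content"])

asyncio.run(main())
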
Do I need the OpenAI Python SDK?
No. LLMWise provides its own official Python SDK. Under the hood, it calls the LLMWise REST API at https://llmwise.ai/api/v1.
How do I handle errors and retries in Python?
Catch LLMWiseError for HTTP failures. For reliability, use Mesh routing (fallback chains) so transient 429/5xx/timeouts can be retried on a backup model automatically.
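
A hedged sketch of that pattern, assuming LLMWiseError is importable from the llmwise package and that the routing parameter is also accepted by the non-streaming chat() call (this page only shows it on chat_stream()):

import os

from llmwise import LLMWise, LLMWiseError

client = LLMWise(os.environ["LLMWISE_API_KEY"])

try:
    resp = client.chat(
        model="gpt-5.2",
        # Assumption: routing works on chat() as it does on chat_stream().
        # Mesh handles transient 429/5xx/timeouts via the fallback chain,
        # so application code only needs a last-resort handler.
        routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5"]},
        messages=[{"role": "user", "content": "Ping"}],
    )
except LLMWiseError as exc:
    print(f"LLMWise request failed: {exc}")
else:
    print(resp["content"])
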
What Python versions are supported?
Python 3.8+ is recommended. The SDK is small and works in notebooks, servers, and containerized environments.

One wallet, enterprise AI controls built in


Chat, Compare, Blend, Judge, Mesh · Policy routing + replay lab · Failover without extra subscriptions