
Use Multiple LLM APIs in Python with One SDK

Use the official LLMWise Python SDK (or REST) to chat with multiple models and run multi-model workflows through one API key.

You only pay credits per request. No monthly subscription. Paid credits never expire.

Replace multiple AI subscriptions with one wallet that includes routing, failover, and optimization.

Why teams start here first

No monthly subscription: pay-as-you-go credits. Start with trial credits, then buy only what you consume.
Failover safety: production-ready routing. Automatic fallback across providers when latency, quality, or reliability changes.
Data control: your policy, your choice. BYOK and zero-retention mode keep training and storage scope explicit.
Single API experience: one key, multi-provider access. Use Chat/Compare/Blend/Judge/Failover from one dashboard.
Quick start
pip install llmwise

Full example

Python
# pip install llmwise
# Repository: https://github.com/LLMWise-AI/llmwise-python-sdk
import os
from llmwise import LLMWise

client = LLMWise(os.environ["LLMWISE_API_KEY"])

# Chat (non-stream)
resp = client.chat(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
    max_tokens=512,
)
print(resp["content"])

# Streaming chat (SSE JSON events)
for ev in client.chat_stream(
    model="claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Write a Python quicksort implementation."}],
):
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
    if ev.get("event") == "done":
        print(f"\n\ncharged={ev.get('credits_charged')} remaining={ev.get('credits_remaining')}")
        break
Evidence snapshot

Python integration overview

Everything you need to integrate LLMWise's multi-model API into your Python project.

Setup steps: 6 to first API call
Features: 8 capabilities included
Models available: 9 via single endpoint
Starter credits: 40 (7-day trial) · paid credits never expire

What you get

+ Official LLMWise Python SDK (httpx-based, intentionally small)
+ OpenAI-style messages format (role + content)
+ Chat with any model by changing one string (or use model="auto")
+ Mesh failover routing (primary + fallback chain) for reliability
+ Compare / Blend / Judge multi-model orchestration modes (see the sketch after this list)
+ Streaming via Server-Sent Events (SSE) with token deltas
+ Async support via AsyncLLMWise
+ Per-request usage metadata (tokens, latency, credits) for observability
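
The Compare mode listed above runs one prompt across several models at once. A minimal sketch, assuming a synchronous client.compare() helper mirroring the async one mentioned in the FAQ; the models parameter and the shape of the returned candidates are assumptions, so check the SDK repository for the exact signature.

# Hypothetical: client.compare() and the "candidates" response shape are
# assumptions based on this page, not confirmed SDK API.
results = client.compare(
    models=["gpt-5.2", "claude-sonnet-4.5", "gemini-3-flash"],
    messages=[{"role": "user", "content": "Name three risks of schema-less databases."}],
)
for candidate in results.get("candidates", []):
    print(candidate.get("model"), "->", (candidate.get("content") or "")[:80])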

Step-by-step integration

1. Install the LLMWise Python SDK

Install the official llmwise package. Repo: https://github.com/LLMWise-AI/llmwise-python-sdk

pip install llmwise
2. Set your LLMWise API key

Sign up at llmwise.ai to get your API key. Store it as an environment variable so it stays out of your source code.

export LLMWISE_API_KEY="your_api_key_here"
3. Create a client

Instantiate the LLMWise client (base URL defaults to https://llmwise.ai/api/v1).

import os
from llmwise import LLMWise

client = LLMWise(os.environ["LLMWISE_API_KEY"])
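
If you target a staging or self-hosted deployment, the base URL can likely be overridden at construction time. A hedged sketch, assuming the constructor takes a base_url keyword (the parameter name is an assumption; the value shown is the documented default):

# Assumption: base_url is the constructor keyword; verify in the SDK repo.
client = LLMWise(
    os.environ["LLMWISE_API_KEY"],
    base_url="https://llmwise.ai/api/v1",  # the documented default endpoint
)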
4. Send a basic chat request

Pass a model ID (or model="auto"). Messages are OpenAI-style role/content objects.

resp = client.chat(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "Summarize the key ideas of REST API design."}],
)
print(resp["content"])
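
The per-request usage metadata listed under "What you get" should ride along on this same response. A hedged sketch, assuming the response dict exposes a usage entry; every key name below is illustrative rather than confirmed:

# Assumed shape: the SDK advertises tokens, latency, and credits per
# request, but these key names are guesses; inspect resp to confirm.
usage = resp.get("usage", {})
print("tokens:", usage.get("total_tokens"))
print("latency_ms:", usage.get("latency_ms"))
print("credits:", usage.get("credits_charged"))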
5. Stream tokens for real-time output

Use chat_stream() to receive SSE JSON events. Render each event's "delta" field and stop when the "done" event arrives.

for ev in client.chat_stream(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain gradient descent step by step."}],
):
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
    if ev.get("event") == "done":
        break
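
If you also want the complete answer after streaming (for logging or caching), collect the deltas while rendering them. This uses only the event fields shown above:

# Accumulate token deltas into the full response text.
chunks = []
for ev in client.chat_stream(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain gradient descent step by step."}],
):
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
        chunks.append(ev["delta"])
    if ev.get("event") == "done":
        break
full_text = "".join(chunks)  # complete answer, assembled from deltas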
6. Add failover routing (Mesh mode)

Provide a fallback chain to automatically retry on 429/5xx/timeouts. Route + trace events are emitted when failover triggers.

for ev in client.chat_stream(
    model="gpt-5.2",
    routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5", "gemini-3-flash"]},
    messages=[{"role": "user", "content": "Summarize this support thread."}],
):
    if ev.get("event") in {"route", "trace"}:
        continue
    if ev.get("event") == "done":
        break
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
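
When a fallback triggers, the route events can reveal which model actually served the request, which is useful for dashboards and postmortems. A hedged sketch that records the last route event instead of skipping it; the model field name inside the event payload is an assumption:

# Assumption: route events carry the serving model's ID; print one event
# raw (print(ev)) to confirm the actual payload keys.
served_by = None
for ev in client.chat_stream(
    model="gpt-5.2",
    routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5", "gemini-3-flash"]},
    messages=[{"role": "user", "content": "Summarize this support thread."}],
):
    if ev.get("event") == "route":
        served_by = ev.get("model")
        continue
    if ev.get("event") == "trace":
        continue
    if ev.get("event") == "done":
        break
    if ev.get("delta"):
        print(ev["delta"], end="", flush=True)
print(f"\nserved_by={served_by}")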

Common questions

Can I use async Python with LLMWise?
Yes. Use AsyncLLMWise from the llmwise package. It provides async chat/compare/blend/judge helpers plus async streaming generators for SSE.
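
A minimal async sketch, assuming AsyncLLMWise mirrors the synchronous client's constructor and chat() signature (the helpers are stated above; the exact method shape is an assumption):

import asyncio
import os

from llmwise import AsyncLLMWise

async def main() -> None:
    # Assumption: same constructor and chat() signature as the sync client.
    client = AsyncLLMWise(os.environ["LLMWISE_API_KEY"])
    resp = await client.chat(
        model="auto",
        messages=[{"role": "user", "content": "One-line summary of the CAP theorem."}],
    )
    print(resp["content"])

asyncio.run(main())
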
Do I need the OpenAI Python SDK?
No. LLMWise provides its own official Python SDK. Under the hood, it calls the LLMWise REST API at https://llmwise.ai/api/v1.
How do I handle errors and retries in Python?
Catch LLMWiseError for HTTP failures. For reliability, use Mesh routing (fallback chains) so transient 429/5xx/timeouts can be retried on a backup model automatically.
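
A hedged sketch of that pattern, assuming LLMWiseError is importable from the llmwise package and that the routing parameter is also accepted by the non-streaming chat() call (this page only shows it on chat_stream()):

import os

from llmwise import LLMWise, LLMWiseError

client = LLMWise(os.environ["LLMWISE_API_KEY"])

try:
    resp = client.chat(
        model="gpt-5.2",
        # Assumption: routing works on chat() as it does on chat_stream().
        # Mesh handles transient 429/5xx/timeouts via the fallback chain,
        # so application code only needs a last-resort handler.
        routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5"]},
        messages=[{"role": "user", "content": "Ping"}],
    )
except LLMWiseError as exc:
    print(f"LLMWise request failed: {exc}")
else:
    print(resp["content"])
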
What Python versions are supported?
Python 3.8+ is recommended. The SDK is small and works in notebooks, servers, and containerized environments.

One wallet, enterprise AI controls built in


Chat, Compare, Blend, Judge, Mesh · Policy routing + replay lab · Failover without extra subscriptions