
Call 9 LLMs from Any Language with One REST API

Use cURL, fetch, or any HTTP client to access GPT, Claude, Gemini, DeepSeek, and more with one API key.

You only pay credits per request. No monthly subscription. Paid credits never expire.

Replace multiple AI subscriptions with one wallet that includes routing, failover, and optimization.

Why teams start here first

No monthly subscription · Pay-as-you-go credits: start with trial credits, then buy only what you consume.
Failover safety · Production-ready routing: auto fallback across providers when latency, quality, or reliability changes.
Data control · Your policy, your choice: BYOK and zero-retention mode keep training and storage scope explicit.
Single API experience · One key, multi-provider access: use Chat/Compare/Blend/Judge/Failover from one dashboard.
Quick start
curl -X POST https://llmwise.ai/api/v1/chat

Full example

# Basic chat completion — works with any model
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the difference between REST and GraphQL?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

# Streaming — tokens arrive as Server-Sent Events
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -N \
  -d '{
    "model": "claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Write a bash script to monitor disk usage."}],
    "stream": true
  }'

# Switch models — just change the model field
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Explain monads in simple terms."}]
  }'
Evidence snapshot

cURL / REST integration overview

Everything you need to integrate LLMWise's multi-model API into your cURL / REST project.

Setup steps: 5 to first API call
Features: 8 capabilities included
Models available: 9 via single endpoint
Starter credits: 40 (7-day trial · paid credits never expire)

What you get

+ Simple JSON request and response format
+ Works from any language or tool — cURL, wget, Postman, fetch, httpx
+ 9 models from 7 providers behind a single endpoint
+ Server-Sent Events (SSE) streaming with stream: true
+ Bearer token authentication with your LLMWise API key
+ No SDK installation required — just HTTP
+ Chat API with system/user/assistant message roles
+ Response includes usage metadata: tokens, latency, cost

Step-by-step integration

1. Set your API key as an environment variable

Export your LLMWise API key so you can reference it in cURL commands. Using the variable keeps the literal key out of the shell history entry for each request.

export LLMWISE_API_KEY="your_api_key_here"
2. Send a basic chat completion request

POST a JSON body to the /chat endpoint: specify a model, an array of messages, and optional parameters like temperature and max_tokens.

curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {"role": "user", "content": "Explain the difference between TCP and UDP."}
    ],
    "temperature": 0.7
  }'
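
The endpoint is plain HTTP, so any client that can POST JSON works. As a sketch, here is the same request with wget instead of cURL, using wget's standard --header and --post-data flags:

# Same request with wget; any HTTP client that can POST JSON works
wget -qO- https://llmwise.ai/api/v1/chat \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $LLMWISE_API_KEY" \
  --post-data '{
    "model": "gpt-5.2",
    "messages": [{"role": "user", "content": "Explain the difference between TCP and UDP."}],
    "temperature": 0.7
  }'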
3. Enable streaming for real-time responses

Add "stream": true to receive tokens as Server-Sent Events. Each event contains a JSON chunk with the delta text. Use the -N flag in cURL to disable output buffering so chunks appear immediately.

curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -N \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "Write a SQL query to find duplicate rows."}],
    "stream": true
  }'
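
To consume the stream in a script rather than watch raw events scroll by, strip the data: prefix from each line and stop at [DONE]. A minimal sketch follows; the "delta" field name in the chunk JSON is an assumption, so match it to the actual chunk schema:

# Print only the streamed text from each SSE event.
# NOTE: "delta" is an assumed field name for the chunk text; adjust to
# the actual chunk schema. One python3 process per event is slow but
# keeps the sketch dependency-free.
curl -s -N -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Count to five."}], "stream": true}' \
| while IFS= read -r line; do
    case "$line" in
      "data: [DONE]") break ;;   # end-of-stream marker
      "data: "*) printf '%s' "${line#data: }" | python3 -c \
        "import sys,json; print(json.load(sys.stdin).get('delta',''), end='')" ;;
    esac
  done
echo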
4. Parse the JSON response

In non-streaming mode, the response includes top-level content plus usage and billing metadata.

# Response format (non-stream)
{
  "id": "request_uuid",
  "model": "gpt-5.2",
  "content": "TCP is a connection-oriented protocol...",
  "prompt_tokens": 24,
  "completion_tokens": 156,
  "latency_ms": 812,
  "cost": 0.001234,
  "credits_charged": 1,
  "credits_remaining": 120,
  "finish_reason": "stop",
  "mode": "chat"
}
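
With that shape, pulling fields out from the shell is a one-line jq expression (jq assumed installed; the python3 one-liner from step 5 works equally well):

# Extract the answer plus usage/billing fields with jq
curl -s -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "Explain the difference between TCP and UDP."}]}' \
| jq -r '.content, "tokens: \(.prompt_tokens) in / \(.completion_tokens) out · cost: \(.cost) · credits left: \(.credits_remaining)"'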
5. Test different models with the same request

Change the "model" field to any supported model ID. The rest of the request stays identical. This makes it trivial to benchmark models from a shell script.

# Loop through models and compare outputs
for MODEL in gpt-5.2 claude-sonnet-4.5 gemini-3-flash deepseek-v3; do
  echo "=== $MODEL ==="
  curl -s -X POST https://llmwise.ai/api/v1/chat \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $LLMWISE_API_KEY" \
    -d "{
      \"model\": \"$MODEL\",
      \"messages\": [{\"role\": \"user\", \"content\": \"What is a bloom filter?\"}],
      \"max_tokens\": 200
    }" | python3 -c "import sys,json; print(json.load(sys.stdin)['content'])"
  echo ""
done

Common questions

What is the base URL for the LLMWise REST API?
The base URL is https://llmwise.ai/api/v1. The primary chat endpoint is POST https://llmwise.ai/api/v1/chat.
Can I use the LLMWise REST API from languages without an OpenAI SDK?
Yes. The API uses standard HTTP with JSON payloads, so you can call it from any language with an HTTP client: Go (net/http), Rust (reqwest), Java (HttpClient), Ruby (net/http), PHP (cURL), or even shell scripts. No SDK is required.
How does streaming work with the LLMWise REST API?
Set "stream": true in the request body. The response uses Server-Sent Events (SSE) format. Each line starts with 'data: ' followed by a JSON chunk containing the delta text. The stream ends with 'data: [DONE]'. Most HTTP clients have built-in SSE support.
What HTTP status codes does the LLMWise API return?
200 for success, 400 for invalid requests (wrong model ID, missing messages), 401 for authentication failures, 402 for insufficient credits, and 502 for upstream model errors. Error responses include a JSON body with a descriptive message.
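
Those codes make scripted error handling straightforward. A sketch that branches on the status code (the error body's exact schema beyond a descriptive message isn't specified here, so it is simply echoed):

# Capture body and status separately so a script can branch on failures
STATUS=$(curl -s -o /tmp/llmwise_resp.json -w '%{http_code}' \
  -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "ping"}]}')

case "$STATUS" in
  200) python3 -c "import json; print(json.load(open('/tmp/llmwise_resp.json'))['content'])" ;;
  401) echo "Auth failed: check LLMWISE_API_KEY" >&2 ;;
  402) echo "Insufficient credits: top up the wallet" >&2 ;;
  *)   echo "HTTP $STATUS:" >&2; cat /tmp/llmwise_resp.json >&2 ;;
esac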

One wallet, enterprise AI controls built in


Chat, Compare, Blend, Judge, Mesh · Policy routing + replay lab · Failover without extra subscriptions