Use cURL, fetch, or any HTTP client to access GPT, Claude, Gemini, DeepSeek, and more with one API key.
You only pay credits per request. No monthly subscription. Paid credits never expire.
Replace multiple AI subscriptions with one wallet that includes routing, failover, and optimization.
# Basic chat completion — works with any model
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the difference between REST and GraphQL?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
# Streaming — tokens arrive as Server-Sent Events
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -N \
  -d '{
    "model": "claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Write a bash script to monitor disk usage."}],
    "stream": true
  }'
# Switch models — just change the model field
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Explain monads in simple terms."}]
  }'

Everything you need to integrate LLMWise's multi-model API into your cURL / REST project.
Export your LLMWise API key so you can reference it in cURL commands. Referencing the variable instead of pasting the raw key keeps it out of your shell history on every request.
export LLMWISE_API_KEY="your_api_key_here"
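To confirm the variable is set and the key is accepted, send a minimal request as a smoke test. This is just a sketch reusing the /chat endpoint and a model ID from the examples below; an invalid key should come back as an error response rather than a completion.

# Smoke test: a tiny request that should return a JSON completion
curl -s -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 8}'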
POST a JSON body to the /chat endpoint: specify a model, an array of messages, and optional parameters like temperature and max_tokens.
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {"role": "user", "content": "Explain the difference between TCP and UDP."}
    ],
    "temperature": 0.7
  }'

Add "stream": true to receive tokens as Server-Sent Events. Each event contains a JSON chunk with the delta text. Use the -N flag in cURL to disable output buffering so chunks appear immediately.
curl -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -N \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "Write a SQL query to find duplicate rows."}],
    "stream": true
  }'
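To watch the chunks without the SSE framing, strip the standard "data: " prefix as lines arrive. A minimal sketch, assuming GNU grep and sed for the unbuffered flags; the exact chunk JSON shape (including the name of the delta field) is not specified here, so inspect a raw stream before parsing further.

# Show each JSON chunk as it streams in, minus the "data: " SSE prefix
# (--line-buffered and -u are GNU flags that keep the pipeline unbuffered)
curl -s -N -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Count to five."}], "stream": true}' \
  | grep --line-buffered '^data: ' \
  | sed -u 's/^data: //'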
In non-streaming mode, the response includes top-level content plus usage and billing metadata.

# Response format (non-stream)
{
  "id": "request_uuid",
  "model": "gpt-5.2",
  "content": "TCP is a connection-oriented protocol...",
  "prompt_tokens": 24,
  "completion_tokens": 156,
  "latency_ms": 812,
  "cost": 0.001234,
  "credits_charged": 1,
  "credits_remaining": 120,
  "finish_reason": "stop",
  "mode": "chat"
}
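Because the shape is flat JSON, individual fields are easy to pull out in the shell. For example, to print the reply plus your remaining balance after a request, a sketch in the same python3 one-liner style as the loop below, reading only the fields documented above:

# Print the completion text and the post-request credit balance
curl -s -X POST https://llmwise.ai/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLMWISE_API_KEY" \
  -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "Explain the difference between TCP and UDP."}]}' \
  | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['content']); print('credits remaining:', r['credits_remaining'])"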
Change the "model" field to any supported model ID. The rest of the request stays identical. This makes it trivial to benchmark models from a shell script.

# Loop through models and compare outputs
for MODEL in gpt-5.2 claude-sonnet-4.5 gemini-3-flash deepseek-v3; do
  echo "=== $MODEL ==="
  curl -s -X POST https://llmwise.ai/api/v1/chat \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $LLMWISE_API_KEY" \
    -d "{
      \"model\": \"$MODEL\",
      \"messages\": [{\"role\": \"user\", \"content\": \"What is a bloom filter?\"}],
      \"max_tokens\": 200
    }" | python3 -c "import sys,json; print(json.load(sys.stdin)['content'])"
  echo ""
done
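Because every non-streaming response carries latency_ms and cost, the same loop doubles as a rough speed and price comparison. A sketch that reads only the metadata fields documented above:

# Compare observed latency and per-request cost across models
for MODEL in gpt-5.2 claude-sonnet-4.5 gemini-3-flash deepseek-v3; do
  curl -s -X POST https://llmwise.ai/api/v1/chat \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $LLMWISE_API_KEY" \
    -d "{\"model\": \"$MODEL\", \"messages\": [{\"role\": \"user\", \"content\": \"What is a bloom filter?\"}], \"max_tokens\": 200}" \
    | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['model'], '-', r['latency_ms'], 'ms, cost', r['cost'])"
done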