Use LLMWise inside LangChain via a tiny Runnable wrapper. Switch models, enable Auto routing, and add failover without coupling your app to a single provider.
You only pay credits per request. No monthly subscription. Paid credits never expire.
Replace multiple AI subscriptions with one wallet that includes routing, failover, and optimization.
pip install llmwise langchain-core
import os
from llmwise import LLMWise
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda
client = LLMWise(os.environ["LLMWISE_API_KEY"])
prompt = PromptTemplate.from_template(
    "You are a concise technical writer.\n\nQuestion: {question}"
)
def call_llmwise(prompt_value) -> str:
    # The prompt template emits a PromptValue, not a plain string,
    # so convert it before sending it to LLMWise.
    text = prompt_value.to_string()
    resp = client.chat(
        model="auto",
        messages=[{"role": "user", "content": text}],
        max_tokens=512,
    )
    return resp["content"]
llm = RunnableLambda(call_llmwise)
chain = prompt | llm
print(chain.invoke({"question": "What are the SOLID principles in software engineering?"}))
# Swap models for A/B tests by changing one string:
# resp = client.chat(model="claude-sonnet-4.5", messages=[...])
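If you want to run that A/B comparison inside a LangChain chain, you can parameterize the wrapper by model name. This is a minimal sketch continuing the example above; it relies only on the client.chat(model=..., messages=...) call already shown, and the make_llmwise_runnable helper and chain names are illustrative, not part of the SDK.
def make_llmwise_runnable(model: str) -> RunnableLambda:
    # Bind one model per runnable so otherwise identical chains can be compared.
    def _call(prompt_value) -> str:
        resp = client.chat(
            model=model,
            messages=[{"role": "user", "content": prompt_value.to_string()}],
        )
        return resp["content"]
    return RunnableLambda(_call)

# Same prompt, two candidate models for the A/B test.
chain_a = prompt | make_llmwise_runnable("auto")
chain_b = prompt | make_llmwise_runnable("claude-sonnet-4.5")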
Everything you need to integrate LLMWise's multi-model API into your LangChain project.
Install the official LLMWise SDK plus LangChain core. You’ll wrap LLMWise as a Runnable so the rest of your chain stays the same.
pip install llmwise langchain-core
Store your API key in an environment variable so it stays out of source code.
export LLMWISE_API_KEY="your_api_key_here"
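Optionally, fail fast with a clear message when the variable is missing instead of hitting a KeyError at client construction. A minimal sketch of that check:
import os

from llmwise import LLMWise

api_key = os.environ.get("LLMWISE_API_KEY")
if not api_key:
    raise RuntimeError("LLMWISE_API_KEY is not set; export it before starting the app.")
client = LLMWise(api_key)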
Use a small RunnableLambda that accepts the formatted prompt and returns the LLMWise response content.
import os
from llmwise import LLMWise
from langchain_core.runnables import RunnableLambda
client = LLMWise(os.environ["LLMWISE_API_KEY"])
def call_llmwise(prompt_value) -> str:
    # Convert the PromptValue produced by the prompt template into plain text.
    text = prompt_value.to_string()
    resp = client.chat(model="auto", messages=[{"role": "user", "content": text}])
    return resp["content"]
llm = RunnableLambda(call_llmwise)
Compose your prompt template and pipe it into the runnable. Swap models or routing in one place (the wrapper).
from langchain_core.prompts import PromptTemplate
prompt = PromptTemplate.from_template("Answer concisely: {question}")
chain = prompt | llm
print(chain.invoke({"question": "Explain dependency injection."}))For production traffic, add a fallback chain so requests can retry on another model when a provider is saturated or failing.
resp = client.chat(
    model="gpt-5.2",
    routing={"strategy": "rate-limit", "fallback": ["claude-sonnet-4.5", "gemini-3-flash"]},
    messages=[{"role": "user", "content": "Summarize this incident report."}],
)
print(resp["content"])You only pay credits per request. No monthly subscription. Paid credits never expire.