
Integrate Multiple LLM APIs in Java with One SDK

Use the official LLMWise Java SDK to call multiple AI models with one API key. CompletableFuture async, SSE streaming, Spring Boot integration, and built-in failover.

Credit-based pay-per-use with token-settled billing. No monthly subscription. Paid credits never expire.

Replace multiple AI subscriptions with one wallet that includes routing, failover, and optimization.

Why teams start here first
No monthly subscription
Pay-as-you-go credits
Start with trial credits, then buy only what you consume.
Failover safety
Production-ready routing
Auto fallback across providers when latency, quality, or reliability changes.
Data control
Your policy, your choice
BYOK and zero-retention mode keep training and storage scope explicit.
Single API experience
One key, multi-provider access
Use Chat/Compare/Blend/Judge/Failover from one dashboard.
Quick start
<!-- Maven -->
<dependency>
  <groupId>ai.llmwise</groupId>
  <artifactId>llmwise-java</artifactId>
  <version>1.0.0</version>
</dependency>

Full example

// Maven: ai.llmwise:llmwise-java:1.0.0
import ai.llmwise.LLMWiseClient;
import ai.llmwise.model.ChatRequest;
import ai.llmwise.model.ChatResponse;
import ai.llmwise.model.Message;

import java.util.List;
import java.util.concurrent.CompletableFuture;

public class QuickStart {
    public static void main(String[] args) throws Exception {
        var client = LLMWiseClient.builder()
            .apiKey(System.getenv("LLMWISE_API_KEY"))
            .build();

        // Synchronous chat request
        var request = ChatRequest.builder()
            .model("auto")
            .messages(List.of(
                Message.user("Explain the Java memory model in simple terms.")
            ))
            .maxTokens(512)
            .build();

        ChatResponse resp = client.chat(request);
        System.out.println(resp.getContent());

        // Async streaming with CompletableFuture
        var streamReq = ChatRequest.builder()
            .model("claude-sonnet-4.5")
            .messages(List.of(
                Message.user("Write a thread-safe singleton in Java.")
            ))
            .stream(true)
            .build();

        client.chatStream(streamReq, event -> {
            if (event.getDelta() != null) {
                System.out.print(event.getDelta());
            }
            if (event.isDone()) {
                System.out.printf("%nCredits charged: %d%n",
                    event.getCreditsCharged());
            }
        }).get(); // block until stream completes
    }
}
Evidence snapshot

Java integration overview

Everything you need to integrate LLMWise's multi-model API into your Java project.

Setup steps: 6 to first API call
Features: 8 capabilities included
Models available: 9 via single endpoint
Starter credits: 20 free credits never expire

What you get

+ Official LLMWise Java 17+ SDK with builder pattern API
+ Async streaming via CompletableFuture and event callbacks
+ Spring Boot starter with auto-configuration and properties binding
+ Mesh failover routing with automatic retry on 429/5xx/timeouts
+ Typed model classes with Jackson serialization (ChatRequest, ChatResponse, Message)
+ Thread-safe client instance backed by java.net.http.HttpClient
+ Connection pooling and HTTP/2 support via java.net.http.HttpClient
+ Observability hooks (request/response interceptors) for logging and metrics
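The mesh failover item above boils down to a retry-across-providers loop. The 429/5xx status codes come from the list; everything else here is an illustrative stand-in, not SDK API: `firstSuccess`, the `callOnce` function, and the simulated statuses are invented for the sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Illustrative sketch of retry-on-429/5xx failover across a model list.
// callOnce stands in for a single HTTP attempt; the real SDK wires this
// behavior into its transport layer.
public class FailoverSketch {
    record Attempt(String model, int status) {}

    static Attempt firstSuccess(List<String> models, ToIntFunction<String> callOnce) {
        for (String model : models) {
            int status = callOnce.applyAsInt(model);
            // Retryable statuses: rate limit (429) and server errors (5xx);
            // anything else is returned to the caller as-is.
            if (status == 429 || status >= 500) {
                continue; // fall through to the next provider
            }
            return new Attempt(model, status);
        }
        throw new IllegalStateException("all providers failed");
    }

    public static void main(String[] args) {
        // Simulated statuses: first provider rate-limited, second errors,
        // third succeeds.
        Map<String, Integer> statuses = Map.of(
            "gpt-5.2", 429, "claude-sonnet-4.5", 502, "deepseek-v3", 200);
        Attempt a = firstSuccess(
            List.of("gpt-5.2", "claude-sonnet-4.5", "deepseek-v3"), statuses::get);
        System.out.println(a.model() + " -> " + a.status()); // prints "deepseek-v3 -> 200"
    }
}
```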

Step-by-step integration

1. Add the LLMWise Maven dependency

Add the official LLMWise Java SDK to your pom.xml or build.gradle. Requires Java 17 or later.

<!-- Maven -->
<dependency>
  <groupId>ai.llmwise</groupId>
  <artifactId>llmwise-java</artifactId>
  <version>1.0.0</version>
</dependency>

// Gradle
implementation 'ai.llmwise:llmwise-java:1.0.0'
2. Set your LLMWise API key

Store your API key as an environment variable or in application.properties for Spring Boot.

export LLMWISE_API_KEY="your_api_key_here"

# Or in application.properties (Spring Boot):
# llmwise.api-key=your_api_key_here
3. Create a client instance

Build a client using the builder pattern. The client is thread-safe and should be created once and shared across your application.

var client = LLMWiseClient.builder()
    .apiKey(System.getenv("LLMWISE_API_KEY"))
    .build();
4. Send a basic chat request

Build a ChatRequest with a model ID and messages. The response includes content, token counts, latency, and credits metadata.

var request = ChatRequest.builder()
    .model("gpt-5.2")
    .messages(List.of(
        Message.system("You are a senior Java architect."),
        Message.user("When should I use records vs classes in Java?")
    ))
    .maxTokens(512)
    .build();

ChatResponse resp = client.chat(request);
System.out.println(resp.getContent());
5. Stream tokens with an event callback

Use client.chatStream() with a callback to receive SSE events as they arrive. The method returns a CompletableFuture that completes when the stream ends.

var streamReq = ChatRequest.builder()
    .model("deepseek-v3")
    .messages(List.of(Message.user("Implement an LRU cache in Java.")))
    .stream(true)
    .build();

client.chatStream(streamReq, event -> {
    if (event.getDelta() != null) {
        System.out.print(event.getDelta());
    }
}).get();
6. Use Compare mode for multi-model evaluation

Send the same prompt to multiple models and compare their outputs. Useful for A/B testing and evaluation pipelines.

var compareReq = CompareRequest.builder()
    .models(List.of("gpt-5.2", "claude-sonnet-4.5", "gemini-3-flash"))
    .messages(List.of(
        Message.user("Explain the SOLID principles with Java examples.")
    ))
    .build();

var compareResp = client.compare(compareReq);
for (var r : compareResp.getResponses()) {
    System.out.printf("[%s]: %s%n%n", r.getModel(), r.getContent());
}

Common questions

Does the Java SDK work with Spring Boot?
Yes. The SDK includes a Spring Boot starter that auto-configures the LLMWiseClient bean from application.properties. Just add the dependency and set llmwise.api-key in your properties file. Inject the client with @Autowired.
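Assuming the starter behaves as described, wiring might look roughly like this. Only LLMWiseClient, ChatRequest, and Message come from the examples above; ChatService, its method, and its contents are hypothetical illustrations, not SDK types.

```java
import ai.llmwise.LLMWiseClient;
import ai.llmwise.model.ChatRequest;
import ai.llmwise.model.Message;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.List;

// Hypothetical service; the starter auto-configures the LLMWiseClient
// bean from llmwise.api-key in application.properties.
@Service
public class ChatService {
    private final LLMWiseClient client;

    @Autowired
    public ChatService(LLMWiseClient client) {
        this.client = client;
    }

    public String ask(String question) {
        var request = ChatRequest.builder()
            .model("auto")
            .messages(List.of(Message.user(question)))
            .build();
        return client.chat(request).getContent();
    }
}
```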
Is the client thread-safe?
Yes. The LLMWiseClient is backed by java.net.http.HttpClient which is thread-safe. Create one instance at application startup and share it across all threads and request handlers.
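That "create once, share everywhere" advice in miniature, using only the stdlib java.net.http.HttpClient the SDK is built on. The holder class is illustrative; in an application, LLMWiseClient plays this role.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Sketch of a shared-client holder. HttpClient instances are immutable
// and thread-safe, so one static instance can serve every thread.
public final class SharedClient {
    // Initialized exactly once at class load.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
        .version(HttpClient.Version.HTTP_2)
        .connectTimeout(Duration.ofSeconds(10))
        .build();

    private SharedClient() {}

    public static HttpClient client() {
        return CLIENT;
    }
}
```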
How do I handle errors in Java?
The SDK throws LLMWiseException for HTTP errors. Use getStatusCode() to check for 401 (auth), 402 (insufficient credits), 400 (validation), or 502 (model error). Wrap calls in try-catch blocks or handle via CompletableFuture.exceptionally().
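A self-contained sketch of that pattern. The LLMWiseException defined here is a stand-in so the example runs on its own (the real class ships with the SDK), and the status-code handling mirrors the codes listed above; the describe helper and its messages are illustrative.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class ErrorHandlingSketch {
    // Hypothetical stand-in for the SDK's LLMWiseException.
    static class LLMWiseException extends RuntimeException {
        private final int statusCode;
        LLMWiseException(int statusCode, String msg) { super(msg); this.statusCode = statusCode; }
        int getStatusCode() { return statusCode; }
    }

    // Map the documented status codes to operator-facing advice.
    static String describe(int status) {
        return switch (status) {
            case 400 -> "validation error: inspect the request body";
            case 401 -> "auth error: check LLMWISE_API_KEY";
            case 402 -> "insufficient credits: top up the wallet";
            case 502 -> "model error: retry or fail over";
            default -> "unexpected status " + status;
        };
    }

    public static void main(String[] args) {
        // Simulate an async call failing with 402, then recover via
        // exceptionally(), the same shape used for chatStream() futures.
        String outcome = CompletableFuture
            .<String>failedFuture(new LLMWiseException(402, "out of credits"))
            .exceptionally(t -> {
                Throwable cause = (t instanceof CompletionException) ? t.getCause() : t;
                if (cause instanceof LLMWiseException e) {
                    return describe(e.getStatusCode());
                }
                throw new CompletionException(cause);
            })
            .join();
        System.out.println(outcome); // prints "insufficient credits: top up the wallet"
    }
}
```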
What Java versions are supported?
Java 17 or later is required. The SDK uses java.net.http.HttpClient (introduced in Java 11) and other modern APIs like records and sealed interfaces available in Java 17+.
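The Java 17 features that answer mentions, in miniature: a sealed interface with record implementations. The event types here are illustrative, not SDK classes.

```java
// Sealed interfaces (Java 17) restrict the permitted implementations;
// records (Java 16) give compact, immutable data carriers.
public class Java17Demo {
    sealed interface ChatEvent permits Delta, Done {}
    record Delta(String text) implements ChatEvent {}
    record Done(long creditsCharged) implements ChatEvent {}

    // Pattern-matching instanceof (Java 16) branches on the event type.
    static String render(ChatEvent e) {
        if (e instanceof Delta d) return d.text();
        if (e instanceof Done d) return "done, charged " + d.creditsCharged();
        throw new IllegalStateException("unreachable: sealed hierarchy");
    }

    public static void main(String[] args) {
        System.out.println(render(new Delta("hello")));   // prints "hello"
        System.out.println(render(new Done(3)));          // prints "done, charged 3"
    }
}
```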

One wallet, enterprise AI controls built in


Chat, Compare, Blend, Judge, Mesh
Policy routing + replay lab
Failover without extra subscriptions