OpenAI GPT-4.5 preview
We have added support for the new GPT-4.5 model.
- openai/gpt-4.5-preview
Claude 3.7 Sonnet
We have added support for the new Sonnet model.
- anthropic/claude-3.7-sonnet
- anthropic/claude-3.7-sonnet-20250219
Thinking
To use the new thinking mode in Claude 3.7, you can do something like this:
```python
import asyncio

from opperai import AsyncOpper
from opperai.types import CallConfiguration

opper = AsyncOpper()

async def main():
    result, _ = await opper.call(
        name="respond",
        model="anthropic/claude-3.7-sonnet",
        input="What is the capital of Sweden?",
        configuration=CallConfiguration(
            model_parameters={
                "thinking": {
                    "type": "enabled",
                    "budget_tokens": 1024,
                },
            }
        ),
    )
    print(result)

asyncio.run(main())
```
Embeddings API
The API now supports getting embeddings for arbitrary input. While our indexes remain the most straightforward way of using external knowledge for RAG and similar use cases, this gives advanced users greater control over embeddings for custom use cases.
Example: Input as string
```shell
curl -X POST "https://api.opper.ai/v1/embeddings" \
  -H "x-opper-api-key: op-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/text-embedding-3-large",
    "input": "The text"
  }'
```
Example: Input as list of strings
```shell
curl -X POST "https://api.opper.ai/v1/embeddings" \
  -H "x-opper-api-key: op-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/text-embedding-3-large",
    "input": ["First text", "Second text", "Third text"]
  }'
```
Available embedding models
- azure/text-embedding-ada-002
- azure/text-embedding-3-large
- azure/text-embedding-3-large-1536
- openai/text-embedding-ada-002
- openai/text-embedding-3-large
- openai/text-embedding-3-small
- opper/e5-mistral-7b-instruct
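The same requests can be made from Python using only the standard library. A minimal sketch, assuming the endpoint, headers, and payload shape shown in the curl examples above (the helper names here are ours, not part of an SDK):

```python
import json
import urllib.request

EMBEDDINGS_URL = "https://api.opper.ai/v1/embeddings"

def build_embeddings_request(text, model="openai/text-embedding-3-small",
                             api_key="op-your-key-here"):
    """Build the POST request; `text` may be a single string or a list of strings."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={"x-opper-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

def embed(text, **kwargs):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(build_embeddings_request(text, **kwargs)) as resp:
        return json.load(resp)
```

Passing a list of strings produces one embedding per string, matching the second curl example.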
Gemini 2.0: Flash-Lite
We have added support for Gemini 2.0 Flash-Lite, hosted by Google Cloud Platform (US). You can access it in Opper using the model name:
- gcp/gemini-2.0-flash-lite-preview-02-05
OpenAI API/SDKs Compatibility Layer
We have added an OpenAI compatibility layer that allows you to use Opper models with the OpenAI API and SDKs. This gives you the ability to use any model provided by Opper in any project that uses the OpenAI API/SDKs. The compatibility layer supports additional Opper functionality through extra body arguments:
- `fallback_models`: A list of models to use if the primary model is not available
- `tags`: A dictionary of tags to add to the request
- `span_uuid`: The UUID of the span to add to the request
- `evaluate`: Whether to evaluate the generation or not
Python example using these features:
```python
import os

from openai import OpenAI
from opperai import Opper

opper = Opper()

client = OpenAI(
    base_url="https://api.opper.ai/compat/openai",
    api_key="-",  # must not be blank
    default_headers={"x-opper-api-key": os.getenv("OPPER_API_KEY")},
)

with opper.spans.start("reverse-name") as span:
    response = client.chat.completions.create(
        # This model is intentionally unavailable: the provider is "gorq", not "groq",
        # so the request falls back to the model in fallback_models.
        model="gorq/deepseek-r1-distill-llama-70",
        messages=[
            {
                "role": "user",
                "content": "What is the capital of France? Please reverse the name before answering.",
            }
        ],
        extra_body={
            "fallback_models": [
                "groq/deepseek-r1-distill-llama-70b",
            ],
            "tags": {
                "user_id": "123",
            },
            "span_uuid": str(span.uuid),
            "evaluate": False,
        },
    )
    print(response.choices[0].message.content)
```
Node example using these features:
```typescript
import { OpenAI } from "openai";
import OpperAI from "opperai";

const opper = new OpperAI();

const client = new OpenAI({
  baseURL: "https://api.opper.ai/compat/openai",
  apiKey: "OPPER_API_KEY",
  defaultHeaders: { "x-opper-api-key": "OPPER_API_KEY" },
});

async function main() {
  const trace = await opper.traces.start({
    name: "node-sdk/using-the-openai-sdk",
    input: "What is the capital of France? Please reverse the name before answering.",
  });

  const completion = await client.chat.completions.create({
    model: "openai/gpt-4o-mini",
    messages: [
      {
        role: "user",
        content: "What is the capital of France? Please reverse the name before answering.",
      },
    ],
    // fallback_models: ["openai/gpt-4o-mini"],
    // @ts-expect-error These are Opper-specific params.
    span_uuid: trace.uuid.toString(),
    // evaluate: false,
  });

  await trace.end({ output: { foo: completion.choices[0].message.content } });
}

main();
```
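In both SDKs, the extra parameters are merged straight into the JSON body of the chat-completions request. A minimal sketch of that merge (the helper function is ours, for illustration; the field names come from the list above):

```python
import json

def build_compat_body(model, messages, fallback_models=None, tags=None,
                      span_uuid=None, evaluate=None):
    """Assemble a chat-completions body with the Opper-specific fields merged in,
    mirroring what the OpenAI SDKs do with extra/unknown body params."""
    body = {"model": model, "messages": messages}
    extras = {
        "fallback_models": fallback_models,
        "tags": tags,
        "span_uuid": span_uuid,
        "evaluate": evaluate,
    }
    # Only include the Opper-specific fields that were actually set.
    body.update({k: v for k, v in extras.items() if v is not None})
    return body

body = build_compat_body(
    "gorq/deepseek-r1-distill-llama-70",
    [{"role": "user", "content": "Hello"}],
    fallback_models=["groq/deepseek-r1-distill-llama-70b"],
    evaluate=False,
)
print(json.dumps(body, indent=2))
```

Fields left unset are simply omitted from the body, so plain OpenAI requests continue to work unchanged against the compatibility endpoint.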
Gemini 2.0: Flash
We have added support for Gemini 2.0 Flash, hosted by Google Cloud Platform (US). You can access it in Opper using the model name:
- gcp/gemini-2.0-flash