Added support for Imagen 3 in the Python and Node SDKs
Opper now supports two image generation models: azure/dall-e-3-eu and gcp/imagen-3.0-generate-001-eu. Here is an example of generating an image from a description in Python:
from opperai import Opper, ImageOutput
from opperai.types import CallConfiguration

opper = Opper()

def generate_image(description: str) -> ImageOutput:
    # Model parameters (here, the aspect ratio) are passed via the call configuration.
    image, _ = opper.call(
        name="generate_image",
        output_type=ImageOutput,
        input=description,
        model="gcp/imagen-3.0-generate-001-eu",
        configuration=CallConfiguration(
            model_parameters={
                "aspectRatio": "9:16",
            }
        ),
    )
    return image

description = "portrait of a person standing in front of a park. vibrant, autumn colors"
path = save_file(generate_image(description).bytes)  # save_file: helper that writes the bytes to disk
print(path)
Here is a similar example in TypeScript:
// Assumes the standard Node SDK client; fs, os and path come from the Node standard library.
import Client from "opperai";
import fs from "fs";
import os from "os";
import path from "path";

const client = new Client();

async function testImageGeneration() {
  const image = await client.generateImage({
    model: "gcp/imagen-3.0-generate-001-eu",
    prompt: "portrait of a person standing in front of a park. vibrant, autumn colors",
    parameters: {
      aspectRatio: "9:16",
    },
  });

  // Write the generated image bytes to a temporary file.
  const tempFilePath = path.join(os.tmpdir(), "image.png");
  fs.writeFileSync(tempFilePath, image.bytes);
  console.log(`image written to temporary file: ${tempFilePath}`);
}

testImageGeneration();
Model parameters vary between models, but here are the supported parameters for each model (an example with the DALL-E 3 parameters follows the list):
azure/dall-e-3-eu:
- style: natural, vivid
- quality: standard, hd
- size: 1024x1024, 1792x1024, 1024x1792
gcp/imagen-3.0-generate-001-eu:
- aspectRatio: 1:1, 3:4, 4:3, 16:9, 9:16
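The azure/dall-e-3-eu parameters are passed the same way, through model_parameters in the call configuration. Here is a minimal sketch in Python, mirroring the Imagen example above (the task name and parameter values are illustrative):
from opperai import Opper, ImageOutput
from opperai.types import CallConfiguration

opper = Opper()

# Sketch: same call shape as the Imagen example, with DALL-E 3 parameters.
image, _ = opper.call(
    name="generate_image_dalle",
    output_type=ImageOutput,
    input="portrait of a person standing in front of a park. vibrant, autumn colors",
    model="azure/dall-e-3-eu",
    configuration=CallConfiguration(
        model_parameters={
            "style": "vivid",
            "quality": "hd",
            "size": "1024x1792",
        }
    ),
)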
Images as input to multimodal models
You are now able to pass images as input to multimodal models.
Python SDK
# Special type for images; it ensures the image is encoded in the right format.
from opperai import AsyncOpper, ImageInput
from pydantic import BaseModel

aopper = AsyncOpper()

# Illustrative input/output models; the original example defines these elsewhere.
class Image(BaseModel):
    image: ImageInput

class Description(BaseModel):
    description: str

description, response = await aopper.call(
    name="async_describe_image",
    instructions="Create a short description of the image",
    output_type=Description,
    input=Image(
        image=ImageInput.from_path("examples/cat.png"),
    ),
    model="openai/gpt-4o",
)
Node SDK
// Special function to read images; it ensures the image is encoded in the right format.
import Client, { opperImage } from "opperai";

const client = new Client();

const { message } = await client.call({
  parent_span_uuid: trace.uuid, // optional: attaches the call to an existing trace
  name: "node-sdk/call/multimodal/image-input",
  instructions: "Create a short description of the image",
  input: { image: opperImage("examples/cat.png") },
  model: "openai/gpt-4o",
});
Image generation using DALL-E 3 now available
Using the ImageOutput type you are now able to generate images via call using DALL-E 3 in the Python SDK:
from opperai import AsyncOpper, ImageOutput

aopper = AsyncOpper()

# No model specified; image generation uses DALL-E 3 (as described above).
cat, _ = await aopper.call(
    name="generate_cat",
    output_type=ImageOutput,
    input="Create an image of a cat",
)
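Since ImageOutput exposes the generated image as raw bytes, saving the result needs only the standard library. A minimal sketch (the output filename is illustrative):
from pathlib import Path

# Write the generated image bytes to disk.
Path("cat.png").write_bytes(cat.bytes)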
With the Node SDK you can also generate images using DALL-E 3:
const cat = await client.generateImage({
  parent_span_uuid: trace.uuid, // optional: attaches the call to an existing trace
  prompt: "Create an image of a cat",
});
New models added
- aws/claude-3.5-sonnet-eu
- cerebras/llama3.1-8b
- cerebras/llama3.1-70b
- gcp/gemini-1.5-pro-002-eu
- gcp/gemini-1.5-flash-002-eu
- groq/llama-3.1-70b-versatile
- groq/llama-3.1-8b-instant
- groq/gemma2-9b-it
- mistral/pixtral-12b-2409-eu
- openai/o1-preview
- openai/o1-mini
See the Cerebras documentation for more information about the Cerebras models.
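All of the new models are available through the regular call interface by setting the model parameter. A minimal sketch in Python (the task name, instructions, and input are illustrative):
from opperai import Opper

opper = Opper()

# Pick any of the newly added models via the model parameter.
summary, _ = opper.call(
    name="summarize",
    instructions="Summarize the input in one sentence",
    input="Cerebras builds wafer-scale hardware for fast LLM inference.",
    model="cerebras/llama3.1-8b",
)
print(summary)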
Updated default embedding model
The new default embedding model for indexes is text-embedding-3-large.