Commit

feat (ai/core): enhance generateImage, add initial Fireworks image support. (#4266)

Co-authored-by: Lars Grammel <[email protected]>
shaper and lgrammel authored Jan 7, 2025
1 parent 930c5f4 commit 19a2ce7
Showing 27 changed files with 935 additions and 91 deletions.
7 changes: 7 additions & 0 deletions .changeset/cuddly-kiwis-guess.md
@@ -0,0 +1,7 @@
---
'@ai-sdk/google-vertex': patch
'@ai-sdk/openai': patch
'ai': patch
---

feat (ai/core): add aspectRatio and seed options to generateImage
5 changes: 5 additions & 0 deletions .changeset/perfect-lobsters-guess.md
@@ -0,0 +1,5 @@
---
'@ai-sdk/provider': patch
---

feat (provider): add message option to UnsupportedFunctionalityError
6 changes: 6 additions & 0 deletions .changeset/poor-pets-obey.md
@@ -0,0 +1,6 @@
---
'@ai-sdk/fireworks': patch
'@ai-sdk/provider-utils': patch
---

feat (provider/fireworks): Add image model support.
77 changes: 68 additions & 9 deletions content/docs/03-ai-sdk-core/35-image-generation.mdx
@@ -17,7 +17,6 @@ import { openai } from '@ai-sdk/openai';
const { image } = await generateImage({
model: openai.image('dall-e-3'),
prompt: 'Santa Claus driving a Cadillac',
size: '1024x1024',
});
```

@@ -28,26 +27,84 @@ const base64 = image.base64; // base64 image data
const uint8Array = image.uint8Array; // Uint8Array image data
```

### Size and Aspect Ratio

Depending on the model, you can either specify the size or the aspect ratio.

##### Size

The size is specified as a string in the format `{width}x{height}`.
Models only support a few sizes, and the supported sizes are different for each model and provider.

```tsx highlight={"7"}
import { experimental_generateImage as generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';

const { image } = await generateImage({
model: openai.image('dall-e-3'),
prompt: 'Santa Claus driving a Cadillac',
size: '1024x1024',
});
```

##### Aspect Ratio

The aspect ratio is specified as a string in the format `{width}:{height}`.
Models only support a few aspect ratios, and the supported aspect ratios are different for each model and provider.

```tsx highlight={"7"}
import { experimental_generateImage as generateImage } from 'ai';
import { vertex } from '@ai-sdk/google-vertex';

const { image } = await generateImage({
model: vertex.image('imagen-3.0-generate-001'),
prompt: 'Santa Claus driving a Cadillac',
aspectRatio: '16:9',
});
```
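
The two string formats described above can be checked before a request is sent. The following is a minimal sketch, not part of the AI SDK: `parsePair`, `parseSize`, and `parseAspectRatio` are hypothetical helper names, and the sketch only validates the `{width}x{height}` and `{width}:{height}` shapes, not whether a given model accepts the value.

```ts
// Hypothetical helpers for illustration; the AI SDK does not export these.
type Dimensions = { width: number; height: number };

// Parses "<integer><separator><integer>" and rejects anything else.
function parsePair(
  value: string,
  separator: 'x' | ':',
): Dimensions | undefined {
  const match = /^(\d+)([x:])(\d+)$/.exec(value);
  if (match === null || match[2] !== separator) return undefined;

  const width = Number(match[1]);
  const height = Number(match[3]);

  // Zero is never a meaningful width or height.
  if (width === 0 || height === 0) return undefined;

  return { width, height };
}

// `size` uses "x" as the separator, `aspectRatio` uses ":".
const parseSize = (value: string) => parsePair(value, 'x');
const parseAspectRatio = (value: string) => parsePair(value, ':');

console.log(parseSize('1024x1024'));
console.log(parseAspectRatio('16:9'));
console.log(parseSize('16:9')); // wrong separator for a size
```

A caller could run `parseSize` or `parseAspectRatio` on user input and fall back to a default when the result is `undefined`.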

### Generating Multiple Images

`generateImage` also supports generating multiple images at once for models that support it:

```tsx highlight={"4"}
```tsx highlight={"7"}
import { experimental_generateImage as generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';

const { images } = await generateImage({
model: openai.image('dall-e-2'),
prompt: 'Santa Claus driving a Cadillac',
n: 4, // number of images to generate
});
```

### Providing a Seed

You can provide a seed to the `generateImage` function to control the output of the image generation process.
If supported by the model, the same seed will always produce the same image.

```tsx highlight={"7"}
import { experimental_generateImage as generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';

const { image } = await generateImage({
model: openai.image('dall-e-3'),
prompt: 'Santa Claus driving a Cadillac',
seed: 1234567890,
});
```
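
Whether a model honors the seed can be checked by generating twice with the same prompt and seed and comparing the resulting bytes. The sketch below shows only the comparison step; `sameBytes` is a hypothetical helper name, and the two `Uint8Array` values stand in for `image.uint8Array` from two separate `generateImage` calls.

```ts
// Hypothetical helper: byte-for-byte comparison of two images.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Stand-ins for `image.uint8Array` from two runs with the same seed.
const firstRun = new Uint8Array([137, 80, 78, 71]);
const secondRun = new Uint8Array([137, 80, 78, 71]);

console.log(sameBytes(firstRun, secondRun));
```

A byte comparison is strict: if a provider re-encodes the image or embeds metadata such as a timestamp, two visually identical images may still differ.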

### Provider-specific Settings

Image models often have provider- or even model-specific settings.
You can pass such settings to the `generateImage` function
using the `providerOptions` parameter. The options for the provider
(`openai` in the example below) become request body properties.

```tsx highlight={"5-7"}
```tsx highlight={"9"}
import { experimental_generateImage as generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';

const { image } = await generateImage({
model: openai.image('dall-e-3'),
prompt: 'Santa Claus driving a Cadillac',
@@ -93,9 +150,11 @@ const { image } = await generateImage({

## Image Models

| Provider | Model | Supported Sizes |
| ----------------------------------------------------------------------- | ------------------------------ | ------------------------------------------------------------------------------------------------------------- |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-generate-001` | See [aspect ratios](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#aspect-ratio) |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-fast-generate-001` | See [aspect ratios](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#aspect-ratio) |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-2` | 256x256, 512x512, 1024x1024 |
| Provider | Model | Sizes | Aspect Ratios |
| ----------------------------------------------------------------------- | ---------------------------------------------- | ------------------------------- | ----------------------------------------------- |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-generate-001` | Use aspect ratio | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-fast-generate-001` | Use aspect ratio | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 | Use size |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-2` | 256x256, 512x512, 1024x1024 | Use size |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/flux-1-dev-fp8` | Use aspect ratio | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/flux-1-schnell-fp8` | Use aspect ratio | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
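
The table can also be expressed as data and used to catch unsupported settings before calling `generateImage`. This is a sketch under assumptions: the values are copied from the table above, while `supportedSizes`, `supportedAspectRatios`, and `checkImageSettings` are hypothetical names that the AI SDK does not provide.

```ts
// Values copied from the Image Models table; names are hypothetical.
const imagenRatios = ['1:1', '3:4', '4:3', '9:16', '16:9'];
const fluxRatios = [
  '1:1', '2:3', '3:2', '4:5', '5:4', '16:9', '9:16', '9:21', '21:9',
];

const supportedSizes: Record<string, readonly string[]> = {
  'dall-e-3': ['1024x1024', '1792x1024', '1024x1792'],
  'dall-e-2': ['256x256', '512x512', '1024x1024'],
};

const supportedAspectRatios: Record<string, readonly string[]> = {
  'imagen-3.0-generate-001': imagenRatios,
  'imagen-3.0-fast-generate-001': imagenRatios,
  'accounts/fireworks/models/flux-1-dev-fp8': fluxRatios,
  'accounts/fireworks/models/flux-1-schnell-fp8': fluxRatios,
};

// Returns a list of problems; an empty list means the settings match the table.
function checkImageSettings(
  modelId: string,
  settings: { size?: string; aspectRatio?: string },
): string[] {
  const sizes = supportedSizes[modelId];
  const ratios = supportedAspectRatios[modelId];

  // Models missing from the table cannot be checked.
  if (sizes === undefined && ratios === undefined) {
    return [`${modelId} is not listed in the table`];
  }

  const problems: string[] = [];
  const { size, aspectRatio } = settings;
  if (size !== undefined && !(sizes ?? []).includes(size)) {
    problems.push(`size ${size} is not listed for ${modelId}`);
  }
  if (aspectRatio !== undefined && !(ratios ?? []).includes(aspectRatio)) {
    problems.push(`aspectRatio ${aspectRatio} is not listed for ${modelId}`);
  }
  return problems;
}

console.log(checkImageSettings('dall-e-3', { aspectRatio: '16:9' }));
console.log(checkImageSettings('imagen-3.0-generate-001', { aspectRatio: '16:9' }));
```

Because the supported values can change on the provider side, a check like this is best treated as an early warning rather than a guarantee.
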
13 changes: 13 additions & 0 deletions content/docs/07-reference/01-ai-sdk-core/10-generate-image.mdx
@@ -61,6 +61,19 @@ console.log(images);
description:
'Size of the images to generate. Format: `{width}x{height}`.',
},
{
name: 'aspectRatio',
type: 'string',
isOptional: true,
description:
'Aspect ratio of the images to generate. Format: `{width}:{height}`.',
},
{
name: 'seed',
type: 'number',
isOptional: true,
description: 'Seed for the image generation.',
},
{
name: 'providerOptions',
type: 'Record<string, Record<string, JSONValue>>',
7 changes: 6 additions & 1 deletion content/providers/01-ai-sdk-providers/01-openai.mdx
@@ -618,9 +618,14 @@ using the `.image()` factory method.
const model = openai.image('dall-e-3');
```

<Note>
DALL-E models do not support the `aspectRatio` parameter. Use the `size`
parameter instead.
</Note>

### Model Capabilities

| Model | Supported Sizes |
| Model | Sizes |
| ---------- | ------------------------------- |
| `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 |
| `dall-e-2` | 256x256, 512x512, 1024x1024 |
19 changes: 10 additions & 9 deletions content/providers/01-ai-sdk-providers/11-google-vertex.mdx
@@ -564,27 +564,28 @@ The following optional settings are available for Google Vertex AI embedding mod
You can create [Imagen](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview) models that call the [Imagen on Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images)
using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).

Note that Imagen does not support an explicit size parameter. Instead, size is driven by the [aspect ratio](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#aspect-ratio) of the input image.

```ts
import { vertex } from '@ai-sdk/google-vertex';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
model: vertex.image('imagen-3.0-generate-001'),
prompt: 'A futuristic cityscape at sunset',
providerOptions: {
vertex: { aspectRatio: '16:9' },
},
aspectRatio: '16:9',
});
```

<Note>
Imagen models do not support the `size` parameter. Use the `aspectRatio`
parameter instead.
</Note>

#### Model Capabilities

| Model | Supported Sizes |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------- |
| `imagen-3.0-generate-001` | See [aspect ratios](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#aspect-ratio) |
| `imagen-3.0-fast-generate-001` | See [aspect ratios](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#aspect-ratio) |
| Model | Aspect Ratios |
| ------------------------------ | ------------------------- |
| `imagen-3.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| `imagen-3.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |

## Google Vertex Anthropic Provider Usage

52 changes: 40 additions & 12 deletions content/providers/01-ai-sdk-providers/26-fireworks.mdx
@@ -87,25 +87,15 @@ const { text } = await generateText({

Fireworks language models can also be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)).

## Completion Models
### Completion Models

You can create models that call the Fireworks completions API using the `.completion()` factory method:

```ts
const model = fireworks.completion('accounts/fireworks/models/firefunction-v1');
```

## Embedding Models

You can create models that call the Fireworks embeddings API using the `.textEmbeddingModel()` factory method:

```ts
const model = fireworks.textEmbeddingModel(
'accounts/fireworks/models/nomic-embed-text-v1',
);
```

## Model Capabilities
### Model Capabilities

| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
| ---------------------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
@@ -124,3 +114,41 @@ const model = fireworks.textEmbeddingModel(
The table above lists popular models. Please see the [Fireworks models
page](https://fireworks.ai/models) for a full list of available models.
</Note>

## Embedding Models

You can create models that call the Fireworks embeddings API using the `.textEmbeddingModel()` factory method:

```ts
const model = fireworks.textEmbeddingModel(
'accounts/fireworks/models/nomic-embed-text-v1',
);
```

## Image Models

You can create Fireworks image models using the `.image()` factory method.
For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).

```ts
import { fireworks } from '@ai-sdk/fireworks';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
model: fireworks.image('accounts/fireworks/models/flux-1-dev-fp8'),
prompt: 'A futuristic cityscape at sunset',
aspectRatio: '16:9',
});
```

<Note>
Fireworks models do not support the `size` parameter. Use the `aspectRatio`
parameter instead.
</Note>

### Model Capabilities

| Model | Aspect Ratios |
| ---------------------------------------------- | ----------------------------------------------- |
| `accounts/fireworks/models/flux-1-dev-fp8` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
| `accounts/fireworks/models/flux-1-schnell-fp8` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
28 changes: 27 additions & 1 deletion examples/ai-core/src/e2e/feature-test-suite.ts
@@ -1,5 +1,6 @@
import { z } from 'zod';
import {
experimental_generateImage as generateImage,
generateText,
generateObject,
streamText,
@@ -10,12 +11,17 @@ import {
} from 'ai';
import fs from 'fs';
import { describe, expect, it, vi } from 'vitest';
import type { EmbeddingModelV1, LanguageModelV1 } from '@ai-sdk/provider';
import type {
EmbeddingModelV1,
ImageModelV1,
LanguageModelV1,
} from '@ai-sdk/provider';

export interface ModelVariants {
invalidModel?: LanguageModelV1;
languageModels?: LanguageModelV1[];
embeddingModels?: EmbeddingModelV1<string>[];
imageModels?: ImageModelV1[];
}

export interface TestSuiteOptions {
@@ -369,5 +375,25 @@ export function createFeatureTestSuite({
);
}
});

describe.each(createModelObjects(models.imageModels))(
'Image Model: $modelId',
({ model }) => {
it('should generate an image', async () => {
const result = await generateImage({
model,
prompt: 'A cute cartoon cat',
});

// Verify we got a base64 string back
expect(result.image.base64).toBeTruthy();
expect(typeof result.image.base64).toBe('string');

// Check the decoded length is reasonable (at least 10KB)
const decoded = Buffer.from(result.image.base64, 'base64');
expect(decoded.length).toBeGreaterThan(10 * 1024);
});
},
);
};
}
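
The size check in the test above decodes the whole base64 string just to measure it. As an alternative sketch (not what the commit does), the decoded length can be computed from the string length alone. `base64DecodedLength` and `isPlausibleImage` are hypothetical names, and the sketch assumes standard base64 with no whitespace or line breaks.

```ts
// Hypothetical helpers; assumes standard base64 without whitespace.
function base64DecodedLength(base64: string): number {
  // Every 4 base64 characters encode 3 bytes; trailing "=" is padding only.
  const unpadded = base64.replace(/=+$/, '');
  return Math.floor((unpadded.length * 3) / 4);
}

// Mirrors the "at least 10KB" threshold used in the test suite.
function isPlausibleImage(base64: string, minBytes = 10 * 1024): boolean {
  return base64DecodedLength(base64) > minBytes;
}

console.log(base64DecodedLength('TWFu'));
console.log(isPlausibleImage('A'.repeat(100)));
```

This avoids allocating a buffer for large images, at the cost of not verifying that the string is valid base64.
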
1 change: 1 addition & 0 deletions examples/ai-core/src/e2e/fireworks.test.ts
@@ -21,6 +21,7 @@ createFeatureTestSuite({
embeddingModels: [
provider.textEmbeddingModel('nomic-ai/nomic-embed-text-v1.5'),
],
imageModels: [provider.image('accounts/fireworks/models/flux-1-dev-fp8')],
},
timeout: 10000,
customAssertions: {
26 changes: 26 additions & 0 deletions examples/ai-core/src/generate-image/fireworks.ts
@@ -0,0 +1,26 @@
import 'dotenv/config';
import { fireworks } from '@ai-sdk/fireworks';
import { experimental_generateImage as generateImage } from 'ai';
import fs from 'fs';

async function main() {
const { image } = await generateImage({
model: fireworks.image('accounts/fireworks/models/flux-1-dev-fp8'),
prompt: 'A burrito launched through a tunnel',
aspectRatio: '4:3',
seed: 0, // 0 selects a random seed for this model
providerOptions: {
fireworks: {
// https://fireworks.ai/models/fireworks/flux-1-dev-fp8/playground
guidance_scale: 10,
num_inference_steps: 10,
},
},
});

const filename = `image-${Date.now()}.png`;
fs.writeFileSync(filename, image.uint8Array);
console.log(`Image saved to ${filename}`);
}

main().catch(console.error);
4 changes: 3 additions & 1 deletion examples/ai-core/src/generate-image/google-vertex.ts
@@ -7,9 +7,11 @@ async function main() {
const { image } = await generateImage({
model: vertex.image('imagen-3.0-generate-001'),
prompt: 'A burrito launched through a tunnel',
aspectRatio: '1:1',
providerOptions: {
vertex: {
aspectRatio: '16:9',
// https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#parameter_list
addWatermark: false,
},
},
});
4 changes: 0 additions & 4 deletions examples/ai-core/src/generate-image/openai.ts
@@ -7,10 +7,6 @@ async function main() {
const { image } = await generateImage({
model: openai.image('dall-e-3'),
prompt: 'Santa Claus driving a Cadillac',
size: '1024x1024',
providerOptions: {
openai: { style: 'vivid', quality: 'hd' },
},
});

const filename = `image-${Date.now()}.png`;