AI

Learn about AI capabilities in the API; generate text, images, and videos using Cosmic AI.

Cosmic provides AI-powered text, image, and video generation capabilities through the API and SDK.

Base URL

Use the following base URL for the AI text, image, and video generation endpoints.

https://workers.cosmicjs.com

POST /v3/buckets/:bucket_slug/ai/text

Generate Text

This endpoint enables you to generate text content using AI models.

Required parameters

You must provide either a prompt or messages parameter.

  • Name
    prompt
    Type
    string
    Description

    A text prompt for the AI to generate content from. Use this for simple, single-prompt text generation.

  • Name
    messages
    Type
    array
    Description

    An array of message objects for chat-based models. Each message object should have a role (either "user" or "assistant") and content (the message text).
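
The two request forms can be sketched as a small body builder. This is a minimal illustration, not part of the Cosmic SDK: the `Message` and `TextRequest` types and the `buildTextRequest` helper are hypothetical names, and the check only enforces the rule stated above (at least one of prompt or messages must be present).

```typescript
// Hypothetical types for illustration; not exported by @cosmicjs/sdk.
type Role = "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

interface TextRequest {
  prompt?: string;
  messages?: Message[];
  max_tokens?: number;
}

// Returns the JSON body for POST /v3/buckets/:bucket_slug/ai/text,
// rejecting input that supplies neither prompt nor messages.
function buildTextRequest(input: TextRequest): string {
  const hasPrompt = typeof input.prompt === "string" && input.prompt.length > 0;
  const hasMessages = Array.isArray(input.messages) && input.messages.length > 0;
  if (!hasPrompt && !hasMessages) {
    throw new Error("Provide either prompt or messages");
  }
  return JSON.stringify(input);
}

// A chat-style request: earlier turns give the model context for the last one.
const chatBody = buildTextRequest({
  messages: [
    { role: "user", content: "Suggest a name for a coffee mug brand" },
    { role: "assistant", content: "How about Morning Ritual?" },
    { role: "user", content: "Write a tagline for that name" },
  ],
  max_tokens: 200,
});
```

The resulting string can be sent as the request body in place of the prompt-only body used in the cURL examples.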

Optional parameters

  • Name
    model
    Type
    string
    Description

    The AI model to use for text generation. Options include Claude models (e.g., claude-opus-4-5-20251101), Gemini models (e.g., gemini-3-pro-preview), and OpenAI models (e.g., gpt-5). Defaults to claude-opus-4-5-20251101. See the Available Models section for the full list.

  • Name
    max_tokens
    Type
    number
    Description

    The maximum number of tokens to generate in the response. Higher values allow for longer responses but may increase processing time.

  • Name
    media_url
    Type
    string
    Description

    URL of a file to analyze. Can be any file type available in your Bucket including images, PDFs, Excel spreadsheets, Word documents, and more. The AI model will be able to analyze the content when generating text. Can be used with either prompt or messages.

  • Name
    stream
    Type
    boolean
    Description

    When set to true, enables streaming for real-time responses as they're generated, rather than waiting for the complete response. Default is false.

Request Examples

import { createBucketClient } from '@cosmicjs/sdk'

const cosmic = createBucketClient({
  bucketSlug: 'BUCKET_SLUG',
  readKey: 'BUCKET_READ_KEY',
  writeKey: 'BUCKET_WRITE_KEY'
})

// Using a simple prompt
const textResponse = await cosmic.ai.generateText({
  prompt: 'Write a product description for a coffee mug',
  max_tokens: 500
})

console.log(textResponse.text)
console.log(textResponse.usage)

Media Analysis Examples

import { createBucketClient } from '@cosmicjs/sdk'

const cosmic = createBucketClient({
  bucketSlug: 'BUCKET_SLUG',
  readKey: 'BUCKET_READ_KEY',
  writeKey: 'BUCKET_WRITE_KEY'
})

const imageAnalysis = await cosmic.ai.generateText({
  prompt: 'Describe this image in detail and suggest a caption for social media',
  media_url: 'https://cdn.cosmicjs.com/mountain-landscape.jpg',
  max_tokens: 500
})

console.log(imageAnalysis.text)
console.log(imageAnalysis.usage)

cURL Examples

curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
  -d '{"prompt":"Write a product description for a coffee mug","max_tokens":500}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

Response Examples

{
  "text": "Introducing our Artisan Ceramic Coffee Mug – the perfect companion for your daily brew. Crafted with care from high-quality ceramic, this elegant mug retains heat longer, ensuring your coffee stays at the ideal temperature. The ergonomic handle provides a comfortable grip, while the smooth, glazed interior prevents staining and makes cleaning effortless. Available in a range of sophisticated colors to match any kitchen aesthetic, this 12oz mug strikes the perfect balance between style and functionality.",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 89
  }
}

Streaming Capabilities

Text generation supports real-time streaming responses, allowing you to receive and display content as it's being generated.

Using with the SDK

import { TextStreamingResponse } from '@cosmicjs/sdk';

// Enable streaming with the stream: true parameter
const result = await cosmic.ai.generateText({
  prompt: 'Tell me about coffee mugs',
  // or use messages array format
  max_tokens: 500,
  stream: true // Enable streaming
});

// Cast the result to TextStreamingResponse
const stream = result as TextStreamingResponse;

// Option 1: Event-based approach
let fullResponse = '';
stream.on('text', (text) => {
  fullResponse += text;
  process.stdout.write(text); // Display text as it arrives
});
stream.on('usage', (usage) => console.log('Usage:', usage));
stream.on('end', (data) => console.log('Complete:', fullResponse));
stream.on('error', (error) => console.error('Error:', error));

// Option 2: For-await loop approach
async function processStream() {
  let fullResponse = '';
  try {
    for await (const chunk of stream) {
      if (chunk.text) {
        fullResponse += chunk.text;
        process.stdout.write(chunk.text);
      }
    }
    console.log('\nComplete text:', fullResponse);
  } catch (error) {
    console.error('Error:', error);
  }
}

Using the simplified stream method

// Simplified streaming method
const stream = await cosmic.ai.stream({
  prompt: 'Tell me about coffee mugs',
  max_tokens: 500,
});

// Process stream using events or for-await loop as shown above

Using cURL for streaming

curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}" \
  -d '{"prompt":"Tell me about coffee mugs","stream":true}' \
  --no-buffer

The TextStreamingResponse supports two usage patterns:

  1. Event-based: Extends EventEmitter with these events:

    • text: New text fragments
    • usage: Token usage information
    • end: Final data when stream completes
    • error: Stream errors
  2. AsyncIterator: For for-await loops, with chunk objects containing:

    • text: Text fragments
    • usage: Token usage information
    • end: Set to true for the final chunk
    • error: Error information
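
The AsyncIterator pattern can be exercised without calling the API by draining a stand-in stream. The sketch below is an illustration under assumptions: `StreamChunk` is a hypothetical shape that mirrors the chunk fields listed above (the exact type of `error` is not documented here, so a string is assumed), and `mockStream` stands in for a real SDK stream.

```typescript
// Hypothetical chunk shape mirroring the fields listed above.
interface StreamChunk {
  text?: string;
  usage?: { input_tokens: number; output_tokens: number };
  end?: boolean;
  error?: string; // assumed to be a message string
}

interface CollectedStream {
  text: string;
  usage?: { input_tokens: number; output_tokens: number };
}

// Drains any async-iterable of chunks into the full text and the last usage seen.
async function collectStream(
  chunks: AsyncIterable<StreamChunk>
): Promise<CollectedStream> {
  const result: CollectedStream = { text: "" };
  for await (const chunk of chunks) {
    if (chunk.error) throw new Error(chunk.error);
    if (chunk.text) result.text += chunk.text;
    if (chunk.usage) result.usage = chunk.usage;
    if (chunk.end) break; // final chunk
  }
  return result;
}

// Stand-in stream for local testing; a real stream would come from the SDK.
async function* mockStream(): AsyncGenerator<StreamChunk> {
  yield { text: "Coffee mugs " };
  yield { text: "keep drinks warm." };
  yield { usage: { input_tokens: 6, output_tokens: 7 }, end: true };
}
```

Because `collectStream` accepts any `AsyncIterable`, the same helper should work with a real streaming response if its chunks have this shape.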

POST /v3/buckets/:bucket_slug/ai/image

Generate Image

This endpoint enables you to create AI-generated images based on text prompts.

Required parameters

  • Name
    prompt
    Type
    string
    Description

    A text description of the image you want to generate. More detailed prompts typically yield better results.

Optional parameters

  • Name
    model
    Type
    string
    Description

    The image generation model to use. Options: gemini-3-pro-image-preview (recommended) or dall-e-3. Defaults to gemini-3-pro-image-preview.

  • Name
    size
    Type
    string
    Description

    The size of the generated image. For DALL-E 3: 1024x1024, 1792x1024, or 1024x1792. For Gemini 3 Pro Image: 1024x1024, 2048x2048, or 4096x4096. Defaults to 1024x1024.

  • Name
    quality
    Type
    string
    Description

    The quality of the generated image (DALL-E 3 only). Options: standard or hd. Defaults to standard. HD quality produces more detailed images but costs more.

  • Name
    reference_images
    Type
    array
    Description

    Array of reference image URLs to provide context for image generation. The AI will analyze these images and use them to inform the generated image. Only supported with gemini-3-pro-image-preview model.

  • Name
    folder
    Type
    string
    Description

    Media folder to store the generated image in. (Folder must exist)

  • Name
    alt_text
    Type
    string
    Description

    Alt text for the generated image.

  • Name
    metadata
    Type
    object
    Description

    User-added JSON metadata for the generated image.
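
Several of these options depend on the chosen model, so it can help to check a request before sending it. The following is a minimal sketch based only on the parameter descriptions above; `ImageOptions` and `checkImageOptions` are hypothetical names, and how the API itself responds to an unsupported combination (rejecting or ignoring it) is not specified here.

```typescript
type ImageModel = "gemini-3-pro-image-preview" | "dall-e-3";

// Hypothetical option shape; field names follow the parameters above.
interface ImageOptions {
  prompt: string;
  model?: ImageModel;
  size?: string;
  quality?: "standard" | "hd";
  reference_images?: string[];
}

// Sizes listed per model in the size parameter description.
const SIZES: Record<ImageModel, string[]> = {
  "dall-e-3": ["1024x1024", "1792x1024", "1024x1792"],
  "gemini-3-pro-image-preview": ["1024x1024", "2048x2048", "4096x4096"],
};

// Returns a list of problems; an empty list means the options are
// consistent with the parameter descriptions.
function checkImageOptions(opts: ImageOptions): string[] {
  const model: ImageModel = opts.model ?? "gemini-3-pro-image-preview";
  const problems: string[] = [];
  if (opts.size && SIZES[model].indexOf(opts.size) === -1) {
    problems.push(`size ${opts.size} is not listed for ${model}`);
  }
  if (opts.quality && model !== "dall-e-3") {
    problems.push("quality applies to dall-e-3 only");
  }
  const refCount = opts.reference_images ? opts.reference_images.length : 0;
  if (refCount > 0 && model !== "gemini-3-pro-image-preview") {
    problems.push("reference_images requires gemini-3-pro-image-preview");
  }
  return problems;
}

// A 4K size is listed for Gemini but not for DALL-E 3.
const geminiProblems = checkImageOptions({ prompt: "A city at night", size: "4096x4096" });
const dalleProblems = checkImageOptions({
  prompt: "A city at night",
  model: "dall-e-3",
  size: "4096x4096",
});
```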

Request Examples

import { createBucketClient } from '@cosmicjs/sdk'

const cosmic = createBucketClient({
  bucketSlug: 'BUCKET_SLUG',
  readKey: 'BUCKET_READ_KEY',
  writeKey: 'BUCKET_WRITE_KEY'
})

const imageResponse = await cosmic.ai.generateImage({
  prompt: 'A serene mountain landscape at sunset',
  folder: 'ai-generated',
  alt_text: 'AI-generated mountain landscape at sunset',
  metadata: {
    caption: 'Beautiful mountain vista',
    generated_by: 'Cosmic AI'
  }
})

console.log(imageResponse.media.url)
console.log(imageResponse.media.imgix_url)

Response Example

{
  "media": {
    "id": "65f3a2c8853cca45f4c9fd96",
    "name": "c20391e0-b8a4-11e6-8836-fbdfd6956b31-mountain-sunset.png",
    "original_name": "mountain-sunset.png",
    "size": 457307,
    "folder": "ai-generated",
    "type": "image/png",
    "bucket": "5839c67f0d3201c114000004",
    "created_at": "2024-03-14T15:34:05.054Z",
    "url": "https://cdn.cosmicjs.com/c20391e0-b8a4-11e6-8836-fbdfd6956b31-mountain-sunset.png",
    "imgix_url": "https://imgix.cosmicjs.com/c20391e0-b8a4-11e6-8836-fbdfd6956b31-mountain-sunset.png",
    "alt_text": "AI-generated mountain landscape at sunset",
    "metadata": {
      "caption": "Beautiful mountain vista",
      "generated_by": "Cosmic AI"
    }
  },
  "revised_prompt": "A serene mountain landscape at sunset with golden light illuminating the peaks, reflecting in a calm lake below, and wispy clouds catching the last rays of sunlight."
}

POST /v3/buckets/:bucket_slug/ai/video

Generate Video

This endpoint enables you to create AI-generated videos using Google's Veo 3.1 models with native audio generation.

Required parameters

  • Name
    prompt
    Type
    string
    Description

    A detailed text description of the video you want to generate. More descriptive prompts yield better results.

Optional parameters

  • Name
    model
    Type
    string
    Description

    The video generation model to use. Options: veo-3.1-fast-generate-preview (recommended, faster) or veo-3.1-generate-preview (premium quality). Defaults to veo-3.1-fast-generate-preview.

  • Name
    duration
    Type
    number
    Description

    Video duration in seconds. Options: 4, 6, or 8. Defaults to 8.

  • Name
    resolution
    Type
    string
    Description

    Video resolution. Options: 720p or 1080p. Defaults to 720p.

  • Name
    reference_images
    Type
    array
    Description

    Array with 1 reference image URL to use as the first frame for video generation. Veo uses image-to-video mode to animate from this starting frame, ensuring precise control over the initial appearance, composition, and style. Ideal for product showcases, character consistency, and brand-accurate animations. Maximum 1 image.

  • Name
    folder
    Type
    string
    Description

    Media folder to store the generated video in. (Folder must exist)

  • Name
    metadata
    Type
    object
    Description

    User-added JSON metadata for the generated video.
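
The defaults and allowed values above can be captured in a small normalizing helper. This is a sketch under assumptions, not SDK behavior: `VideoOptions` and `normalizeVideoOptions` are hypothetical names, and the checks simply mirror the documented options (durations of 4, 6, or 8 seconds, 720p or 1080p, at most 1 reference image).

```typescript
// Hypothetical option shape; field names follow the parameters above.
interface VideoOptions {
  prompt: string;
  model?: string;
  duration?: number;
  resolution?: string;
  reference_images?: string[];
}

const DURATIONS = [4, 6, 8];
const RESOLUTIONS = ["720p", "1080p"];

// Fills in the documented defaults and rejects values outside the documented options.
function normalizeVideoOptions(opts: VideoOptions): Required<VideoOptions> {
  const duration = opts.duration ?? 8;
  const resolution = opts.resolution ?? "720p";
  const reference_images = opts.reference_images ?? [];
  if (DURATIONS.indexOf(duration) === -1) {
    throw new Error("duration must be 4, 6, or 8 seconds");
  }
  if (RESOLUTIONS.indexOf(resolution) === -1) {
    throw new Error("resolution must be 720p or 1080p");
  }
  if (reference_images.length > 1) {
    throw new Error("at most 1 reference image is supported");
  }
  return {
    prompt: opts.prompt,
    model: opts.model ?? "veo-3.1-fast-generate-preview",
    duration,
    resolution,
    reference_images,
  };
}

// With only a prompt, the documented defaults apply.
const normalizedVideo = normalizeVideoOptions({ prompt: "A kitten playing with yarn" });
```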

Request Examples

import { createBucketClient } from '@cosmicjs/sdk'

const cosmic = createBucketClient({
  bucketSlug: 'BUCKET_SLUG',
  readKey: 'BUCKET_READ_KEY',
  writeKey: 'BUCKET_WRITE_KEY'
})

const videoResponse = await cosmic.ai.generateVideo({
  prompt: 'A calico kitten playing with a ball of yarn in golden sunlight',
  duration: 8,
  resolution: '720p',
  folder: 'ai-videos'
})

console.log(videoResponse.media.url)
console.log(videoResponse.usage)

Response Example

{
  "media": {
    "id": "65f8a3b2c4d5e6f7g8h9i0j1",
    "name": "veo-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
    "original_name": "veo-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
    "size": 8450000,
    "type": "video/mp4",
    "bucket": "65a1b2c3d4e5f6g7h8i9j0k1",
    "created_at": "2025-12-20T10:30:00.000Z",
    "folder": "ai-videos",
    "url": "https://cdn.cosmicjs.com/veo-xxx.mp4",
    "imgix_url": "https://imgix.cosmicjs.com/veo-xxx.mp4",
    "alt_text": "A calico kitten playing with a ball of yarn in golden sunlight",
    "metadata": {
      "duration": 8,
      "resolution": "720p",
      "generation_time_seconds": 45
    }
  },
  "usage": {
    "input_tokens": 288000,
    "output_tokens": 0,
    "total_tokens": 288000
  },
  "generation_time_seconds": 45
}

Available Models

Cosmic supports a variety of AI models from Anthropic, Google, and OpenAI for text, image, and video generation.

Text Generation Models

Anthropic Claude Models

| Model | API Model ID | Description |
| --- | --- | --- |
| Claude Opus 4.5 | claude-opus-4-5-20251101 | Latest flagship model with improved performance |
| Claude Opus 4.1 | claude-opus-4-1-20250805 | Most powerful and capable model |
| Claude Opus 4 | claude-opus-4-20250514 | Previous flagship model |
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | High-performance model (recommended) |
| Claude Sonnet 4 | claude-sonnet-4-20250514 | High-performance model with exceptional reasoning |
| Claude Haiku 3.5 | claude-3-5-haiku-20241022 | Fastest model for simple tasks |

Google Gemini Models

| Model | API Model ID | Description |
| --- | --- | --- |
| Gemini 3 Pro 🍌 | gemini-3-pro-preview | Google's most intelligent model with state-of-the-art reasoning, built for agentic workflows |
| Gemini 3 Pro Image 🍌 | gemini-3-pro-image-preview | Native image generation with contextual understanding, supports up to 4K images |
| Veo 3.1 Fast 📹 | veo-3.1-fast-generate-preview | Fast video generation with native audio (recommended for most use cases) |
| Veo 3.1 Standard 📹 | veo-3.1-generate-preview | Premium video generation with exceptional quality and cinematic realism |

OpenAI GPT Models

| Model | API Model ID | Description |
| --- | --- | --- |
| GPT-5.2 | gpt-5.2 | Latest GPT-5 model with improved performance |
| GPT-5 | gpt-5 | OpenAI's most advanced multimodal model |
| GPT-5 Mini | gpt-5-mini | Cost-effective GPT-5 variant |
| GPT-5 Nano | gpt-5-nano | Ultra-fast and cost-effective for simple tasks |
| GPT-4.1 | gpt-4.1 | Enhanced coding and reasoning capabilities |
| GPT-4o | gpt-4o | Flagship multimodal model |
| o1 | o1 | Advanced reasoning model for complex problems |
| o3 | o3 | Latest reasoning model with improved performance |

Text Generation Model Selection

By default, Cosmic uses Claude Opus 4.5 (claude-opus-4-5-20251101) for text generation. To use a different model, include the model parameter in your request:

// Using Claude
const response = await cosmic.ai.generateText({
  prompt: 'Write a product description',
  model: 'claude-opus-4-5-20251101', // Optional: specify model
  max_tokens: 500
})

// Using Gemini for agentic workflows
const geminiResponse = await cosmic.ai.generateText({
  prompt: 'Analyze this content and suggest improvements',
  model: 'gemini-3-pro-preview', // Use Gemini 3 Pro
  max_tokens: 500
})

Or using cURL:

# Using Claude
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
  -d '{"prompt":"Write a product description","model":"claude-opus-4-5-20251101","max_tokens":500}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

# Using Gemini
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
  -d '{"prompt":"Analyze this content and suggest improvements","model":"gemini-3-pro-preview","max_tokens":500}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

Image Generation Models

Google Gemini Image Models

| Model | API Model ID | Description | Supported Sizes |
| --- | --- | --- | --- |
| Gemini 3 Pro Image 🍌 | gemini-3-pro-image-preview | Native image generation with contextual understanding (default, recommended) | 1024x1024, 2048x2048, 4096x4096 |

OpenAI DALL-E Models

| Model | API Model ID | Description |
| --- | --- | --- |
| DALL-E 3 | dall-e-3 | OpenAI's latest image generation model |

Image Generation Model Selection

By default, Cosmic uses Gemini 3 Pro Image for image generation. You can explicitly specify the model by including the model parameter in your request.

Gemini 3 Pro Image supports reference images for contextual generation. You can provide URLs to existing images, and Gemini will analyze their style, composition, and content to inform the generated image. This is perfect for:

  • Maintaining consistent visual styles across generated images
  • Creating variations based on existing artwork
  • Applying the aesthetic of one image to a new scene
  • Combining elements from multiple reference images

// Using DALL-E 3
const dalleResponse = await cosmic.ai.generateImage({
  prompt: 'A futuristic cityscape at night',
  model: 'dall-e-3', // Must be set explicitly; the default is gemini-3-pro-image-preview
  size: '1792x1024',
  quality: 'hd'
})

// Using Gemini 3 Pro Image for 4K generation
const geminiResponse = await cosmic.ai.generateImage({
  prompt: 'A futuristic cityscape at night with neon lights',
  model: 'gemini-3-pro-image-preview', // Use Gemini for higher resolution
  size: '4096x4096' // 4K image support (quality is a DALL-E 3 only option, so it is omitted here)
})

// Using Gemini with reference images for style consistency
const styledResponse = await cosmic.ai.generateImage({
  prompt: 'A mountain landscape in the same artistic style',
  model: 'gemini-3-pro-image-preview',
  reference_images: [
    'https://cdn.cosmicjs.com/your-style-reference.jpg'
  ],
  size: '2048x2048'
})

Or using cURL:

# Using DALL-E 3
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/image \
  -d '{"prompt":"A futuristic cityscape at night","model":"dall-e-3","size":"1792x1024","quality":"hd"}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

# Using Gemini 3 Pro Image
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/image \
  -d '{"prompt":"A futuristic cityscape at night","model":"gemini-3-pro-image-preview","size":"4096x4096"}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

Video Generation Models

Google Veo Models

| Model | API Model ID | Description |
| --- | --- | --- |
| Veo 3.1 Fast | veo-3.1-fast-generate-preview | Faster generation, excellent quality (30-90s) |
| Veo 3.1 Standard | veo-3.1-generate-preview | Premium quality, cinematic realism (60-180s) |

Video Generation Model Selection

By default, Cosmic uses Veo 3.1 Fast for video generation. Both models support:

  • Native Audio: Automatically generated audio that matches the scene
  • Flexible Durations: 4, 6, or 8 seconds
  • High Resolution: 720p or 1080p
  • Image-to-Video Mode: Use 1 reference image as the starting frame

Veo 3.1 Fast is recommended for most use cases, offering excellent quality with faster generation times. Use Veo 3.1 Standard for premium marketing content and cinematic quality.

// Using Veo Fast (recommended)
const fastVideo = await cosmic.ai.generateVideo({
  prompt: 'A peaceful zen garden with koi fish swimming',
  model: 'veo-3.1-fast-generate-preview',
  duration: 8,
  resolution: '720p'
})

// Using Veo Standard for premium quality
const premiumVideo = await cosmic.ai.generateVideo({
  prompt: 'Cinematic shot of city skyline at golden hour',
  model: 'veo-3.1-generate-preview',
  duration: 8,
  resolution: '1080p'
})

// Using reference image as starting frame (image-to-video)
const styledVideo = await cosmic.ai.generateVideo({
  prompt: 'Product rotates smoothly revealing all angles',
  model: 'veo-3.1-fast-generate-preview',
  duration: 6,
  reference_images: [
    'https://cdn.cosmicjs.com/product-hero.jpg'
  ]
})

Or using cURL:

# Using Veo Fast
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/video \
  -d '{"prompt":"A peaceful zen garden with koi fish","model":"veo-3.1-fast-generate-preview","duration":8}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

# Using Veo Standard for premium quality
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/video \
  -d '{"prompt":"Cinematic city skyline at golden hour","model":"veo-3.1-generate-preview","duration":8,"resolution":"1080p"}' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${BUCKET_WRITE_KEY}"

Using Reference Images with Video Generation (Image-to-Video Mode)

Veo 3.1 supports image-to-video mode, where you provide 1 reference image that becomes the first frame of your video. This ensures precise control over the starting appearance, composition, and style.

What Image-to-Video Does:

  • Precise Starting Frame: Your reference image becomes the exact first frame of the video
  • Product Accuracy: Start with your exact product photo and animate from there
  • Character Consistency: Begin with a specific character pose or appearance
  • Brand Control: Ensure videos start with approved brand imagery

Best Practices:

// Product showcase starting from hero image
const productVideo = await cosmic.ai.generateVideo({
  prompt: 'Product rotates smoothly, revealing all angles with studio lighting',
  reference_images: [
    'https://cdn.cosmicjs.com/product-hero-front.jpg'
  ],
  duration: 8
})

// Character animation from specific pose
const characterVideo = await cosmic.ai.generateVideo({
  prompt: 'Character waves and smiles at camera with friendly expression',
  reference_images: [
    'https://cdn.cosmicjs.com/character-neutral-pose.jpg'
  ],
  duration: 6
})

// Brand scene animation
const brandVideo = await cosmic.ai.generateVideo({
  prompt: 'Camera slowly zooms in while maintaining brand aesthetic',
  reference_images: [
    'https://cdn.cosmicjs.com/brand-scene-wide.jpg'
  ],
  duration: 8
})

Guidelines:

  • Use high-resolution images (1024x1024+ recommended)
  • Only 1 reference image supported (becomes the first frame)
  • Ensure URL is publicly accessible
  • Your prompt should describe the motion/animation starting from that image
  • Think of it as "how should this image come to life?"

Advanced Veo 3.1 Capabilities

According to Google's Veo 3.1 documentation, Veo 3.1 supports several advanced features:

✅ Image-to-Video Mode

Use 1 reference image as the first frame of your video. Veo animates from this starting point with precise control over initial appearance. See examples above.

✅ Video Extension

Extend previously generated videos to create longer content. See the Extend Video section below for full documentation.


POST /v3/buckets/:bucket_slug/ai/video/extend

Extend Video

This endpoint enables you to extend a previously generated Veo video by creating a new 8-second clip that continues from the final second of the original video.

Required parameters

  • Name
    prompt
    Type
    string
    Description

    A text description of how to continue the video. Should describe the motion/action that follows the original video.

  • Name
    media_id
    Type
    string
    Description

    The ID of the original Veo-generated video to extend. The video must have been generated using the Generate Video endpoint.

Optional parameters

  • Name
    model
    Type
    string
    Description

    The video generation model to use. Options: veo-3.1-fast-generate-preview (recommended) or veo-3.1-generate-preview. Defaults to veo-3.1-fast-generate-preview.

  • Name
    folder
    Type
    string
    Description

    Media folder to store the extended video in. (Folder must exist)

  • Name
    metadata
    Type
    object
    Description

    User-added JSON metadata for the extended video.

Limitations

  • Duration: Extensions are always 8 seconds
  • Resolution: Extensions are always 720p (even if the original was 1080p)
  • Source: Only Veo-generated videos can be extended (must have veo_file_uri in metadata)
  • Chaining: Extended videos can also be extended, enabling creation of longer narratives
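
Because every extension adds a fixed 8 seconds, the number of extension calls needed for a target length is simple arithmetic. The sketch below assumes the total length is the initial clip plus 8 seconds per extension, with no overlap between segments; `extensionsNeeded` is a hypothetical helper, not an SDK function.

```typescript
const EXTENSION_SECONDS = 8; // extensions are always 8 seconds

// Number of extension calls needed to reach at least targetSeconds,
// assuming total length = initial clip + 8 seconds per extension.
function extensionsNeeded(initialSeconds: number, targetSeconds: number): number {
  if (targetSeconds <= initialSeconds) return 0;
  return Math.ceil((targetSeconds - initialSeconds) / EXTENSION_SECONDS);
}

// Reaching one minute from an 8-second clip: ceil(52 / 8) = 7 extensions.
const extensionCalls = extensionsNeeded(8, 60);
const totalSeconds = 8 + extensionCalls * EXTENSION_SECONDS; // 64 seconds
```

Each extension is a separate generation request, so the count also determines how many requests are billed.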

Request Examples

import { createBucketClient } from '@cosmicjs/sdk'

const cosmic = createBucketClient({
  bucketSlug: 'BUCKET_SLUG',
  readKey: 'BUCKET_READ_KEY',
  writeKey: 'BUCKET_WRITE_KEY'
})

// First, generate an initial video
const initialVideo = await cosmic.ai.generateVideo({
  prompt: 'A calico kitten sitting peacefully in golden sunlight',
  duration: 8,
  resolution: '720p'
})

console.log('Initial video ID:', initialVideo.media.id)

// Then extend it with a continuation
const extendedVideo = await cosmic.ai.extendVideo({
  media_id: initialVideo.media.id,
  prompt: 'The kitten stands up and walks away into the garden'
})

console.log('Extended video URL:', extendedVideo.media.url)
console.log('Source video ID:', extendedVideo.source_media_id)

Response Example

{
  "media": {
    "id": "65f8b4c3d5e6f7g8h9i0j1k2",
    "name": "veo-ext-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
    "original_name": "veo-ext-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
    "size": 8650000,
    "type": "video/mp4",
    "bucket": "65a1b2c3d4e5f6g7h8i9j0k1",
    "created_at": "2025-12-20T10:35:00.000Z",
    "folder": null,
    "url": "https://cdn.cosmicjs.com/veo-ext-xxx.mp4",
    "imgix_url": "https://imgix.cosmicjs.com/veo-ext-xxx.mp4",
    "alt_text": "The scene continues with the character walking away",
    "metadata": {
      "duration": 8,
      "resolution": "720p",
      "generation_time_seconds": 52,
      "is_extension": true,
      "source_media_id": "65f8a3b2c4d5e6f7g8h9i0j1",
      "veo_file_uri": "https://generativelanguage.googleapis.com/v1beta/files/..."
    }
  },
  "usage": {
    "input_tokens": 288000,
    "output_tokens": 0,
    "total_tokens": 288000
  },
  "generation_time_seconds": 52,
  "source_media_id": "65f8a3b2c4d5e6f7g8h9i0j1",
  "is_extension": true
}

Video Extension Use Cases

Creating Longer Narratives

Chain multiple 8-second extensions to create minute-long videos or longer stories:

// Build a 32-second video (4 segments)
let currentVideo = await cosmic.ai.generateVideo({
  prompt: 'Opening scene: sunrise over mountains',
  duration: 8
})

const segments = [currentVideo]

for (const continuation of [
  'Birds take flight from the trees below',
  'A hiker appears on the trail in the distance',
  'Close-up of the hiker reaching the summit'
]) {
  currentVideo = await cosmic.ai.extendVideo({
    media_id: currentVideo.media.id,
    prompt: continuation
  })
  segments.push(currentVideo)
}

console.log(`Created ${segments.length} connected segments (${segments.length * 8}s total)`)

Extending Product Showcases

Continue product demonstrations with smooth transitions:

const productIntro = await cosmic.ai.generateVideo({
  prompt: 'Product slowly rotates on white background',
  reference_images: ['https://cdn.cosmicjs.com/product.jpg'],
  duration: 8
})

const productDetails = await cosmic.ai.extendVideo({
  media_id: productIntro.media.id,
  prompt: 'Camera zooms in to show product texture and details'
})

Building Background Loops

Create longer ambient footage for websites or presentations:

const ambientBase = await cosmic.ai.generateVideo({
  prompt: 'Gentle rain falling on window with city lights blurred behind',
  duration: 8
})

const ambientExtended = await cosmic.ai.extendVideo({
  media_id: ambientBase.media.id,
  prompt: 'Rain continues with occasional lightning flash in the distance'
})

Usage and Limitations

AI capabilities in Cosmic are subject to the following considerations:

  • Rate Limits: AI generation requests may be subject to rate limiting based on your plan.
  • Token Usage: Text, image, and video generation consume tokens. Media generation (images and videos) has higher token costs due to computational requirements.
  • Media Storage: Generated images and videos are stored in your Cosmic Bucket's media library and count toward your storage quota.
  • Video Generation Time: Video generation is asynchronous and typically takes 30-90 seconds (Fast) or 60-180 seconds (Standard).
  • Content Policy: Generated content must comply with Cosmic's terms of service and content policies. Video generation can fail with only a generic error (for example, "No videos generated") when the prompt violates content policies (e.g., celebrity likeness, copyrighted material, or safety concerns). If this happens, try rephrasing your prompt to avoid references to real people, brands, or copyrighted characters.
  • Regional Restrictions: In EU, UK, CH, MENA regions, person generation has limitations (see Google's documentation for details).
  • Watermarking: Videos created by Veo are watermarked using SynthID for AI-generated content identification.

Pricing Overview

All AI features consume tokens from your monthly allocation or token packs. Token costs vary by model complexity and media type.

Text Generation Pricing

Text generation uses tiered pricing based on model capability. More powerful models cost more tokens to use, reflecting their higher computational requirements and superior performance.

How Token Multipliers Work: The number of tokens deducted from your balance is your actual token usage multiplied by the tier multiplier:

| Tier | Multiplier | Models | Token Deduction |
| --- | --- | --- | --- |
| Budget | 1.0x | GPT-5 Nano, GPT-4.1 Nano, GPT-4o Mini, GPT-5 Mini, GPT-4.1 Mini, Claude Haiku 3, Claude Haiku 3.5 | 1,000 actual tokens = 1,000 deducted |
| Standard | 2.0x | GPT-5, GPT-5.2, GPT-4.1, GPT-4o, o1-mini, o3, o3-mini, o4-mini, Claude Sonnet 4, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro | 1,000 actual tokens = 2,000 deducted |
| Premium | 4.0x | Claude Opus 4, Claude Opus 4.1, o1, o3-pro | 1,000 actual tokens = 4,000 deducted |

Example: Using Claude Sonnet 4.5 (Standard tier) with a 1,000 token response will deduct 2,000 tokens from your balance (1,000 × 2.0x).
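
The multiplier rule can be written as a one-line calculation. This is an illustrative sketch only: the `Tier` type and `tokensDeducted` helper are hypothetical names, and the multipliers are the Budget, Standard, and Premium values given above.

```typescript
type Tier = "budget" | "standard" | "premium";

// Tier multipliers from the text generation pricing tiers.
const MULTIPLIER: Record<Tier, number> = { budget: 1, standard: 2, premium: 4 };

// Tokens deducted from the balance for a given actual token count.
function tokensDeducted(actualTokens: number, tier: Tier): number {
  return actualTokens * MULTIPLIER[tier];
}

const standardCost = tokensDeducted(1000, "standard"); // 2000
const premiumCost = tokensDeducted(1000, "premium"); // 4000
```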

Media Generation Pricing

Images and videos use fixed token costs. All media generation costs are billed as output tokens.

| Feature | Token Cost (Output) |
| --- | --- |
| DALL-E 3 Image | 4,800 tokens |
| Gemini 1K/2K Image | 32,160 tokens |
| Gemini 4K Image | 57,600 tokens |
| Veo Fast Video (4s) | 144,000 tokens |
| Veo Fast Video (6s) | 216,000 tokens |
| Veo Fast Video (8s) | 288,000 tokens |
| Veo Fast Extension | 288,000 tokens |
| Veo Standard Video (4s) | 384,000 tokens |
| Veo Standard Video (6s) | 576,000 tokens |
| Veo Standard Video (8s) | 768,000 tokens |
| Veo Standard Extension | 768,000 tokens |

Note: Input tokens (your prompt) for media generation are minimal compared to the output cost. The token costs listed above represent the generation cost and are deducted from your output token allocation.
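
To estimate the cost of a video plus its extensions, the fixed costs above can be placed in a lookup table. This is a minimal sketch: `videoChainCost` and the `fast`/`standard` keys are hypothetical names, the figures are copied from the pricing list, and the small input-token cost of the prompt is ignored.

```typescript
type VeoModel = "fast" | "standard";
type Duration = 4 | 6 | 8;

// Fixed token costs per initial video, keyed by model and duration in seconds.
const VIDEO_COST: Record<VeoModel, Record<Duration, number>> = {
  fast: { 4: 144000, 6: 216000, 8: 288000 },
  standard: { 4: 384000, 6: 576000, 8: 768000 },
};

// Fixed token cost per 8-second extension.
const EXTENSION_COST: Record<VeoModel, number> = {
  fast: 288000,
  standard: 768000,
};

// Total tokens for one initial video followed by a number of extensions.
function videoChainCost(
  model: VeoModel,
  initialDuration: Duration,
  extensions: number
): number {
  return VIDEO_COST[model][initialDuration] + extensions * EXTENSION_COST[model];
}

// An 8-second Veo Fast video with three extensions:
// 288,000 + 3 × 288,000 = 1,152,000 tokens.
const chainCost = videoChainCost("fast", 8, 3);
```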

For more information about AI capabilities and pricing, please refer to the Cosmic pricing page or contact Cosmic support.