AI
Learn about AI capabilities in the API; generate text, images, and videos using Cosmic AI.
Cosmic provides AI-powered text, image, and video generation capabilities through the API and SDK.
Base URL
Use the following base URL for all AI text, image, and video generation requests.
https://workers.cosmicjs.com
Generate Text
This endpoint enables you to generate text content using AI models.
Send the request payload as JSON with the Content-Type: application/json header set. A Bucket write key is required for all AI operations.
Required parameters
You must provide either a prompt or messages parameter.
- prompt (string): A text prompt for the AI to generate content from. Use this for simple, single-prompt text generation.
- messages (array): An array of message objects for chat-based models. Each message object should have a role (either "user" or "assistant") and content (the message text).
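As a minimal sketch of the rule above, a payload can be checked before it is sent. The names TextRequest and assertValidTextRequest are illustrative and are not part of the Cosmic SDK; the check assumes that supplying at least one of the two fields is sufficient.

```typescript
// Hypothetical types for illustration; not exported by @cosmicjs/sdk.
type ChatMessage = { role: "user" | "assistant"; content: string };

type TextRequest = {
  prompt?: string;
  messages?: ChatMessage[];
  max_tokens?: number;
};

// Throws if neither prompt nor messages is usable; returns the payload otherwise.
function assertValidTextRequest(req: TextRequest): TextRequest {
  const hasPrompt = typeof req.prompt === "string" && req.prompt.trim().length > 0;
  const hasMessages = Array.isArray(req.messages) && req.messages.length > 0;
  if (!hasPrompt && !hasMessages) {
    throw new Error("Provide either a prompt or a messages array");
  }
  return req;
}

// A chat-style payload using the messages format.
const chatRequest = assertValidTextRequest({
  messages: [{ role: "user", content: "Suggest a name for a coffee mug brand" }],
  max_tokens: 200,
});
```

The validated object has the same shape as the argument passed to cosmic.ai.generateText in the request examples.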
Optional parameters
- model (string): The AI model to use for text generation. Options include Claude models (e.g., claude-opus-4-5-20251101), Gemini models (e.g., gemini-3-pro-preview), and OpenAI models (e.g., gpt-5). Defaults to claude-opus-4-5-20251101. See the Available Models section for the full list.
- max_tokens (number): The maximum number of tokens to generate in the response. Higher values allow for longer responses but may increase processing time.
- media_url (string): URL of a file to analyze. Can be any file type available in your Bucket, including images, PDFs, Excel spreadsheets, Word documents, and more. The AI model will be able to analyze the content when generating text. Can be used with either prompt or messages.
- stream (boolean): When set to true, enables streaming so that the response is delivered in real time as it is generated, rather than after the complete response is ready. Default is false.
Request Examples
import { createBucketClient } from '@cosmicjs/sdk'
const cosmic = createBucketClient({
bucketSlug: 'BUCKET_SLUG',
readKey: 'BUCKET_READ_KEY',
writeKey: 'BUCKET_WRITE_KEY'
})
// Using a simple prompt
const textResponse = await cosmic.ai.generateText({
prompt: 'Write a product description for a coffee mug',
max_tokens: 500
})
console.log(textResponse.text)
console.log(textResponse.usage)
Media Analysis Examples
import { createBucketClient } from '@cosmicjs/sdk'
const cosmic = createBucketClient({
bucketSlug: 'BUCKET_SLUG',
readKey: 'BUCKET_READ_KEY',
writeKey: 'BUCKET_WRITE_KEY'
})
const imageAnalysis = await cosmic.ai.generateText({
prompt: 'Describe this image in detail and suggest a caption for social media',
media_url: 'https://cdn.cosmicjs.com/mountain-landscape.jpg',
max_tokens: 500
})
console.log(imageAnalysis.text)
console.log(imageAnalysis.usage)
cURL Examples
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
-d '{"prompt":"Write a product description for a coffee mug","max_tokens":500}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
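The same request can be made without the SDK or cURL. The sketch below builds the URL, headers, and body from the cURL example; buildTextRequest is a hypothetical helper, and the bucket slug and key values are placeholders.

```typescript
// Builds the URL and fetch options for the text endpoint used in the cURL example.
// buildTextRequest is a hypothetical helper, not part of the Cosmic SDK.
function buildTextRequest(
  bucketSlug: string,
  writeKey: string,
  payload: { prompt: string; max_tokens?: number }
) {
  return {
    url: `https://workers.cosmicjs.com/v3/buckets/${encodeURIComponent(bucketSlug)}/ai/text`,
    init: {
      method: "POST", // curl -d sends a POST request
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${writeKey}`,
      },
      body: JSON.stringify(payload),
    },
  };
}

const textRequest = buildTextRequest("my-bucket", "my-write-key", {
  prompt: "Write a product description for a coffee mug",
  max_tokens: 500,
});

// To send it (requires network access and real credentials):
// const response = await fetch(textRequest.url, textRequest.init);
// const data = await response.json();
```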
Response Examples
{
"text": "Introducing our Artisan Ceramic Coffee Mug – the perfect companion for your daily brew. Crafted with care from high-quality ceramic, this elegant mug retains heat longer, ensuring your coffee stays at the ideal temperature. The ergonomic handle provides a comfortable grip, while the smooth, glazed interior prevents staining and makes cleaning effortless. Available in a range of sophisticated colors to match any kitchen aesthetic, this 12oz mug strikes the perfect balance between style and functionality.",
"usage": {
"input_tokens": 10,
"output_tokens": 89
}
}
Streaming Capabilities
Text generation supports real-time streaming responses, allowing you to receive and display content as it's being generated.
Using with the SDK
import { TextStreamingResponse } from '@cosmicjs/sdk';
// Enable streaming with the stream: true parameter
const result = await cosmic.ai.generateText({
prompt: 'Tell me about coffee mugs',
// or use messages array format
max_tokens: 500,
stream: true // Enable streaming
});
// Cast the result to TextStreamingResponse
const stream = result as TextStreamingResponse;
// Option 1: Event-based approach
let fullResponse = '';
stream.on('text', (text) => {
fullResponse += text;
process.stdout.write(text); // Display text as it arrives
});
stream.on('usage', (usage) => console.log('Usage:', usage));
stream.on('end', (data) => console.log('Complete:', fullResponse));
stream.on('error', (error) => console.error('Error:', error));
// Option 2: For-await loop approach (use instead of the event listeners above)
async function processStream() {
  let fullResponse = '';
  try {
    for await (const chunk of stream) {
      if (chunk.text) {
        fullResponse += chunk.text;
        process.stdout.write(chunk.text);
      }
    }
    console.log('\nComplete text:', fullResponse);
  } catch (error) {
    console.error('Error:', error);
  }
}
await processStream();
Using the simplified stream method
// Simplified streaming method
const stream = await cosmic.ai.stream({
prompt: 'Tell me about coffee mugs',
max_tokens: 500,
});
// Process stream using events or for-await loop as shown above
Using cURL for streaming
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}" \
-d '{"prompt":"Tell me about coffee mugs","stream":true}' \
--no-buffer
The TextStreamingResponse supports two usage patterns:
- Event-based: Extends EventEmitter with these events:
  - text: New text fragments
  - usage: Token usage information
  - end: Final data when the stream completes
  - error: Stream errors
- AsyncIterator: For for-await loops, with chunk objects containing:
  - text: Text fragments
  - usage: Token usage information
  - end: Set to true for the final chunk
  - error: Error information
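The chunk fields listed above can be folded into a single result. This is a sketch under stated assumptions: StreamChunk mirrors the fields described here rather than the SDK's actual type definitions, and mockStream stands in for a real streaming response so the example runs without network access.

```typescript
type Usage = { input_tokens: number; output_tokens: number };

// Chunk shape as described above; the exact SDK types may differ.
type StreamChunk = {
  text?: string;
  usage?: Usage;
  end?: boolean;
  error?: unknown;
};

// Consumes any async iterable of chunks and returns the full text plus usage.
async function collectStream(stream: AsyncIterable<StreamChunk>) {
  let text = "";
  let usage: Usage | undefined;
  for await (const chunk of stream) {
    if (chunk.error) throw new Error(`Stream failed: ${String(chunk.error)}`);
    if (chunk.text) text += chunk.text;
    if (chunk.usage) usage = chunk.usage;
    if (chunk.end) break;
  }
  return { text, usage };
}

// Stand-in stream for local testing; a real stream would come from the SDK.
async function* mockStream(): AsyncGenerator<StreamChunk> {
  yield { text: "Coffee mugs " };
  yield { text: "keep drinks warm." };
  yield { usage: { input_tokens: 6, output_tokens: 7 } };
  yield { end: true };
}

const collected = collectStream(mockStream());
```

Because collectStream accepts any AsyncIterable, the same function could be handed a TextStreamingResponse, assuming its chunks match the fields above.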
Generate Image
This endpoint enables you to create AI-generated images based on text prompts.
Send the request payload as JSON with the Content-Type: application/json header set.
Required parameters
- prompt (string): A text description of the image you want to generate. More detailed prompts typically yield better results.
Optional parameters
- model (string): The image generation model to use. Options: gemini-3-pro-image-preview (default, recommended) or dall-e-3.
- size (string): The size of the generated image. For DALL-E 3: 1024x1024, 1792x1024, or 1024x1792. For Gemini 3 Pro Image: 1024x1024, 2048x2048, or 4096x4096. Defaults to 1024x1024.
- quality (string): The quality of the generated image. Options: standard or hd. Defaults to standard. HD quality provides more detailed images but costs more. (DALL-E 3 only)
- reference_images (array): Array of reference image URLs to provide context for image generation. The AI will analyze these images and use them to inform the generated image. Only supported with the gemini-3-pro-image-preview model.
- folder (string): Media folder to store the generated image in. The folder must already exist.
- alt_text (string): Alt text for the generated image.
- metadata (object): User-added JSON metadata for the generated image.
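Several of these options only apply to one model, so it can help to check a combination before sending it. The following sketch encodes the sizes and restrictions listed above; checkImageOptions and SUPPORTED_SIZES are hypothetical names, and how the API itself responds to an unsupported combination is not covered here.

```typescript
// Sizes listed in this section for each image model.
const SUPPORTED_SIZES: Record<string, readonly string[]> = {
  "dall-e-3": ["1024x1024", "1792x1024", "1024x1792"],
  "gemini-3-pro-image-preview": ["1024x1024", "2048x2048", "4096x4096"],
};

type ImageOptions = {
  model?: string;
  size?: string;
  quality?: "standard" | "hd";
  reference_images?: string[];
};

// Returns a list of problems; an empty list means the options look consistent.
function checkImageOptions(options: ImageOptions): string[] {
  const model = options.model ?? "gemini-3-pro-image-preview"; // documented default
  const size = options.size ?? "1024x1024"; // documented default
  const problems: string[] = [];
  const sizes = SUPPORTED_SIZES[model];
  if (!sizes) {
    problems.push(`Unknown model: ${model}`);
  } else if (!sizes.includes(size)) {
    problems.push(`Size ${size} is not listed for ${model}`);
  }
  if (options.quality && model !== "dall-e-3") {
    problems.push("quality is documented for DALL-E 3 only");
  }
  if (options.reference_images?.length && model !== "gemini-3-pro-image-preview") {
    problems.push("reference_images requires gemini-3-pro-image-preview");
  }
  return problems;
}

const validImageOptions = checkImageOptions({ model: "dall-e-3", size: "1792x1024", quality: "hd" });
const invalidImageOptions = checkImageOptions({ model: "dall-e-3", size: "4096x4096" });
```

Here validImageOptions is an empty list, while invalidImageOptions reports that 4096x4096 is not listed for dall-e-3.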
Request Examples
import { createBucketClient } from '@cosmicjs/sdk'
const cosmic = createBucketClient({
bucketSlug: 'BUCKET_SLUG',
readKey: 'BUCKET_READ_KEY',
writeKey: 'BUCKET_WRITE_KEY'
})
const imageResponse = await cosmic.ai.generateImage({
prompt: 'A serene mountain landscape at sunset',
folder: 'ai-generated',
alt_text: 'AI-generated mountain landscape at sunset',
metadata: {
caption: 'Beautiful mountain vista',
generated_by: 'Cosmic AI'
}
})
console.log(imageResponse.media.url)
console.log(imageResponse.media.imgix_url)
Response Example
{
"media": {
"id": "65f3a2c8853cca45f4c9fd96",
"name": "c20391e0-b8a4-11e6-8836-fbdfd6956b31-mountain-sunset.png",
"original_name": "mountain-sunset.png",
"size": 457307,
"folder": "ai-generated",
"type": "image/png",
"bucket": "5839c67f0d3201c114000004",
"created_at": "2024-03-14T15:34:05.054Z",
"url": "https://cdn.cosmicjs.com/c20391e0-b8a4-11e6-8836-fbdfd6956b31-mountain-sunset.png",
"imgix_url": "https://imgix.cosmicjs.com/c20391e0-b8a4-11e6-8836-fbdfd6956b31-mountain-sunset.png",
"alt_text": "AI-generated mountain landscape at sunset",
"metadata": {
"caption": "Beautiful mountain vista",
"generated_by": "Cosmic AI"
}
},
"revised_prompt": "A serene mountain landscape at sunset with golden light illuminating the peaks, reflecting in a calm lake below, and wispy clouds catching the last rays of sunlight."
}
Generate Video
This endpoint enables you to create AI-generated videos using Google's Veo 3.1 models with native audio generation.
Send the request payload as JSON with the Content-Type: application/json header set. Video generation is an asynchronous operation that typically takes 30-90 seconds with the default Veo 3.1 Fast model and 60-180 seconds with Veo 3.1 Standard.
Content Policy: Video generation will fail if the prompt contains references to real people (celebrities, public figures), copyrighted characters, or other policy-violating content. If you receive a "No videos generated" error, try rephrasing your prompt to avoid these elements.
Required parameters
- prompt (string): A detailed text description of the video you want to generate. More descriptive prompts yield better results.
Optional parameters
- model (string): The video generation model to use. Options: veo-3.1-fast-generate-preview (recommended, faster) or veo-3.1-generate-preview (premium quality). Defaults to veo-3.1-fast-generate-preview.
- duration (number): Video duration in seconds. Options: 4, 6, or 8. Defaults to 8.
- resolution (string): Video resolution. Options: 720p or 1080p. Defaults to 720p.
- reference_images (array): Array containing 1 reference image URL to use as the first frame for video generation. Veo uses image-to-video mode to animate from this starting frame, ensuring precise control over the initial appearance, composition, and style. Ideal for product showcases, character consistency, and brand-accurate animations. Maximum 1 image.
- folder (string): Media folder to store the generated video in. The folder must already exist.
- metadata (object): User-added JSON metadata for the generated video.
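Since a failed video request still costs waiting time, a quick local check of the documented values can be useful. The sketch below is illustrative only: checkVideoOptions is a hypothetical helper, and the allowed values are copied from the parameter list above.

```typescript
type VideoOptions = {
  prompt: string;
  duration?: number;
  resolution?: string;
  reference_images?: string[];
};

const ALLOWED_DURATIONS = [4, 6, 8];
const ALLOWED_RESOLUTIONS = ["720p", "1080p"];

// Returns a list of problems; an empty list means the options match the documented values.
function checkVideoOptions(options: VideoOptions): string[] {
  const problems: string[] = [];
  const duration = options.duration ?? 8; // documented default
  const resolution = options.resolution ?? "720p"; // documented default
  if (options.prompt.trim() === "") {
    problems.push("prompt is required");
  }
  if (!ALLOWED_DURATIONS.includes(duration)) {
    problems.push(`duration must be 4, 6, or 8 (got ${duration})`);
  }
  if (!ALLOWED_RESOLUTIONS.includes(resolution)) {
    problems.push(`resolution must be 720p or 1080p (got ${resolution})`);
  }
  if ((options.reference_images?.length ?? 0) > 1) {
    problems.push("only 1 reference image is supported");
  }
  return problems;
}

const videoProblems = checkVideoOptions({
  prompt: "A calico kitten playing with a ball of yarn",
  duration: 5,
  reference_images: ["https://cdn.cosmicjs.com/a.jpg", "https://cdn.cosmicjs.com/b.jpg"],
});
```

In this example videoProblems contains two entries: one for the unsupported duration and one for the second reference image.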
Request Examples
import { createBucketClient } from '@cosmicjs/sdk'
const cosmic = createBucketClient({
bucketSlug: 'BUCKET_SLUG',
readKey: 'BUCKET_READ_KEY',
writeKey: 'BUCKET_WRITE_KEY'
})
const videoResponse = await cosmic.ai.generateVideo({
prompt: 'A calico kitten playing with a ball of yarn in golden sunlight',
duration: 8,
resolution: '720p',
folder: 'ai-videos'
})
console.log(videoResponse.media.url)
console.log(videoResponse.usage)
Response Example
{
"media": {
"id": "65f8a3b2c4d5e6f7g8h9i0j1",
"name": "veo-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
"original_name": "veo-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
"size": 8450000,
"type": "video/mp4",
"bucket": "65a1b2c3d4e5f6g7h8i9j0k1",
"created_at": "2025-12-20T10:30:00.000Z",
"folder": "ai-videos",
"url": "https://cdn.cosmicjs.com/veo-xxx.mp4",
"imgix_url": "https://imgix.cosmicjs.com/veo-xxx.mp4",
"alt_text": "A calico kitten playing with a ball of yarn in golden sunlight",
"metadata": {
"duration": 8,
"resolution": "720p",
"generation_time_seconds": 45
}
},
"usage": {
"input_tokens": 288000,
"output_tokens": 0,
"total_tokens": 288000
},
"generation_time_seconds": 45
}
Available Models
Cosmic supports a variety of AI models from Anthropic, Google Gemini, and OpenAI for text, image, and video generation.
Text Generation Models
Anthropic Claude Models
| Model | API Model ID | Description |
|---|---|---|
| Claude Opus 4.5 | claude-opus-4-5-20251101 | Latest flagship model with improved performance |
| Claude Opus 4.1 | claude-opus-4-1-20250805 | Most powerful and capable model |
| Claude Opus 4 | claude-opus-4-20250514 | Previous flagship model |
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | High-performance model (recommended) |
| Claude Sonnet 4 | claude-sonnet-4-20250514 | High-performance model with exceptional reasoning |
| Claude Haiku 3.5 | claude-3-5-haiku-20241022 | Fastest model for simple tasks |
Google Gemini Models
| Model | API Model ID | Description |
|---|---|---|
| Gemini 3 Pro 🍌 | gemini-3-pro-preview | Google's most intelligent model with state-of-the-art reasoning, built for agentic workflows |
| Gemini 3 Pro Image 🍌 | gemini-3-pro-image-preview | Native image generation with contextual understanding, supports up to 4K images |
| Veo 3.1 Fast 📹 | veo-3.1-fast-generate-preview | Fast video generation with native audio (recommended for most use cases) |
| Veo 3.1 Standard 📹 | veo-3.1-generate-preview | Premium video generation with exceptional quality and cinematic realism |
OpenAI GPT Models
| Model | API Model ID | Description |
|---|---|---|
| GPT-5.2 | gpt-5.2 | Latest GPT-5 model with improved performance |
| GPT-5 | gpt-5 | OpenAI's most advanced multimodal model |
| GPT-5 Mini | gpt-5-mini | Cost-effective GPT-5 variant |
| GPT-5 Nano | gpt-5-nano | Ultra-fast and cost-effective for simple tasks |
| GPT-4.1 | gpt-4.1 | Enhanced coding and reasoning capabilities |
| GPT-4o | gpt-4o | Flagship multimodal model |
| o1 | o1 | Advanced reasoning model for complex problems |
| o3 | o3 | Latest reasoning model with improved performance |
Text Generation Model Selection
By default, Cosmic uses Claude Sonnet 4.5 for text generation. You can specify a different model by including the model parameter in your request:
// Using Claude
const response = await cosmic.ai.generateText({
prompt: 'Write a product description',
model: 'claude-opus-4-5-20251101', // Optional: specify model
max_tokens: 500
})
// Using Gemini for agentic workflows
const geminiResponse = await cosmic.ai.generateText({
prompt: 'Analyze this content and suggest improvements',
model: 'gemini-3-pro-preview', // Use Gemini 3 Pro
max_tokens: 500
})
Or using cURL:
# Using Claude
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
-d '{"prompt":"Write a product description","model":"claude-opus-4-5-20251101","max_tokens":500}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
# Using Gemini
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/text \
-d '{"prompt":"Analyze this content and suggest improvements","model":"gemini-3-pro-preview","max_tokens":500}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
Image Generation Models
Google Gemini Image Models
| Model | API Model ID | Description | Supported Sizes |
|---|---|---|---|
| Gemini 3 Pro Image 🍌 | gemini-3-pro-image-preview | Native image generation with contextual understanding (default, recommended) | 1024x1024, 2048x2048, 4096x4096 |
OpenAI DALL-E Models
| Model | API Model ID | Description |
|---|---|---|
| DALL-E 3 | dall-e-3 | OpenAI's latest image generation model |
Image Generation Model Selection
By default, Cosmic uses Gemini 3 Pro Image for image generation. You can explicitly specify the model by including the model parameter in your request.
Gemini 3 Pro Image supports reference images for contextual generation. You can provide URLs to existing images, and Gemini will analyze their style, composition, and content to inform the generated image. This is perfect for:
- Maintaining consistent visual styles across generated images
- Creating variations based on existing artwork
- Applying the aesthetic of one image to a new scene
- Combining elements from multiple reference images
// Using DALL-E 3
const dalleResponse = await cosmic.ai.generateImage({
prompt: 'A futuristic cityscape at night',
model: 'dall-e-3', // Required for DALL-E; the default model is gemini-3-pro-image-preview
size: '1792x1024',
quality: 'hd'
})
// Using Gemini 3 Pro Image for 4K generation
const geminiResponse = await cosmic.ai.generateImage({
prompt: 'A futuristic cityscape at night with neon lights',
model: 'gemini-3-pro-image-preview', // Use Gemini for higher resolution
size: '4096x4096' // 4K image support (the quality option applies to DALL-E 3 only)
})
// Using Gemini with reference images for style consistency
const styledResponse = await cosmic.ai.generateImage({
prompt: 'A mountain landscape in the same artistic style',
model: 'gemini-3-pro-image-preview',
reference_images: [
'https://cdn.cosmicjs.com/your-style-reference.jpg'
],
size: '2048x2048'
})
Or using cURL:
# Using DALL-E 3
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/image \
-d '{"prompt":"A futuristic cityscape at night","model":"dall-e-3","size":"1792x1024","quality":"hd"}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
# Using Gemini 3 Pro Image
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/image \
-d '{"prompt":"A futuristic cityscape at night","model":"gemini-3-pro-image-preview","size":"4096x4096"}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
Video Generation Models
Google Veo Models
| Model | API Model ID | Description |
|---|---|---|
| Veo 3.1 Fast | veo-3.1-fast-generate-preview | Faster generation, excellent quality (30-90s) |
| Veo 3.1 Standard | veo-3.1-generate-preview | Premium quality, cinematic realism (60-180s) |
Video Generation Model Selection
By default, Cosmic uses Veo 3.1 Fast for video generation. Both models support:
- Native Audio: Automatically generated audio that matches the scene
- Flexible Durations: 4, 6, or 8 seconds
- High Resolution: 720p or 1080p
- Image-to-Video Mode: Use 1 reference image as the starting frame
Veo 3.1 Fast is recommended for most use cases, offering excellent quality with faster generation times. Use Veo 3.1 Standard for premium marketing content and cinematic quality.
// Using Veo Fast (recommended)
const fastVideo = await cosmic.ai.generateVideo({
prompt: 'A peaceful zen garden with koi fish swimming',
model: 'veo-3.1-fast-generate-preview',
duration: 8,
resolution: '720p'
})
// Using Veo Standard for premium quality
const premiumVideo = await cosmic.ai.generateVideo({
prompt: 'Cinematic shot of city skyline at golden hour',
model: 'veo-3.1-generate-preview',
duration: 8,
resolution: '1080p'
})
// Using reference image as starting frame (image-to-video)
const styledVideo = await cosmic.ai.generateVideo({
prompt: 'Product rotates smoothly revealing all angles',
model: 'veo-3.1-fast-generate-preview',
duration: 6,
reference_images: [
'https://cdn.cosmicjs.com/product-hero.jpg'
]
})
Or using cURL:
# Using Veo Fast
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/video \
-d '{"prompt":"A peaceful zen garden with koi fish","model":"veo-3.1-fast-generate-preview","duration":8}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
# Using Veo Standard for premium quality
curl https://workers.cosmicjs.com/v3/buckets/${BUCKET_SLUG}/ai/video \
-d '{"prompt":"Cinematic city skyline at golden hour","model":"veo-3.1-generate-preview","duration":8,"resolution":"1080p"}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${BUCKET_WRITE_KEY}"
Using Reference Images with Video Generation (Image-to-Video Mode)
Veo 3.1 supports image-to-video mode, where you provide 1 reference image that becomes the first frame of your video. This ensures precise control over the starting appearance, composition, and style.
What Image-to-Video Does:
- Precise Starting Frame: Your reference image becomes the exact first frame of the video
- Product Accuracy: Start with your exact product photo and animate from there
- Character Consistency: Begin with a specific character pose or appearance
- Brand Control: Ensure videos start with approved brand imagery
Best Practices:
// Product showcase starting from hero image
const productVideo = await cosmic.ai.generateVideo({
prompt: 'Product rotates smoothly, revealing all angles with studio lighting',
reference_images: [
'https://cdn.cosmicjs.com/product-hero-front.jpg'
],
duration: 8
})
// Character animation from specific pose
const characterVideo = await cosmic.ai.generateVideo({
prompt: 'Character waves and smiles at camera with friendly expression',
reference_images: [
'https://cdn.cosmicjs.com/character-neutral-pose.jpg'
],
duration: 6
})
// Brand scene animation
const brandVideo = await cosmic.ai.generateVideo({
prompt: 'Camera slowly zooms in while maintaining brand aesthetic',
reference_images: [
'https://cdn.cosmicjs.com/brand-scene-wide.jpg'
],
duration: 8
})
Guidelines:
- Use high-resolution images (1024x1024+ recommended)
- Only 1 reference image supported (becomes the first frame)
- Ensure URL is publicly accessible
- Your prompt should describe the motion/animation starting from that image
- Think of it as "how should this image come to life?"
Advanced Veo 3.1 Capabilities
According to Google's Veo 3.1 documentation, Veo 3.1 supports several advanced features:
✅ Image-to-Video Mode
Use 1 reference image as the first frame of your video. Veo animates from this starting point with precise control over initial appearance. See examples above.
✅ Video Extension
Extend previously generated videos to create longer content. See the Extend Video section below for full documentation.
Extend Video
This endpoint enables you to extend a previously generated Veo video by creating a new 8-second clip that continues from the final second of the original video.
Video extension creates a new video that maintains visual continuity with the original. Only videos generated with Veo can be extended (they must have a veo_file_uri in their metadata). Extension is always 8 seconds at 720p resolution.
Required parameters
- prompt (string): A text description of how to continue the video. Should describe the motion/action that follows the original video.
- media_id (string): The ID of the original Veo-generated video to extend. The video must have been generated using the Generate Video endpoint.
Optional parameters
- model (string): The video generation model to use. Options: veo-3.1-fast-generate-preview (recommended) or veo-3.1-generate-preview. Defaults to veo-3.1-fast-generate-preview.
- folder (string): Media folder to store the extended video in. The folder must already exist.
- metadata (object): User-added JSON metadata for the extended video.
Limitations
- Duration: Extensions are always 8 seconds
- Resolution: Extensions are always 720p (even if the original was 1080p)
- Source: Only Veo-generated videos can be extended (must have veo_file_uri in metadata)
- Chaining: Extended videos can also be extended, enabling creation of longer narratives
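The source requirement above can be checked locally before calling the extension endpoint. This is a minimal sketch: canExtend and MediaLike are hypothetical names, and the check relies only on the documented veo_file_uri metadata field.

```typescript
// Minimal shape of a media object for this check; real objects carry more fields.
type MediaLike = { id: string; metadata?: Record<string, unknown> | null };

// A video is treated as extendable when its metadata carries a non-empty veo_file_uri.
function canExtend(media: MediaLike): boolean {
  const uri = media.metadata?.["veo_file_uri"];
  return typeof uri === "string" && uri.length > 0;
}

const veoVideo: MediaLike = {
  id: "video-1",
  metadata: { duration: 8, veo_file_uri: "https://example.com/files/abc" }, // placeholder URI
};
const uploadedVideo: MediaLike = { id: "video-2", metadata: { duration: 8 } };

const extendable = [veoVideo, uploadedVideo].filter(canExtend).map((m) => m.id);
```

Only video-1 passes the filter, since the second object has no veo_file_uri in its metadata.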
Request Examples
import { createBucketClient } from '@cosmicjs/sdk'
const cosmic = createBucketClient({
bucketSlug: 'BUCKET_SLUG',
readKey: 'BUCKET_READ_KEY',
writeKey: 'BUCKET_WRITE_KEY'
})
// First, generate an initial video
const initialVideo = await cosmic.ai.generateVideo({
prompt: 'A calico kitten sitting peacefully in golden sunlight',
duration: 8,
resolution: '720p'
})
console.log('Initial video ID:', initialVideo.media.id)
// Then extend it with a continuation
const extendedVideo = await cosmic.ai.extendVideo({
media_id: initialVideo.media.id,
prompt: 'The kitten stands up and walks away into the garden'
})
console.log('Extended video URL:', extendedVideo.media.url)
console.log('Source video ID:', extendedVideo.source_media_id)
Response Example
{
"media": {
"id": "65f8b4c3d5e6f7g8h9i0j1k2",
"name": "veo-ext-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
"original_name": "veo-ext-a1b2c3d4-e5f6-g7h8-i9j0-k1l2m3n4o5p6.mp4",
"size": 8650000,
"type": "video/mp4",
"bucket": "65a1b2c3d4e5f6g7h8i9j0k1",
"created_at": "2025-12-20T10:35:00.000Z",
"folder": null,
"url": "https://cdn.cosmicjs.com/veo-ext-xxx.mp4",
"imgix_url": "https://imgix.cosmicjs.com/veo-ext-xxx.mp4",
"alt_text": "The scene continues with the character walking away",
"metadata": {
"duration": 8,
"resolution": "720p",
"generation_time_seconds": 52,
"is_extension": true,
"source_media_id": "65f8a3b2c4d5e6f7g8h9i0j1",
"veo_file_uri": "https://generativelanguage.googleapis.com/v1beta/files/..."
}
},
"usage": {
"input_tokens": 288000,
"output_tokens": 0,
"total_tokens": 288000
},
"generation_time_seconds": 52,
"source_media_id": "65f8a3b2c4d5e6f7g8h9i0j1",
"is_extension": true
}
Video Extension Use Cases
Creating Longer Narratives
Chain multiple 8-second extensions to create minute-long videos or longer stories:
// Build a 32-second video (4 segments)
let currentVideo = await cosmic.ai.generateVideo({
  prompt: 'Opening scene: sunrise over mountains',
  duration: 8
})
const segments = [currentVideo]
// Each extension continues from the most recently generated segment
for (const continuation of [
  'Birds take flight from the trees below',
  'A hiker appears on the trail in the distance',
  'Close-up of the hiker reaching the summit'
]) {
  currentVideo = await cosmic.ai.extendVideo({
    media_id: currentVideo.media.id,
    prompt: continuation
  })
  segments.push(currentVideo)
}
console.log(`Created ${segments.length} connected segments (${segments.length * 8}s total)`)
Extending Product Showcases
Continue product demonstrations with smooth transitions:
const productIntro = await cosmic.ai.generateVideo({
prompt: 'Product slowly rotates on white background',
reference_images: ['https://cdn.cosmicjs.com/product.jpg'],
duration: 8
})
const productDetails = await cosmic.ai.extendVideo({
media_id: productIntro.media.id,
prompt: 'Camera zooms in to show product texture and details'
})
Building Background Loops
Create longer ambient footage for websites or presentations:
const ambientBase = await cosmic.ai.generateVideo({
prompt: 'Gentle rain falling on window with city lights blurred behind',
duration: 8
})
const ambientExtended = await cosmic.ai.extendVideo({
media_id: ambientBase.media.id,
prompt: 'Rain continues with occasional lightning flash in the distance'
})
Usage and Limitations
AI capabilities in Cosmic are subject to the following considerations:
- Rate Limits: AI generation requests may be subject to rate limiting based on your plan.
- Token Usage: Text, image, and video generation consume tokens. Media generation (images and videos) has higher token costs due to computational requirements.
- Media Storage: Generated images and videos are stored in your Cosmic Bucket's media library and count toward your storage quota.
- Video Generation Time: Video generation is asynchronous and typically takes 30-90 seconds (Fast) or 60-180 seconds (Standard).
- Content Policy: Generated content must comply with Cosmic's terms of service and content policies. If a prompt violates content policies (e.g., celebrity likeness, copyrighted material, or safety concerns), video generation fails with a generic "No videos generated" error rather than a policy-specific message. If you receive this error, try rephrasing your prompt to avoid references to real people, brands, or copyrighted characters.
- Regional Restrictions: In EU, UK, CH, MENA regions, person generation has limitations (see Google's documentation for details).
- Watermarking: Videos created by Veo are watermarked using SynthID for AI-generated content identification.
Pricing Overview
All AI features consume tokens from your monthly allocation or token packs. Token costs vary by model complexity and media type.
Text Generation Pricing
Text generation uses tiered pricing based on model capability. More powerful models cost more tokens to use, reflecting their higher computational requirements and superior performance.
How Token Multipliers Work: The number of tokens a request actually uses is multiplied by the model's tier multiplier to determine how many tokens are deducted from your balance:
| Tier | Multiplier | Models | Token Deduction |
|---|---|---|---|
| Budget | 1.0x | GPT-5 Nano, GPT-4.1 Nano, GPT-4o Mini, GPT-5 Mini, GPT-4.1 Mini, Claude Haiku 3, Claude Haiku 3.5 | 1,000 actual tokens = 1,000 deducted |
| Standard | 2.0x | GPT-5, GPT-5.2, GPT-4.1, GPT-4o, o1-mini, o3, o3-mini, o4-mini, Claude Sonnet 4, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro | 1,000 actual tokens = 2,000 deducted |
| Premium | 4.0x | Claude Opus 4, Claude Opus 4.1, o1, o3-pro | 1,000 actual tokens = 4,000 deducted |
Example: Using Claude Sonnet 4.5 (Standard tier) with a 1,000 token response will deduct 2,000 tokens from your balance (1,000 × 2.0x).
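The multiplier arithmetic can be sketched as follows. The tier names and multipliers come from the table above; deductedTokens is a hypothetical helper, and the sketch applies the multiplier to whatever token count is passed in, without assuming how input and output tokens are combined.

```typescript
type Tier = "budget" | "standard" | "premium";

// Multipliers from the pricing table above.
const TIER_MULTIPLIER: Record<Tier, number> = {
  budget: 1,
  standard: 2,
  premium: 4,
};

// Tokens deducted from the balance for a given number of actual tokens.
function deductedTokens(actualTokens: number, tier: Tier): number {
  return actualTokens * TIER_MULTIPLIER[tier];
}

const standardDeduction = deductedTokens(1000, "standard"); // 1,000 x 2.0 = 2,000
const premiumDeduction = deductedTokens(1000, "premium"); // 1,000 x 4.0 = 4,000
```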
Media Generation Pricing
Images and videos use fixed token costs. All media generation costs are billed as output tokens.
| Feature | Token Cost (Output) |
|---|---|
| DALL-E 3 Image | 4,800 tokens |
| Gemini 1K/2K Image | 32,160 tokens |
| Gemini 4K Image | 57,600 tokens |
| Veo Fast Video (4s) | 144,000 tokens |
| Veo Fast Video (6s) | 216,000 tokens |
| Veo Fast Video (8s) | 288,000 tokens |
| Veo Fast Extension | 288,000 tokens |
| Veo Standard Video (4s) | 384,000 tokens |
| Veo Standard Video (6s) | 576,000 tokens |
| Veo Standard Video (8s) | 768,000 tokens |
| Veo Standard Extension | 768,000 tokens |
Note: Input tokens (your prompt) for media generation are minimal compared to the output cost. The token costs listed above represent the generation cost and are deducted from your output token allocation.
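For budgeting chained videos, the fixed costs above can be combined. The per-second rates in this sketch are inferred from the table (for example, 288,000 / 8 = 36,000 for Veo Fast) rather than stated by Cosmic, and the function names are hypothetical.

```typescript
// Per-second rates inferred from the table above; not an official pricing formula.
const VEO_TOKENS_PER_SECOND = { fast: 36000, standard: 96000 };
type VeoTier = keyof typeof VEO_TOKENS_PER_SECOND;

const EXTENSION_SECONDS = 8; // extensions are always 8 seconds

// Token cost of a single generated clip.
function videoCost(tier: VeoTier, seconds: 4 | 6 | 8): number {
  return VEO_TOKENS_PER_SECOND[tier] * seconds;
}

// Token cost of an initial clip plus a number of 8-second extensions.
function chainedVideoCost(tier: VeoTier, initialSeconds: 4 | 6 | 8, extensions: number): number {
  return videoCost(tier, initialSeconds) + extensions * videoCost(tier, EXTENSION_SECONDS);
}

// A 32-second Veo Fast narrative: one 8-second clip plus three extensions.
const fastNarrativeCost = chainedVideoCost("fast", 8, 3); // 4 x 288,000 = 1,152,000
const standardClipCost = videoCost("standard", 6); // 576,000
```

Both results match the table: each Veo Fast segment costs 288,000 tokens, and a 6-second Veo Standard video costs 576,000 tokens.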
For more information about AI capabilities and pricing, please refer to the Cosmic pricing page or contact Cosmic support.