SwiftyAI

Video

generateVideo starts a video job with the provider, polls until the job completes, and returns a VideoResponse containing the final video bytes (usually MP4).

Response field    Meaning
id                Provider job or result id
data              Final video bytes
mediaType         Usually MP4 unless the provider returns another type
status            Provider status when surfaced
model             Model name when returned

Generate Video

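// Read the API key from the environment; the force unwrap crashes if it is unset.
let videoModel = OpenAICompatibleProvider(
    baseURL: "https://api.openai.com/v1",
    apiKey: ProcessInfo.processInfo.environment["OPENAI_API_KEY"]!,
    model: "sora-video-model"
)

// Start the job, then poll every 5 seconds for up to 60 attempts.
let video = try await generateVideo(
    model: videoModel,
    prompt: "A simple product walkthrough of a Swift package documentation site.",
    options: VideoGenerationOptions(
        size: .landscape1536x1024,
        seconds: 8,
        pollInterval: .seconds(5),
        maxPollAttempts: 60
    )
)

// outputURL can be any writable file URL; this path is only an example.
let outputURL = URL(fileURLWithPath: "walkthrough.mp4")
try video.data.write(to: outputURL)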

VideoResponse includes the provider job id, final data, mediaType, optional model, and optional status.
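Because mediaType is usually, but not always, MP4, it can help to derive the output file extension from it instead of hard-coding .mp4. The following is a minimal sketch, assuming mediaType is a MIME-style string such as "video/mp4" (the exact type of the field is not stated here). fileExtension(forMediaType:) is a hypothetical helper for illustration and is not part of SwiftyAI.

```swift
import Foundation

/// Hypothetical helper (not part of SwiftyAI): maps a MIME-style media type
/// to a file extension, falling back to "mp4" for unknown values.
func fileExtension(forMediaType mediaType: String) -> String {
    // Drop parameters such as "; codecs=..." and normalize case.
    let base = mediaType
        .split(separator: ";")
        .first
        .map { $0.trimmingCharacters(in: .whitespaces).lowercased() } ?? ""

    switch base {
    case "video/mp4": return "mp4"
    case "video/quicktime": return "mov"
    case "video/webm": return "webm"
    default: return "mp4" // Assumption: MP4 is the common case.
    }
}
```

The result can be appended to a base file URL with URL's appendingPathExtension(_:) before calling video.data.write(to:).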

Gemini Video

Gemini video generation uses the same function with a GeminiProvider.

let gemini = GeminiProvider(
    apiKey: ProcessInfo.processInfo.environment["GEMINI_API_KEY"]!,
    model: "veo-video-model"
)

let video = try await generateVideo(
    model: gemini,
    prompt: "A calm onboarding animation for a task app.",
    options: VideoGenerationOptions(
        aspectRatio: "16:9",
        // A negative prompt names what to avoid, so it needs no leading "No".
        negativePrompt: "text overlays"
    )
)

The current Gemini video encoder sends aspectRatio, negativePrompt, and seed. The seconds option is used by OpenAI-compatible video generation but is not sent to Gemini today. Video model names and limits vary by provider and account, so keep them in configuration rather than hard-coding them deep in UI code.
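One way to keep model names in configuration is to read them from the environment in a single place. This is only a sketch: VideoModelConfig, the variable names OPENAI_VIDEO_MODEL and GEMINI_VIDEO_MODEL, and the fallback values are all hypothetical and not part of SwiftyAI.

```swift
import Foundation

/// Hypothetical configuration type (not part of SwiftyAI) that keeps video
/// model names in one place instead of scattering them through UI code.
struct VideoModelConfig {
    let openAIModel: String
    let geminiModel: String

    /// Reads model names from an environment-style dictionary.
    /// The variable names and fallback values are placeholders.
    init(environment: [String: String] = ProcessInfo.processInfo.environment) {
        openAIModel = environment["OPENAI_VIDEO_MODEL"] ?? "sora-video-model"
        geminiModel = environment["GEMINI_VIDEO_MODEL"] ?? "veo-video-model"
    }
}
```

A value such as VideoModelConfig().geminiModel can then be passed as the model: argument when constructing the provider, so changing models does not require touching call sites.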

Polling Controls

VideoGenerationOptions includes pollInterval and maxPollAttempts. Increase maxPollAttempts for long jobs. Decrease pollInterval only if the provider allows frequent polling.
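To choose these two values together, it helps to think of them as a time budget. The sketch below assumes a fixed-interval poller with no backoff, which is an assumption about SwiftyAI's behavior rather than something stated here, and it ignores the latency of each status request. Both functions are hypothetical helpers.

```swift
import Foundation

/// Approximate upper bound, in seconds, on how long a fixed-interval poller
/// waits before giving up. Hypothetical helper; ignores request latency.
func maxWaitSeconds(pollInterval: TimeInterval, maxPollAttempts: Int) -> TimeInterval {
    pollInterval * TimeInterval(max(0, maxPollAttempts))
}

/// Number of attempts needed to cover a desired timeout at a given interval,
/// rounded up so the budget is never shorter than the timeout.
func attemptsNeeded(forTimeout timeout: TimeInterval, pollInterval: TimeInterval) -> Int {
    precondition(pollInterval > 0, "pollInterval must be positive")
    return max(0, Int((timeout / pollInterval).rounded(.up)))
}
```

Under these assumptions, a 5 second interval with 60 attempts gives a budget of about 300 seconds, and covering a 10 minute job at the same interval would take 120 attempts.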

Option               Use
seconds              Requested duration for OpenAI-compatible video endpoints
size / aspectRatio   Output shape for the target surface
negativePrompt       Things the model should avoid
pollInterval         Delay between provider status checks
maxPollAttempts      Upper bound for long-running jobs

Related docs

Use multimodal input when video is part of a prompt rather than the generated output.