Generation Options

GenerationOptions is the request-level configuration object for text, streaming, structured output, and tool loops.

// requestID is assumed to be a String supplied by the caller.
let options = GenerationOptions(
    system: "Answer as a careful Swift reviewer.",
    temperature: 0.2,
    topP: 0.9,
    maxTokens: 600,
    stopSequences: ["</answer>"],
    headers: ["X-Request-ID": requestID],
    retryPolicy: RetryPolicy(maxAttempts: 3)
)

Option area	Fields
Prompt behavior	`system`, `stopSequences`
Sampling	`temperature`, `topP`, `topK`, `seed`
Length and repetition	`maxTokens`, `presencePenalty`, `frequencyPenalty`
Provider transport	`headers`, `retryPolicy`
Tool-step cache hints	`promptCaching`

onChunk and onFinish are callback parameters on streaming helpers such as streamText, not fields on GenerationOptions.

Prompt And Sampling

Use these fields to shape the model response:

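system: instruction sent as the system message when the provider supports one.
temperature: sampling randomness. Lower values make output more deterministic.
topP: nucleus sampling threshold. The model samples only from the smallest set of tokens whose cumulative probability reaches this value.
topK: limits sampling to the K most likely tokens, for providers that expose it.
maxTokens: upper limit on generated output tokens.
seed: reproducibility hint for providers that support seeded generation.
presencePenalty: penalizes tokens that have already appeared at all, which nudges the model toward new topics.
frequencyPenalty: penalizes tokens in proportion to how often they have already appeared, which reduces verbatim repetition.
stopSequences: strings that end generation as soon as the model produces one of them.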

let response = try await generateText(
    model: model,
    prompt: "Suggest five concise commit messages.",
    options: GenerationOptions(
        system: "Return only a numbered list.",
        temperature: 0.4,
        maxTokens: 180,
        stopSequences: ["6."]
    )
)

Provider support varies: a provider ignores any control for which it exposes no matching parameter.
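The "ignored when unsupported" behavior can be pictured with a small standalone sketch. Everything in it is hypothetical: SamplingControls, Control, FieldValue, and encodeSampling are illustration-only names, and the wire names top_p and top_k are assumptions, not SwiftyAI's actual encoder.

```swift
// Hypothetical stand-ins for illustration; these are not SwiftyAI types.
struct SamplingControls {
    var temperature: Double?
    var topP: Double?
    var topK: Int?
    var seed: Int?
}

enum Control: Hashable {
    case temperature, topP, topK, seed
}

enum FieldValue: Equatable {
    case double(Double)
    case int(Int)
}

// Builds a request body that contains only the controls the provider exposes.
// Controls that are set but unsupported are dropped without raising an error.
func encodeSampling(_ controls: SamplingControls, supported: Set<Control>) -> [String: FieldValue] {
    var body: [String: FieldValue] = [:]
    if supported.contains(.temperature), let value = controls.temperature {
        body["temperature"] = .double(value)
    }
    if supported.contains(.topP), let value = controls.topP {
        body["top_p"] = .double(value)   // assumed wire name
    }
    if supported.contains(.topK), let value = controls.topK {
        body["top_k"] = .int(value)      // assumed wire name
    }
    if supported.contains(.seed), let value = controls.seed {
        body["seed"] = .int(value)
    }
    return body
}

// A provider that exposes only temperature and topP ignores topK and seed.
let requested = SamplingControls(temperature: 0.4, topP: 0.9, topK: 40, seed: 7)
let encoded = encodeSampling(requested, supported: [.temperature, .topP])
print(encoded.keys.sorted())
```

Because dropped fields produce no error, it is worth checking a provider's documentation before relying on a control such as seed or topK.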

Headers And Retries

Per-request headers are useful for request ids, beta flags, or provider-specific routing. Retries are opt-in.

let response = try await generateText(
    model: model,
    prompt: "Summarize this incident report.",
    options: GenerationOptions(
        headers: [
            "X-Request-ID": requestID,
            "X-Feature": "incident-summary"
        ],
        retryPolicy: RetryPolicy(
            maxAttempts: 4,
            baseDelay: .milliseconds(300),
            retryableStatusCodes: [429, 500, 502, 503, 504]
        )
    )
)

RetryPolicy.none is the default and means one attempt.
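To make the three retry fields concrete, here is a minimal standalone sketch of how such a policy could be evaluated. SketchRetryPolicy, shouldRetry, and delayMilliseconds are hypothetical names, and the doubling backoff schedule is an assumption; this page does not say how SwiftyAI spaces its retries.

```swift
// Hypothetical mirror of the fields used above; not the SwiftyAI RetryPolicy type.
struct SketchRetryPolicy {
    var maxAttempts: Int
    var baseDelayMilliseconds: Int
    var retryableStatusCodes: Set<Int>
}

// True when another attempt is allowed after `attempt` attempts,
// the last of which ended with HTTP status `status`.
func shouldRetry(afterAttempt attempt: Int, status: Int, policy: SketchRetryPolicy) -> Bool {
    attempt < policy.maxAttempts && policy.retryableStatusCodes.contains(status)
}

// Assumed schedule: the delay doubles with each retry.
// With a 300 ms base, retries 1, 2, 3 wait 300, 600, 1200 ms.
func delayMilliseconds(beforeRetry retryNumber: Int, policy: SketchRetryPolicy) -> Int {
    precondition(retryNumber >= 1, "retry numbers start at 1")
    return policy.baseDelayMilliseconds << (retryNumber - 1)
}

let policy = SketchRetryPolicy(
    maxAttempts: 4,
    baseDelayMilliseconds: 300,
    retryableStatusCodes: [429, 500, 502, 503, 504]
)

// A one-attempt policy never retries, which is how "one attempt" reads here.
let singleAttempt = SketchRetryPolicy(
    maxAttempts: 1,
    baseDelayMilliseconds: 0,
    retryableStatusCodes: []
)
```

Under these assumptions, a 400 response is never retried, and a 503 is retried at most three times before the fourth attempt's result is returned.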

Prompt Caching

PromptCachingOptions exists on GenerationOptions, but current provider support is narrow. Today the OpenAI-compatible tool-calling step encoder forwards cacheKey and retention as prompt_cache_key and prompt_cache_retention. Plain generateText and streamText calls and the Anthropic and Gemini providers do not currently read these fields, and cachedContent is not currently forwarded by any of them.

let options = GenerationOptions(
    system: "Use the product manual as reference.",
    promptCaching: PromptCachingOptions(
        cacheKey: "manual-v4",
        retention: "ephemeral",
        cachedContent: largePromptPrefix
    )
)

Use these fields only when you control or have verified the target provider path. For general text and streaming calls, treat prompt caching as not implemented until the provider you use explicitly supports it.

PromptCachingOptions has three fields: cacheKey, retention, and cachedContent, each an optional String. SwiftyAI passes them through to the provider layer, and the provider decides whether each one maps to its API.
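The field mapping described in this section can be sketched in isolation. SketchPromptCaching and openAICompatibleCacheFields are hypothetical names that only mirror the described behavior; the real encoder may be structured differently.

```swift
// Hypothetical mirror of the three optional String fields; not the SwiftyAI type.
struct SketchPromptCaching {
    var cacheKey: String?
    var retention: String?
    var cachedContent: String?
}

// Maps cache hints onto the two request fields named in this section.
// Unset fields are omitted, and cachedContent is not forwarded at all.
func openAICompatibleCacheFields(_ caching: SketchPromptCaching?) -> [String: String] {
    var fields: [String: String] = [:]
    if let key = caching?.cacheKey {
        fields["prompt_cache_key"] = key
    }
    if let retention = caching?.retention {
        fields["prompt_cache_retention"] = retention
    }
    return fields
}

let hints = SketchPromptCaching(
    cacheKey: "manual-v4",
    retention: "ephemeral",
    cachedContent: "large prompt prefix"
)
let cacheFields = openAICompatibleCacheFields(hints)
print(cacheFields.keys.sorted())
```

Passing nil, or a value with every field unset, yields an empty dictionary, so a request without cache hints is unchanged.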

Defaults With Middleware

If one model should always receive the same defaults, wrap it instead of repeating options at every call.

let wrapped = wrapLanguageModel(
    model,
    middleware: [
        defaultSettingsMiddleware(
            system: "Answer in the product team's voice.",
            temperature: 0.3,
            maxTokens: 400
        )
    ]
)
 
let response = try await generateText(
    model: wrapped,
    prompt: "Explain workspace sharing."
)

Per-request values override middleware defaults.
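One way to picture that precedence is a field-by-field merge in which a request value wins whenever it is set. The sketch below is an assumption about the semantics, not SwiftyAI's implementation; SketchSettings and resolve are hypothetical names.

```swift
// Hypothetical settings type for illustration; not a SwiftyAI type.
struct SketchSettings: Equatable {
    var system: String?
    var temperature: Double?
    var maxTokens: Int?
}

// Assumed precedence: a value set on the request wins;
// otherwise the middleware default fills the gap.
func resolve(request: SketchSettings, defaults: SketchSettings) -> SketchSettings {
    SketchSettings(
        system: request.system ?? defaults.system,
        temperature: request.temperature ?? defaults.temperature,
        maxTokens: request.maxTokens ?? defaults.maxTokens
    )
}

let middlewareDefaults = SketchSettings(
    system: "Answer in the product team's voice.",
    temperature: 0.3,
    maxTokens: 400
)

// The request sets only temperature, so system and maxTokens come from the defaults.
let perRequest = SketchSettings(temperature: 0.7)
let effective = resolve(request: perRequest, defaults: middlewareDefaults)
print(effective)
```

Under this reading, a field left unset on the request cannot clear a default; whether the library offers a way to do that is not covered here.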