Middleware

Middleware lets you wrap a model with reusable request and response behavior. It is useful for defaults, cleanup, simulated streaming, private policy checks, and small provider adaptations that should not live in every call site.

Language Middleware

wrapLanguageModel composes LanguageModelMiddleware around any AIModel. Each middleware receives the request and a continuation (next), so it can adjust a copy of the request before the provider runs or transform the AIResponse after the provider returns.

import SwiftyAI
 
let wrapped = wrapLanguageModel(
    model,
    middleware: [
        defaultSettingsMiddleware(
            system: "Answer with concise Swift examples.",
            temperature: 0.2,
            maxTokens: 500
        ),
        extractReasoningMiddleware(),
        extractJsonMiddleware(onFailure: .leaveUnchanged)
    ]
)
 
let response = try await generateText(
    model: wrapped,
    prompt: "Return a JSON object with a title and summary."
)

Middleware	Use it for
`defaultSettingsMiddleware`	Applying default system prompts, sampling, headers, retries, and prompt-caching values while letting explicit request options win
`extractJsonMiddleware`	Cleaning model output when JSON is wrapped in Markdown fences or prose
`extractReasoningMiddleware`	Removing hidden reasoning tags before app code consumes the final text
Custom `LanguageModelMiddleware`	Auditing, prompt rewriting, response decoration, internal policy checks, or test doubles
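
To make the output-cleanup row concrete, here is a minimal sketch of the kind of work a JSON-extraction step does. It is not the implementation behind extractJsonMiddleware: the function name extractJSON and the "first opening bracket to last closing bracket" heuristic are assumptions chosen for illustration, and prose that contains stray brackets would defeat it.

```swift
import Foundation

/// Hypothetical helper: returns the span from the first "{" or "[" to the
/// last "}" or "]" if that span parses as JSON, otherwise nil.
func extractJSON(from text: String) -> String? {
    guard let start = text.firstIndex(where: { $0 == "{" || $0 == "[" }),
          let end = text.lastIndex(where: { $0 == "}" || $0 == "]" }),
          start <= end else { return nil }
    let candidate = String(text[start...end])
    // Accept the candidate only if Foundation can parse it.
    guard let data = candidate.data(using: .utf8),
          (try? JSONSerialization.jsonObject(with: data)) != nil else { return nil }
    return candidate
}

// Build a Markdown fence without writing three backticks in a row.
let fence = String(repeating: "`", count: 3)
let reply = "Here you go:\n\(fence)json\n{\"title\": \"Middleware\"}\n\(fence)\nHope that helps."

// Fall back to the original text when nothing parses, in the spirit of
// onFailure: .leaveUnchanged.
let cleaned = extractJSON(from: reply) ?? reply
```

Validating with JSONSerialization before returning keeps the fallback honest: text that merely contains braces is left alone.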

Middleware composes in the order listed. The first middleware sees the request first, the provider runs last, and the response travels back through the same chain in reverse, so the first middleware sees the response last.
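
The ordering can be sketched with plain closures. Everything below (Request, Response, Handler, Middleware, compose, tracing, Trace) is a hypothetical stand-in written for this example, not a SwiftyAI type, and it is synchronous to keep the trace easy to read.

```swift
// Hypothetical stand-ins for illustration; these are not SwiftyAI types.
struct Request { var prompt: String }
struct Response { var text: String }

typealias Handler = (Request) -> Response
typealias Middleware = (Request, Handler) -> Response

/// Wraps `provider` so that middleware[0] becomes the outermost layer.
func compose(_ middleware: [Middleware], around provider: @escaping Handler) -> Handler {
    // Fold from the last middleware to the first so the first ends up outermost.
    return middleware.reversed().reduce(provider) { next, layer in
        return { request in layer(request, next) }
    }
}

/// Collects the order in which layers run.
final class Trace { var events: [String] = [] }

/// A middleware that records when it sees the request and the response.
func tracing(_ name: String, into trace: Trace) -> Middleware {
    return { request, next in
        trace.events.append("\(name) request")
        let response = next(request)
        trace.events.append("\(name) response")
        return response
    }
}

let trace = Trace()
let handler = compose([tracing("A", into: trace), tracing("B", into: trace)]) { request in
    trace.events.append("provider")
    return Response(text: "echo: \(request.prompt)")
}

let response = handler(Request(prompt: "hi"))
// trace.events: ["A request", "B request", "provider", "B response", "A response"]
```

In practice this means a cleanup step that should see the rawest provider output belongs near the end of the array, closest to the provider.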

defaultSettingsMiddleware applies a default only when the request did not set that value itself. Retry policy needs extra care: the override fires whenever the request's retryPolicy still looks like RetryPolicy.none (one attempt with the default retryable status codes). To keep a request at "no retries" while a middleware sets a more aggressive policy, change the retryable status codes too. For example, RetryPolicy(maxAttempts: 1, retryableStatusCodes: []) is not treated as the default and is not overridden.
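
The retry rule above amounts to an equality check against the default policy. The sketch below models it with a stand-in RetryPolicy struct; the field set, the default status codes, and the resolve function are assumptions made for illustration and may not match the library's real definitions.

```swift
// Hypothetical stand-in; the real RetryPolicy may carry more fields.
struct RetryPolicy: Equatable {
    var maxAttempts: Int
    var retryableStatusCodes: Set<Int>

    // Assumed default codes, chosen only for this illustration.
    static let defaultRetryableStatusCodes: Set<Int> = [429, 500, 502, 503, 504]
    static let none = RetryPolicy(
        maxAttempts: 1,
        retryableStatusCodes: defaultRetryableStatusCodes
    )
}

/// Returns the policy a request ends up with once the middleware default is applied.
func resolve(request: RetryPolicy, middlewareDefault: RetryPolicy) -> RetryPolicy {
    // A request that still equals `.none` is indistinguishable from "not set".
    return request == .none ? middlewareDefault : request
}

let aggressive = RetryPolicy(
    maxAttempts: 5,
    retryableStatusCodes: RetryPolicy.defaultRetryableStatusCodes
)

// Left at the default, so the middleware policy wins.
let overridden = resolve(request: .none, middlewareDefault: aggressive)

// One attempt with an empty code set no longer equals `.none`, so it is kept.
let pinned = resolve(
    request: RetryPolicy(maxAttempts: 1, retryableStatusCodes: []),
    middlewareDefault: aggressive
)
```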

Streaming Middleware

Streaming middleware wraps models that conform to AIStreamModel. It can set defaults for streaming requests or simulate a stream from a non-streaming model.

let streaming = wrapLanguageModel(
    model,
    streamMiddleware: [
        defaultStreamingSettingsMiddleware(temperature: 0.3),
        simulateStreamingMiddleware(
            chunkSize: 24,
            delay: .milliseconds(30)
        )
    ]
)
 
for try await chunk in streamText(model: streaming, prompt: "Draft a reply.") {
    print(chunk.text, terminator: "")
}

Use simulated streaming for demos, previews, tests, or model adapters that only expose full-response generation. Use native provider streams for production chat and long outputs when the provider supports them.
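
For intuition, simulating a stream comes down to slicing a finished response and yielding the slices with a pause between them. The sketch below is not the code behind simulateStreamingMiddleware: the names chunked and simulatedStream are hypothetical, and it assumes the chunk size counts characters. Duration and Task.sleep(for:) need Swift 5.7 or later.

```swift
/// Splits `text` into pieces of at most `size` characters.
func chunked(_ text: String, size: Int) -> [String] {
    precondition(size > 0, "chunk size must be positive")
    var pieces: [String] = []
    var start = text.startIndex
    while start < text.endIndex {
        let end = text.index(start, offsetBy: size, limitedBy: text.endIndex) ?? text.endIndex
        pieces.append(String(text[start..<end]))
        start = end
    }
    return pieces
}

/// Replays a finished response as a stream, pausing before each chunk.
func simulatedStream(
    of text: String,
    chunkSize: Int,
    delay: Duration
) -> AsyncThrowingStream<String, Error> {
    return AsyncThrowingStream { continuation in
        let task = Task {
            do {
                for piece in chunked(text, size: chunkSize) {
                    try await Task.sleep(for: delay)
                    continuation.yield(piece)
                }
                continuation.finish()
            } catch {
                // Cancellation during sleep lands here and ends the stream.
                continuation.finish(throwing: error)
            }
        }
        // Stop producing chunks if the consumer goes away.
        continuation.onTermination = { _ in task.cancel() }
    }
}

let pieces = chunked("Hello, streaming world!", size: 8)
// pieces: ["Hello, s", "treaming", " world!"]
```

A consumer would iterate the result with for try await, the same way the streamText loop above does.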

Image Middleware

Image models use ImageModelMiddleware instead of LanguageModelMiddleware.

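import Foundation

// Hypothetical request identifier; use whatever your gateway expects.
let requestID = UUID().uuidString

let wrappedImageModel = wrapImageModel(
    imageModel,
    middleware: [
        ImageModelMiddleware { request, next in
            // Copy the options, tag the request, then hand off to the next layer.
            var options = request.options
            options.headers["X-Request-ID"] = requestID
            return try await next.generate(prompt: request.prompt, options: options)
        }
    ]
)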

Use this for image-specific request tagging, policy checks, option defaults, and response inspection.

Custom Middleware

Custom middleware is plain Swift. Keep it narrow and predictable: update the request, call next, then optionally transform the response.

let suffixPrompt = LanguageModelMiddleware { request, next in
    var request = request
    request.promptText += "\n\nWrite for an iOS engineering audience."
    return try await next(request)
}
 
let wrapped = wrapLanguageModel(model, middleware: [suffixPrompt])

request.promptText is a convenience over request.prompt: [AIMessageContent]. Setting it (including +=) replaces the entire prompt with a single .text part, which drops any image, PDF, audio, video, or file parts. For multimodal prompts, mutate request.prompt directly:

let suffixPrompt = LanguageModelMiddleware { request, next in
    var request = request
    request.prompt.append(.text("\n\nWrite for an iOS engineering audience."))
    return try await next(request)
}
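
The difference between the two snippets can be reproduced with a small stand-in model. Part and Request below are hypothetical types written for this example, not AIMessageContent or the library's request type, and the getter that joins text parts is an assumption.

```swift
// Hypothetical stand-ins that mirror the behavior described above.
enum Part: Equatable {
    case text(String)
    case image(name: String)
}

struct Request {
    var prompt: [Part]

    /// Joins the text parts on read; replaces the whole prompt on write.
    var promptText: String {
        get {
            return prompt.compactMap { part -> String? in
                if case .text(let value) = part { return value }
                return nil
            }.joined()
        }
        set { prompt = [.text(newValue)] }
    }
}

// += reads the joined text, then writes a single text part back.
var viaConvenience = Request(prompt: [.text("Describe this."), .image(name: "chart.png")])
viaConvenience.promptText += " Be brief."
// The image part is gone; the prompt is now one text part.

// Appending to the parts array leaves the existing parts alone.
var viaParts = Request(prompt: [.text("Describe this."), .image(name: "chart.png")])
viaParts.prompt.append(.text(" Be brief."))
// All three parts survive.
```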

Pattern	Better as middleware when
Default options	The same model policy applies across many call sites
Output cleanup	Providers return useful text with wrappers your app does not want
Request tagging	Internal gateways need headers, request ids, or metadata
Test behavior	A feature should exercise streaming or cleanup without network access