Bring your own model

Any backend that can produce text can drive an AgentKit session: a local llama.cpp server, a research endpoint, a company-internal gateway. Conform to AgentProvider and everything else — the tool loop, guards, undo, context, limits — comes from the session.

Two requirements carry the contract:

capabilities — a ProviderCapabilities value declaring what your backend can actually do. Every flag changes session behavior, so declare honestly.
stream(_ request:) — takes a CompletionRequest, returns an AsyncThrowingStream<StreamEvent, Error>.

Two more have defaults: validateTools(_:) (returns no warnings) and inFlightTracker (nil). Most providers implement only the two requirements above, capabilities and stream.

A minimal provider

The smallest conformance — an echo model with no tool calling:

struct EchoProvider: AgentProvider {
    let capabilities = ProviderCapabilities(
        executionModel: .appDriven,
        toolDiscovery: .dynamicPerRequest,
        supportsStreaming: true,
        supportsToolCalling: false,
        supportsVision: false,
        supportsStructuredOutput: false,
        supportsSamplingConfig: false,
        supportsParallelToolCalls: false,
        modelSelection: .none,
        managedConversation: false,
        requiresNetworking: false
    )

    func stream(_ request: CompletionRequest) -> AsyncThrowingStream<StreamEvent, Error> {
        // Join the text items of the most recent user message; non-text items are skipped.
        var userText = ""
        if let lastUser = request.messages.last(where: { $0.role == .user }) {
            userText = lastUser.content.compactMap { item -> String? in
                if case .text(let text) = item { return text }
                return nil
            }.joined(separator: "\n")
        }

        return AsyncThrowingStream { continuation in
            // Emit the whole reply as a single delta, then signal the end of the turn.
            continuation.yield(.textDelta("You said: \(userText)"))
            continuation.yield(.done)
            continuation.finish()
        }
    }
}

Providers must be Sendable — the session calls them across concurrency domains. A struct of immutable state, like this one, conforms automatically.
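
A provider that needs mutable state (a request counter, a cached token) can stay Sendable by moving that state into an actor and holding the actor in a `let`. The sketch below is plain Swift and shows only that mechanism. `RequestCounter` and `CountingBackend` are hypothetical names, not part of AgentKit.

```swift
// Hypothetical names throughout; only the Sendable mechanics are the point.

/// Mutable state lives in an actor, which is implicitly Sendable.
actor RequestCounter {
    private var count = 0

    /// Increments the counter and returns the new value.
    func next() -> Int {
        count += 1
        return count
    }
}

/// Every stored property is an immutable Sendable value, so the struct can
/// declare Sendable without any unchecked escape hatch.
struct CountingBackend: Sendable {
    let baseURL: String
    let counter = RequestCounter()

    /// Labels a prompt with a monotonically increasing request number.
    func label(_ prompt: String) async -> String {
        let n = await counter.next()
        return "#\(n) \(baseURL): \(prompt)"
    }
}
```

A provider struct could hold a value like `CountingBackend` and call it from inside its stream; copies of the struct share the same actor, so the count stays consistent.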

Wire it up with a spec; the closure builds your provider when the agent is created:

let agent = try runtime.makeAgent(
    provider: AgentProviderSpec { EchoProvider() },
    role: AgentRole(staticPersona: "Echo everything.")
)
try await agent.send("hello")
// agent.currentText == "You said: hello"

When your provider needs to know the agent's tools at construction (the on-device Apple provider does — it executes them itself), use the registry-aware form. The closure receives the agent's scoped registry — only the domains in the agent's scope, with executors resolvable per tool:

let spec = AgentProviderSpec(registryAware: { registry in
    MyProvider(tools: registry.allTools())
})

Declare capabilities honestly

ProviderCapabilities is not metadata — every flag changes how the session behaves. The load-bearing ones:

Flag	What the session does with it
`executionModel`	`.appDriven` — the session runs the tool loop: your provider emits tool-call requests and receives the results in the next request's messages. `.providerDriven` — your provider executes tools itself and reports lifecycle events.
`toolDiscovery`	`.dynamicPerRequest` — the current active tool list rides every request; discovery starts from the built-in `agentkit.*` meta-tools. `.eagerSessionTools` — every scoped domain is active from the first request. `.sessionRebuild(maxRebuilds:)` — eager tools, re-primed mid-conversation when new domains activate.
`supportsToolCalling`	`false` — the session routes tools through text: `request.tools` arrives empty, the system prompt carries the tool instructions, and tool calls are parsed back out of the model's reply.
`requiresNetworking`	`true` — a configured pre-transmit filter runs on every outbound request before your provider sees it. See control what leaves the device.
`managedConversation`	`true` — the session sends only messages your backend has not seen yet, for backends that hold conversation state server-side.
`backendManagedSystemPrompt`	`true` — per-turn directives and context summaries ride a preamble inside the user message instead of growing the system prompt, for backends that own the prompt server-side.
`supportsToolChoice`	`false` — a send with `.required` or `.none` fails typed before any request is built. The default is `false`: fail closed. See steer tool use.
`supportsStructuredOutput`	`false` — `generate()` fails typed before your provider is called. See get data, not prose.
`maxTools` / `contextWindow`	tool activation is capped at `maxTools`; history is compacted against `contextWindow` before every request.

What a request carries

CompletionRequest is everything one provider round trip needs:

Field	Contents
`systemPrompt`	persona plus per-turn directives and context (unless `backendManagedSystemPrompt`)
`messages`	conversation history, already compacted against `contextWindow`
`tools`	the active tool definitions — may include the built-in `agentkit.*` meta-tools; empty when `supportsToolCalling` is false
`sampling`	optional `temperature` / `topP` / `maxTokens` — map what your backend supports, ignore the rest (and declare `supportsSamplingConfig` accordingly)
`structuredOutput`	when set, constrain output to the schema and emit the JSON as text — arrives only if you declare `supportsStructuredOutput`; otherwise `generate()` fails typed and this is never populated
`maxToolCallsPerTurn`	provider-driven only — the cap on tool executions inside your turn
`toolChoice`	`.auto` / `.required` / `.none`; non-auto arrives only if you advertise `supportsToolChoice`
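
The `sampling` row asks a provider to map what its backend supports and ignore the rest. A minimal sketch of that mapping follows, under loud assumptions: `SamplingConfig` is a hypothetical stand-in for the request's sampling value (only the three field names come from the table above), and the imagined backend accepts `temperature` and `max_tokens` but has no top-p setting.

```swift
/// Hypothetical stand-in for the request's sampling value. The field names
/// come from the table above; the type name and property types are assumptions.
struct SamplingConfig {
    var temperature: Double?
    var topP: Double?
    var maxTokens: Int?
}

/// Builds the sampling part of a JSON body for an imagined backend that
/// understands `temperature` and `max_tokens` only.
func backendSamplingFields(_ sampling: SamplingConfig?) -> [String: Any] {
    var fields: [String: Any] = [:]
    guard let sampling = sampling else { return fields }   // nothing requested
    if let temperature = sampling.temperature {
        fields["temperature"] = temperature
    }
    if let maxTokens = sampling.maxTokens {
        fields["max_tokens"] = maxTokens
    }
    // `topP` is deliberately dropped: this backend cannot honor it.
    return fields
}
```

Unset fields are omitted rather than sent as null, so the backend's own defaults apply. A backend that can honor none of the three fields would skip the mapping entirely and declare `supportsSamplingConfig: false`.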

What a provider emits

The event vocabulary splits by execution model. App-driven providers emit:

.textDelta(String) — a chunk of assistant text
.toolCallComplete(ToolCall) — a tool the session should execute; the result arrives in your next request's messages
.toolCallPartial(id:name:argsDelta:) — optional incremental arguments; the session acts on toolCallComplete
.usage(UsageReport) — token accounting, surfaced as lastUsageReport
.done — the turn's final event

Provider-driven providers execute tools themselves and report the lifecycle: .toolCallStarted(ToolCall), then exactly one terminal event per call — .toolCallCompleted, .toolCallDenied, .toolCallFailed, .toolCallCancelled, or .toolCallConflict. Finishing the stream by throwing fails the turn; the session rolls it back.
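
For an app-driven provider backed by a streaming endpoint, the event order above (deltas first, `.done` last, errors by throwing) can be produced by bridging the backend's chunks into an AsyncThrowingStream. The sketch below is self-contained and makes assumptions: `StreamEvent` is a two-case stand-in for the library's type, and `fetchChunks(for:)` is a hypothetical backend call that a real provider would replace with its network client.

```swift
/// Two-case stand-in for the library's event type, so the sketch compiles alone.
enum StreamEvent: Equatable {
    case textDelta(String)
    case done
}

/// Hypothetical backend call: yields reply chunks one at a time.
func fetchChunks(for prompt: String) -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        for chunk in ["You ", "said: ", prompt] { continuation.yield(chunk) }
        continuation.finish()
    }
}

/// Bridges backend chunks into events: text deltas first, `.done` last.
func events(for prompt: String) -> AsyncThrowingStream<StreamEvent, Error> {
    AsyncThrowingStream { continuation in
        let task = Task {
            do {
                for try await chunk in fetchChunks(for: prompt) {
                    continuation.yield(.textDelta(chunk))
                }
                continuation.yield(.done)
                continuation.finish()
            } catch {
                // Finishing by throwing ends the stream with an error.
                continuation.finish(throwing: error)
            }
        }
        // Stop backend work if the consumer cancels or stops iterating.
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

Setting `onTermination` matters for network-backed providers: without it, a cancelled turn would leave the backend request running to completion.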

The provider-driven event contract

The session builds durable conversation history from the lifecycle events a provider-driven provider emits, so it enforces their shape: every toolCallStarted reaches exactly one terminal event, call ids are unique within the turn, and everything lands before the stream ends. A violating turn fails with the typed AgentSessionError.providerEventContractViolation and persists no assistant or tool history beyond the already-appended user message — a broken provider is never laundered into valid-looking history.

If you are building a provider-driven provider, exercise it against an AgentSession with these traces: a valid single call and valid parallel calls (both must succeed and persist); a terminal event with no started, a duplicate started id, a duplicate terminal, and a started with no terminal (each must fail typed).
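
The six traces above can also be checked in isolation before an AgentSession is involved. The following is a hypothetical trace checker written for that purpose, not the session's own enforcement code. `LifecycleEvent` and `ContractViolation` are stand-ins: real events carry ToolCall values rather than bare ids, and the five terminal events are collapsed here into a single `terminal` case.

```swift
/// Stand-in for the lifecycle events; real events carry richer payloads.
enum LifecycleEvent {
    case started(id: String)
    case terminal(id: String)   // completed, denied, failed, cancelled, or conflict
}

/// Hypothetical violation type, one case per failing trace listed above.
enum ContractViolation: Error, Equatable {
    case terminalWithoutStarted(String)
    case duplicateStarted(String)
    case duplicateTerminal(String)
    case startedWithoutTerminal(String)
}

/// Checks one turn's trace and throws on the first violation found.
func checkTrace(_ events: [LifecycleEvent]) throws {
    var open = Set<String>()     // started, no terminal yet
    var closed = Set<String>()   // started and already terminated
    for event in events {
        switch event {
        case .started(let id):
            // Call ids must be unique within the turn.
            if open.contains(id) || closed.contains(id) {
                throw ContractViolation.duplicateStarted(id)
            }
            open.insert(id)
        case .terminal(let id):
            if closed.contains(id) {
                throw ContractViolation.duplicateTerminal(id)
            }
            guard open.remove(id) != nil else {
                throw ContractViolation.terminalWithoutStarted(id)
            }
            closed.insert(id)
        }
    }
    // Every started call must reach its terminal before the stream ends.
    if let id = open.sorted().first {
        throw ContractViolation.startedWithoutTerminal(id)
    }
}
```

Running a provider's recorded events through a checker like this gives a faster failure signal than a full session round trip, though the session-level test remains the authoritative one.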

Validate tools

validateTools(_:) runs before every send. Return [ToolSchemaWarning] for non-fatal issues — they surface on agent.lastSchemaWarnings and the turn proceeds. Throw for schemas your backend cannot represent — the send fails as a schema validation error before the user message is recorded.
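
The split between warning and throwing might look like the sketch below. Everything in it is assumed for illustration: this section does not show the real tool definition type or the shape of ToolSchemaWarning, so `SketchTool`, `SketchWarning`, and `UnsupportedSchema` are stand-ins, and the 200-character limit and the supported type list are invented.

```swift
/// Stand-ins: the real tool and warning types are not shown in this section.
struct SketchTool {
    let name: String
    let description: String
    let parameterTypes: [String: String]   // parameter name -> JSON type name
}

struct SketchWarning: Equatable {
    let toolName: String
    let message: String
}

struct UnsupportedSchema: Error, Equatable {
    let toolName: String
    let parameter: String
}

/// Warns on cosmetic issues; throws when the backend cannot represent a schema.
func validate(_ tools: [SketchTool]) throws -> [SketchWarning] {
    let supported: Set<String> = ["string", "number", "integer", "boolean"]
    var warnings: [SketchWarning] = []
    for tool in tools {
        // Sorted so the first reported parameter is deterministic.
        for (parameter, typeName) in tool.parameterTypes.sorted(by: { $0.key < $1.key })
        where !supported.contains(typeName) {
            throw UnsupportedSchema(toolName: tool.name, parameter: parameter)
        }
        if tool.description.count > 200 {
            warnings.append(SketchWarning(toolName: tool.name,
                                          message: "description will be truncated"))
        }
    }
    return warnings
}
```

The design choice to note is that a truncated description still lets the turn proceed, while an unrepresentable parameter stops the send before anything is recorded.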

How a turn works — the loop your provider plugs into.
Capability matrix — how the built-in providers declare themselves.
When it fails — what your stream's errors become.