Bring your own model
Any backend that can produce text can drive an AgentKit session: a local
llama.cpp server, a research endpoint, a company-internal gateway. Conform to
AgentProvider and everything else — the tool loop, guards, undo, context,
limits — comes from the session.
Two requirements carry the contract:
- capabilities: a ProviderCapabilities value declaring what your backend can actually do. Every flag changes session behavior, so declare honestly.
- stream(_ request:): takes a CompletionRequest, returns an AsyncThrowingStream<StreamEvent, Error>.
Two more have defaults: validateTools(_:) (returns no warnings) and
inFlightTracker (nil). Most providers implement just the first two.
A minimal provider
The smallest conformance — an echo model with no tool calling:

    struct EchoProvider: AgentProvider {
        let capabilities = ProviderCapabilities(
            executionModel: .appDriven,
            toolDiscovery: .dynamicPerRequest,
            supportsStreaming: true,
            supportsToolCalling: false,
            supportsVision: false,
            supportsStructuredOutput: false,
            supportsSamplingConfig: false,
            supportsParallelToolCalls: false,
            modelSelection: .none,
            managedConversation: false,
            requiresNetworking: false
        )

        func stream(_ request: CompletionRequest) -> AsyncThrowingStream<StreamEvent, Error> {
            var userText = ""
            if let lastUser = request.messages.last(where: { $0.role == .user }) {
                userText = lastUser.content.compactMap { item -> String? in
                    if case .text(let text) = item { return text }
                    return nil
                }.joined(separator: "\n")
            }
            return AsyncThrowingStream { continuation in
                continuation.yield(.textDelta("You said: \(userText)"))
                continuation.yield(.done)
                continuation.finish()
            }
        }
    }

Providers must be Sendable — the session calls them across concurrency
domains. A struct of immutable state, like this one, conforms automatically.
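If your backend needs mutable state (a request counter, a token cache), one way to stay Sendable is to move that state into an actor and hold the actor from an otherwise immutable struct. The sketch below uses plain Swift only; RequestCounter and CountingBackend are hypothetical names, not AgentKit types.

```swift
import Foundation

// Hypothetical helper: an actor serializes access to its mutable state,
// and a reference to an actor is Sendable.
actor RequestCounter {
    private var count = 0

    func next() -> Int {
        count += 1
        return count
    }
}

// Hypothetical provider-like value: every stored property is an immutable
// `let` of a Sendable type, so the struct satisfies Sendable.
struct CountingBackend: Sendable {
    let endpoint: URL
    let counter = RequestCounter()
}
```

The struct itself never mutates; all changes go through the actor, so copies of the value can cross concurrency domains safely.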
Wire it up with a spec; the closure builds your provider when the agent is created:

    let agent = try runtime.makeAgent(
        provider: AgentProviderSpec { EchoProvider() },
        role: AgentRole(staticPersona: "Echo everything.")
    )
    try await agent.send("hello")
    // agent.currentText == "You said: hello"

When your provider needs to know the agent's tools at construction (the on-device Apple provider does — it executes them itself), use the registry-aware form. The closure receives the agent's scoped registry — only the domains in the agent's scope, with executors resolvable per tool:

    let spec = AgentProviderSpec(registryAware: { registry in
        MyProvider(tools: registry.allTools())
    })

Declare capabilities honestly
ProviderCapabilities is not metadata — every flag changes how the session
behaves. The load-bearing ones:
| Flag | What the session does with it |
|---|---|
| executionModel | .appDriven: the session runs the tool loop; your provider emits tool-call requests and receives the results in the next request's messages. .providerDriven: your provider executes tools itself and reports lifecycle events. |
| toolDiscovery | .dynamicPerRequest: the current active tool list rides every request; discovery starts from the built-in agentkit.* meta-tools. .eagerSessionTools: every scoped domain is active from the first request. .sessionRebuild(maxRebuilds:): eager tools, re-primed mid-conversation when new domains activate. |
| supportsToolCalling | false: the session routes tools through text. request.tools arrives empty, the system prompt carries the tool instructions, and tool calls are parsed back out of the model's reply. |
| requiresNetworking | true: a configured pre-transmit filter runs on every outbound request before your provider sees it. See "Control what leaves the device". |
| managedConversation | true: the session sends only messages your backend has not seen yet, for backends that hold conversation state server-side. |
| backendManagedSystemPrompt | true: per-turn directives and context summaries ride a preamble inside the user message instead of growing the system prompt, for backends that own the prompt server-side. |
| supportsToolChoice | false: a send with .required or .none fails typed before any request is built. The default is false: fail closed. See "Steer tool use". |
| supportsStructuredOutput | false: generate() fails typed before your provider is called. See "Get data, not prose". |
| maxTools / contextWindow | Tool activation is capped at maxTools; history is compacted against contextWindow before every request. |
What a request carries
CompletionRequest is everything one provider round trip needs:
| Field | Contents |
|---|---|
| systemPrompt | persona plus per-turn directives and context (unless backendManagedSystemPrompt) |
| messages | conversation history, already compacted against contextWindow |
| tools | the active tool definitions; may include the built-in agentkit.* meta-tools; empty when supportsToolCalling is false |
| sampling | optional temperature / topP / maxTokens; map what your backend supports, ignore the rest (and declare supportsSamplingConfig accordingly) |
| structuredOutput | when set, constrain output to the schema and emit the JSON as text; arrives only if you declare supportsStructuredOutput, otherwise generate() fails typed and this is never populated |
| maxToolCallsPerTurn | provider-driven only: the cap on tool executions inside your turn |
| toolChoice | .auto / .required / .none; non-auto arrives only if you advertise supportsToolChoice |
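As an illustration of "map what your backend supports, ignore the rest", the sketch below turns sampling values into the JSON fields of an OpenAI-compatible chat endpoint. SamplingValues is a hypothetical stand-in for whatever type request.sampling carries, and the field names temperature, top_p, and max_tokens assume an OpenAI-compatible backend; adjust both to your setup.

```swift
// Hypothetical stand-in for the sampling values on a request; the real
// AgentKit type may differ in name and shape.
struct SamplingValues {
    var temperature: Double?
    var topP: Double?
    var maxTokens: Int?
}

// Builds the sampling part of a JSON request body. Values the caller left
// unset are omitted so the backend's own defaults apply, and topP is
// dropped when the backend does not support it.
func samplingFields(_ sampling: SamplingValues?, backendSupportsTopP: Bool) -> [String: Any] {
    guard let sampling = sampling else { return [:] }
    var fields: [String: Any] = [:]
    if let temperature = sampling.temperature {
        fields["temperature"] = temperature
    }
    if backendSupportsTopP, let topP = sampling.topP {
        fields["top_p"] = topP
    }
    if let maxTokens = sampling.maxTokens {
        fields["max_tokens"] = maxTokens
    }
    return fields
}
```

If your backend ignores sampling entirely, returning an empty dictionary and declaring supportsSamplingConfig as false keeps the declaration honest.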
What a provider emits
The event vocabulary splits by execution model. App-driven providers emit:
- .textDelta(String): a chunk of assistant text
- .toolCallComplete(ToolCall): a tool the session should execute; the result arrives in your next request's messages
- .toolCallPartial(id:name:argsDelta:): optional incremental arguments; the session acts on toolCallComplete
- .usage(UsageReport): token accounting, surfaced as lastUsageReport
- .done: the turn's final event
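To make the app-driven ordering concrete, here is a minimal sketch of one turn that emits text, requests a tool, and finishes. DemoEvent and DemoToolCall are hypothetical stand-ins that mirror only the event names above; the real StreamEvent and ToolCall types come from AgentKit and their fields may differ. The tool name and arguments are invented.

```swift
// Hypothetical stand-ins for AgentKit's ToolCall and StreamEvent.
struct DemoToolCall: Equatable {
    let id: String
    let name: String
    let argumentsJSON: String
}

enum DemoEvent: Equatable {
    case textDelta(String)
    case toolCallComplete(DemoToolCall)
    case done
}

// One app-driven turn: some text, one tool request, then the final event.
func weatherTurn() -> AsyncThrowingStream<DemoEvent, Error> {
    AsyncThrowingStream { continuation in
        continuation.yield(.textDelta("Checking the forecast. "))
        continuation.yield(.toolCallComplete(
            DemoToolCall(id: "call-1", name: "weather.lookup", argumentsJSON: #"{"city":"Oslo"}"#)
        ))
        continuation.yield(.done)
        continuation.finish()
    }
}

// Drains a stream into an array, as a test harness might.
func collect(_ stream: AsyncThrowingStream<DemoEvent, Error>) async throws -> [DemoEvent] {
    var events: [DemoEvent] = []
    for try await event in stream {
        events.append(event)
    }
    return events
}
```

In a real provider the tool result would not appear in this stream; it arrives as a message in the next request, and the provider continues from there.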
Provider-driven providers execute tools themselves and report the lifecycle:
.toolCallStarted(ToolCall), then exactly one terminal event per call —
.toolCallCompleted, .toolCallDenied, .toolCallFailed,
.toolCallCancelled, or .toolCallConflict. Finishing the stream by
throwing fails the turn; the session rolls it back.
The provider-driven event contract
The session builds durable conversation history from the lifecycle events a
provider-driven provider emits, so it enforces their shape: every
toolCallStarted reaches exactly one terminal event, call ids are unique
within the turn, and everything lands before the stream ends. A violating
turn fails with the typed AgentSessionError.providerEventContractViolation
and persists no assistant or tool history beyond the already-appended user
message — a broken provider is never laundered into valid-looking history.
If you are building a provider-driven provider, exercise it against an
AgentSession with these traces. Two must succeed and persist: a valid single
call and valid parallel calls. Four must fail typed: a terminal event with no
preceding toolCallStarted, a duplicate toolCallStarted id, a duplicate
terminal event, and a toolCallStarted with no terminal event.
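The rules above can also be checked in isolation before you involve a session. The sketch below is an illustration of the contract as described here, not the session's actual implementation; LifecycleEvent, ContractViolation, and checkContract are hypothetical names, and the events are reduced to a call id.

```swift
// Hypothetical, simplified lifecycle events: only the call id matters here.
enum LifecycleEvent {
    case started(id: String)
    case terminal(id: String)  // completed, denied, failed, cancelled, or conflict
}

enum ContractViolation: Error, Equatable {
    case terminalWithoutStarted(id: String)
    case duplicateStarted(id: String)
    case duplicateTerminal(id: String)
    case startedWithoutTerminal(ids: [String])
}

// Checks one turn's trace: unique ids, exactly one terminal per started
// call, and nothing left open when the stream ends.
func checkContract(_ trace: [LifecycleEvent]) throws {
    var open = Set<String>()    // started, no terminal event yet
    var closed = Set<String>()  // started and terminated
    for event in trace {
        switch event {
        case .started(let id):
            if open.contains(id) || closed.contains(id) {
                throw ContractViolation.duplicateStarted(id: id)
            }
            open.insert(id)
        case .terminal(let id):
            if closed.contains(id) {
                throw ContractViolation.duplicateTerminal(id: id)
            }
            guard open.remove(id) != nil else {
                throw ContractViolation.terminalWithoutStarted(id: id)
            }
            closed.insert(id)
        }
    }
    if !open.isEmpty {
        throw ContractViolation.startedWithoutTerminal(ids: open.sorted())
    }
}
```

A pre-check like this catches shape errors early, but the session remains the authority: a trace that passes here should still be run against an AgentSession.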
Validate tools
validateTools(_:) runs before every send. Return [ToolSchemaWarning] for
non-fatal issues — they surface on agent.lastSchemaWarnings and the turn
proceeds. Throw for schemas your backend cannot represent — the send fails as
a schema validation error before the user message is recorded.
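The warn-versus-throw split might look like the sketch below. Everything in it is an assumption made for illustration: DemoTool, DemoSchemaWarning, and UnsupportedSchema are hypothetical stand-ins for the AgentKit types, and both rules (the 200-character description limit and the supported type list) are invented.

```swift
// Hypothetical stand-ins; the real tool definition and ToolSchemaWarning
// types come from AgentKit and may be shaped differently.
struct DemoTool {
    let name: String
    let description: String
    let parameterTypes: [String: String]  // parameter name -> JSON type name
}

struct DemoSchemaWarning: Equatable {
    let toolName: String
    let message: String
}

struct UnsupportedSchema: Error, Equatable {
    let toolName: String
    let parameter: String
}

// Warns on issues the backend tolerates and throws on schemas it cannot
// represent. Both rules are invented for this example.
func validateDemoTools(_ tools: [DemoTool]) throws -> [DemoSchemaWarning] {
    let supportedTypes: Set<String> = ["string", "number", "integer", "boolean"]
    var warnings: [DemoSchemaWarning] = []
    for tool in tools {
        // Sort so the first offending parameter reported is deterministic.
        for (parameter, typeName) in tool.parameterTypes.sorted(by: { $0.key < $1.key }) {
            guard supportedTypes.contains(typeName) else {
                throw UnsupportedSchema(toolName: tool.name, parameter: parameter)
            }
        }
        if tool.description.count > 200 {
            warnings.append(DemoSchemaWarning(
                toolName: tool.name,
                message: "description exceeds 200 characters and may be truncated"
            ))
        }
    }
    return warnings
}
```

Reserving the throw for schemas the backend truly cannot express keeps ordinary sends working while still surfacing the softer issues.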
Next
- How a turn works — the loop your provider plugs into.
- Capability matrix — how the built-in providers declare themselves.
- When it fails — what your stream's errors become.