Version: 2.0

Steps

Steps let an agent move through distinct phases of a session while preserving the full conversation history. When the agent transitions from one step to the next, the history carries over but the system prompt effectively changes.

Most agents do not need steps. A single step with well-written instructions handles the overwhelming majority of use cases. Reach for steps when the agent genuinely moves through phases that benefit from different prompts — for example, triage → research → draft → review.

If you want a fresh context rather than a prompt change, use sub-agents instead. Steps preserve history; sub-agents start fresh.

Define steps on the agent

Steps live in the steps map on the agent, keyed by step name. The agent entry point is named via first_step_name.

AGENT WITH MULTIPLE STEPS

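A minimal sketch of what a multi-step agent definition might look like, assembled from the fields described on this page. The step names (triage, research, draft), the search tool, and the instruction text are illustrative assumptions, not part of any shipped configuration.

```json
{
  "first_step_name": "triage",
  "steps": {
    "triage": {
      "instructions": [{ "type": "inline", "template": "Work out what the user needs and restate it in one sentence." }],
      "allowed_tools": [],
      "next_steps": [{ "step_name": "research" }]
    },
    "research": {
      "instructions": [{ "type": "inline", "template": "Gather the facts needed to answer. Do not draft yet." }],
      "allowed_tools": ["search"],
      "next_steps": [{ "step_name": "draft" }]
    },
    "draft": {
      "instructions": [{ "type": "inline", "template": "Write the answer using what the earlier steps found." }],
      "allowed_tools": []
    }
  }
}
```

In this sketch triage and draft expose no tools (an empty allowed_tools list), research exposes only search, and the unconditional next_steps entries chain the three steps in order.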

Each step carries its own instructions, reminders, allowed_tools, and allowed_skills — the same building blocks as a single-step agent, scoped to just that step. Leaving allowed_tools unset exposes every tool; an empty list exposes none.

Two fields are specific to multi-step agents: next_steps controls transitions (covered below), and reentry_step picks which step the session resumes at when a new user message arrives on a later turn. Without reentry_step, re-entry happens at whichever step the prior turn ended on — usually fine, but set it explicitly when you want every new user turn to start from (say) a triage step.

An output_parser of structured tells the step to emit JSON instead of free-form text. This pairs especially well with next_steps: a classifier step emits { "intent": "sales" } and a downstream step routes on get('$.output.intent') without string parsing. See Structured outputs for the schema format.

Transitions

The next_steps array on a step is evaluated after the step's LLM turn completes. Each entry has a step_name and an optional condition. The first matching condition wins, and an entry with no condition always matches, so it acts as the catch-all and belongs last. If nothing matches, the agent ends the turn in the current step.
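
As an illustration of first-match-wins routing, a classifier step's transitions could be sketched as follows. The step names and intent values are hypothetical, and the step is assumed to emit a structured output with an intent field.

```json
{
  "classify": {
    "instructions": [{ "type": "inline", "template": "Classify the request as sales, support, or other." }],
    "allowed_tools": [],
    "next_steps": [
      { "condition": "get('$.output.intent') == 'sales'", "step_name": "sales" },
      { "condition": "get('$.output.intent') == 'support'", "step_name": "support" },
      { "step_name": "general" }
    ]
  }
}
```

This is one entry of the steps map. The unconditional general entry sits last; placed first, it would win before either condition was considered.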

Conditions are UserFn expressions using the get() function with JSONPath. The context exposes four top-level keys:

{
  "agent": { ... },
  "session": { ... },
  "tools": { "<tool_config_name>": { "outputs": { "latest": { } } } },
  "output": { ... }
}

Agent and session metadata you set elsewhere is reachable under $.agent.metadata.* and $.session.metadata.*. The most recent call of any tool the step used lives at $.tools.<name>.outputs.latest. $.output is the LLM output of the current step — a text field under the default parser, or your JSON fields directly under $.output with a structured parser.

get('$.output.intent') == 'sales'
get('$.session.metadata.tier') == 'enterprise' and get('$.output.urgent') == true
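
To make the paths concrete, here is a hypothetical context as it might look after a step that uses a structured parser and has called a tool configured as approval. Every value is invented for illustration; only the four top-level keys and the metadata and outputs.latest paths come from the description above.

```json
{
  "agent": { "metadata": { "team": "revenue" } },
  "session": { "metadata": { "tier": "enterprise" } },
  "tools": {
    "approval": { "outputs": { "latest": { "decision": "approved", "reason": "Within budget" } } }
  },
  "output": { "intent": "sales", "urgent": true }
}
```

Against this context, both example conditions above evaluate to true.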

Two get() behaviors to design around:

  • get() returns scalars only — a string, number, boolean, or null. Pointing it at an object or array (get('$.tools.approval.outputs.latest')) errors and the transition fails for that turn. Read a specific field, not the wrapping object. If a tool legitimately returns nested data, expose the routing value at the top level of the tool's output.
  • A missing path returns null, not an error. If the agent never called the tool a condition reads from, get('$.tools.approval.outputs.latest.decision') is null. A comparison like get(...) == 'approved' cleanly evaluates to false, so a catch-all next_steps entry handles the "never called" case. There is no separate signal for "called this turn" versus "called earlier in the session"; if you need to distinguish, have the tool always return an explicit value (such as "pending") and test the string rather than null.
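
One way to follow both points is to shape the tool's return value so the routing field is a top-level scalar that is always present. The field names below (decision, reason, details) are assumptions about a hypothetical approval tool, not a required schema.

```json
{
  "decision": "pending",
  "reason": "Waiting on manager sign-off",
  "details": {
    "approver": "j.doe",
    "ticket": { "id": "REQ-1042", "priority": 2 }
  }
}
```

A condition can then read get('$.tools.approval.outputs.latest.decision') and compare it with 'approved', 'rejected', or 'pending', while the nested details object stays out of the routing path.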

The key under $.tools is the tool configuration name from the agent's tool_configurations map — not the underlying tool ID.

The next_steps array is evaluated once per step, after the step's final text turn, not between tool calls within the step.

Session history across steps

When the agent transitions between steps, the session history is preserved in full. The new step sees all prior user messages, agent outputs, tool calls, and tool outputs — it just applies a different system prompt on top of them. This is the main thing that distinguishes steps from sub-agents, which run with a fresh history.

When to reach for steps

Typical uses:

  • Classification and routing. A classifier step reads the user input, decides which specialist step should handle the rest, and transitions.
  • Phase-structured workflows. An investigation that genuinely has distinct gather → analyze → report phases, where each phase benefits from different guidance and tool access.
  • Gated escalation. A support step that transitions to an escalate_to_human step once a condition is met.
  • Plan-then-execute. See the dedicated section below.
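
The gated escalation case, for instance, might be sketched like this. It assumes the support step emits a structured output with a boolean escalate field; that field, the search tool, and the instruction text are hypothetical.

```json
{
  "first_step_name": "support",
  "steps": {
    "support": {
      "instructions": [{ "type": "inline", "template": "Help the user. Set escalate to true if you cannot resolve the issue." }],
      "allowed_tools": ["search"],
      "next_steps": [
        { "condition": "get('$.output.escalate') == true", "step_name": "escalate_to_human" }
      ]
    },
    "escalate_to_human": {
      "instructions": [{ "type": "inline", "template": "Summarize the case for a human agent and tell the user what happens next." }],
      "allowed_tools": []
    }
  }
}
```

With no catch-all entry, the agent stays in support whenever the condition is false.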

Many workflows do not require steps at all. A single-step agent with the right combination of skills, tools, and instructions can handle several task types without any explicit routing logic — the LLM picks the right tool and loads the right skill based on the user's request. Reach for steps when phases need different tool access and the transition between them must be driven by a condition you define, not a decision the model makes on the fly.

Plan-then-execute

A plan step with read-only tools produces a structured plan and writes it to an artifact via artifact_create. An execute step with write access calls artifact_read to load the plan and carries it out. Front-loading the thinking keeps the implementation turns focused and leaves a durable plan the agent can re-read on later turns. This is the same pattern Claude Code's plan mode uses.

{
  "first_step_name": "plan",
  "steps": {
    "plan": {
      "instructions": [{ "type": "inline", "template": "Produce a plan. Do not act. Call artifact_create with the plan." }],
      "allowed_tools": ["artifact_create", "search"],
      "next_steps": [{ "step_name": "execute" }]
    },
    "execute": {
      "instructions": [{ "type": "inline", "template": "Call artifact_read to load the plan, then carry it out." }],
      "allowed_tools": ["artifact_read", "write_file"]
    }
  }
}

When not to use steps:

  • If the phases differ only in tone or style, a single step with good instructions is simpler.
  • If you want isolated context for a side task, use a sub-agent.
  • If you just need to remind the agent of something as the session grows, use a reminder.

Approval gates

An approval gate is a step that must complete before the agent can reach the tools on a later step. Because each step has its own allowed_tools list, tools on a later step are unreachable until the agent transitions through the gate, and that transition happens only when the gate's next_steps condition is satisfied. This is more reliable than an instruction such as "ask before doing X": the model cannot skip a step boundary.

What the gate step can do

Choose an approach that produces the right approval signal for your situation. Prefer routing on a tool's return value over routing on the model's own output: the model can emit {"confirmed": true} without any real confirmation, but it cannot fabricate the return value of a tool it actually called.

Hand off to an approval tool. Attach a tool that talks to the approval system — a Wolken lookup, a ServiceNow workflow, a custom webhook, or a lambda tool wrapping an internal endpoint. Advance the step when that tool returns an approved decision. This is the recommended pattern.

Ask the user in the chat. The gate step's instructions tell the agent to summarize what is about to happen and require the user to reply CONFIRM, then emit a structured {"confirmed": true} so the next_steps condition can test a typed field. Even with this gate, the model can mistakenly emit confirmed: true without a real CONFIRM, so reserve this pattern for low-stakes flows. The escalated tools are still unreachable from the gate step, so the worst case is a spurious transition, not a spurious tool call.

Build approval into the tool. For actions that must be gated regardless of agent behavior, push the approval check into the tool itself. The tool pauses, notifies the appropriate person, and only executes once approved. This is custom scoping work for specific high-risk tools.
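
For the chat-confirmation approach, the gate step could be sketched as below. It assumes the step is configured with a structured output parser that exposes a boolean confirmed field; the step names and the provision_account tool are hypothetical.

```json
{
  "confirm": {
    "instructions": [{ "type": "inline", "template": "Summarize what is about to happen and ask the user to reply CONFIRM. Set confirmed to true only if the user's latest message is exactly CONFIRM." }],
    "allowed_tools": [],
    "next_steps": [
      { "condition": "get('$.output.confirmed') == true", "step_name": "provision" }
    ]
  },
  "provision": {
    "instructions": [{ "type": "inline", "template": "Carry out the confirmed action." }],
    "allowed_tools": ["provision_account"]
  }
}
```

These are two entries of the steps map. The gate exposes no tools, so provision_account is reachable only after the transition.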

Gate on tool output

Register the approval tool under a stable key in the agent's tool_configurations map — that key is what the next_steps condition reads from $.tools.<name>.outputs.latest. The gate step exposes only the approval tool; downstream provisioning tools live on the next step and are unreachable until the condition fires.

Design the tool to return a top-level scalar field for the routing decision (an enum string like "approved" | "rejected" | "needs_review" beats a boolean — it leaves room for a third state without breaking the schema). Then branch on it:

{
  "first_step_name": "request_approval",
  "steps": {
    "request_approval": {
      "instructions": [{ "type": "inline", "template": "Summarize the request and call the approval tool. Do not respond to the user until you have a decision." }],
      "allowed_tools": ["approval"],
      "next_steps": [
        { "condition": "get('$.tools.approval.outputs.latest.decision') == 'approved'", "step_name": "provision" },
        { "condition": "get('$.tools.approval.outputs.latest.decision') == 'rejected'", "step_name": "notify_rejection" },
        { "step_name": "ask_for_info" }
      ]
    },
    "provision": {
      "instructions": [{ "type": "inline", "template": "Execute the approved request." }],
      "allowed_tools": ["create_ticket", "send_notification"]
    },
    "notify_rejection": {
      "instructions": [{ "type": "inline", "template": "Tell the user the request was declined and cite get('$.tools.approval.outputs.latest.reason')." }]
    },
    "ask_for_info": {
      "instructions": [{ "type": "inline", "template": "Ask the user for whatever information the approval tool needs." }]
    }
  }
}

The catch-all ask_for_info entry covers both "needs_review" and the case where the agent never called the tool — both look like "decision is not 'approved' and not 'rejected'" to the condition.

latest is the most recent call across the whole session, not just the current step. If a later step calls the same tool again, the gate's condition will read the newer result. To route on each call independently, give the later step's tool a distinct tool_config_name (for example approval_followup).
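
A sketch of that separation, assuming the same underlying approval tool is registered in tool_configurations under two keys, approval and approval_followup. The collect_more_info step and the instruction text are hypothetical.

```json
{
  "request_approval": {
    "instructions": [{ "type": "inline", "template": "Summarize the request and call the approval tool." }],
    "allowed_tools": ["approval"],
    "next_steps": [
      { "condition": "get('$.tools.approval.outputs.latest.decision') == 'approved'", "step_name": "provision" },
      { "step_name": "collect_more_info" }
    ]
  },
  "collect_more_info": {
    "instructions": [{ "type": "inline", "template": "Gather the missing details, then call approval_followup." }],
    "allowed_tools": ["approval_followup"],
    "next_steps": [
      { "condition": "get('$.tools.approval_followup.outputs.latest.decision') == 'approved'", "step_name": "provision" }
    ]
  }
}
```

Each condition reads only its own key under $.tools, so a later approval_followup call cannot overwrite the result the first gate routes on.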

Async external approval (fire and forward)

For approval flows where a human acts in an external system (for example, an approvals Slack channel), the intake session ends after the gate step posts the request. There is no in-session next_steps transition waiting for approval. Instead, a separate handler agent receives the approval event through a connector and acts on it. In this pattern the gate step has no next_steps at all — its purpose is to post the notification and tell the user their request has been submitted.
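
In configuration terms, such an intake agent could be as small as a single step with no next_steps. The post_approval_request tool name and the instruction text are assumptions for illustration.

```json
{
  "first_step_name": "submit_request",
  "steps": {
    "submit_request": {
      "instructions": [{ "type": "inline", "template": "Summarize the request, post it with post_approval_request, then tell the user it has been submitted for approval." }],
      "allowed_tools": ["post_approval_request"]
    }
  }
}
```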

See Integrations for how to set up a Slack connector that feeds replies into a handler agent.

Notifications

Notifications are tool calls like any other: attach a tool that posts to the target surface (a webhook, a ticket comment), and lock the destination in argument_override so the LLM cannot change it. Instruct the agent when to call it, typically on request submission, on approval, and on completion.

For time-based escalation (following up if no approval arrives within a set window), use the agent's Schedules tab to trigger a follow-up invocation after a configured interval.

Limits

A single session turn can transition between steps at most 500 times before the platform stops execution with a step_transition_limit_exceeded event. In practice this only triggers when two steps ping-pong between each other because of mis-written conditions; a real workflow rarely needs more than a dozen transitions. Treat the ceiling as a loop-breaker, not a budget to spend.
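
One way such a loop can arise is two steps that each name the other in an unconditional next_steps entry. A sketch of a safer shape, assuming a hypothetical review step that emits a structured verdict field:

```json
{
  "draft": {
    "instructions": [{ "type": "inline", "template": "Write or revise the draft." }],
    "allowed_tools": [],
    "next_steps": [{ "step_name": "review" }]
  },
  "review": {
    "instructions": [{ "type": "inline", "template": "Review the draft and set verdict to accept or revise." }],
    "allowed_tools": [],
    "next_steps": [
      { "condition": "get('$.output.verdict') == 'revise'", "step_name": "draft" }
    ]
  }
}
```

Here review returns to draft only when it asks for a revision; any other verdict matches nothing, so the turn ends in review.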

Transitions run serially; an agent is never in two steps at once. Use sub-agents when you need parallel work.