Skip to main content
Version: 2.0

Reminders

LLMs have a well-documented recency bias: the most recent tokens in the prompt dominate their attention. In a short session, that's fine — the system instructions are not far away. In a long session, it means an instruction written at turn 1 is effectively invisible by turn 40.

Reminders are short pieces of text the platform automatically appends to user inputs or tool outputs on every turn, so the things you care about stay near the end of the prompt where the model is still paying attention. They are how you counteract drift on long sessions without rewriting the system prompt on every message.

Where reminders live

Reminders are attached to an agent step, not to the agent as a whole, so they only fire while the agent is in that step. They're triggered by hooks — named event types. Two are supported: input_message (the user sent a message) and tool_output (a tool returned a result). A single reminder can subscribe to both.

Two kinds of reminders

There are two reminder types.

Templated reminder

A Velocity-templated string re-injected on every matching hook event. Use this for general context, rules, and anything that depends on agent or session metadata.

TEMPLATED REMINDER ON A STEP

Code example with json syntax.
1

The template can reference ${session.metadata.*}, ${agent.metadata.*}, and ${currentDate}, but cannot depend on per-turn state. If you need truly per-turn data, put it in the tool output or the instruction itself.

TEMPLATED REMINDER ON A RETRIEVAL TOOL'S OUTPUT

Code example with json syntax.
1

When a reminder needs to stay silent under some condition, emit an empty string from the Velocity template rather than trying to suppress the hook:

#if($session.metadata.tier == "enterprise")Reminder: enterprise SLA applies — respond within 2 business hours.#end

Glossary expansion reminder

A specialized reminder that looks up terms from a glossary in user input and appends their expansions as a hint block. Use this for internal acronyms, codenames, and team names the LLM would not recognize.

GLOSSARY EXPANSION REMINDER

Code example with json syntax.
1

Glossary reminders always fire on input_message; there are no configurable hooks. See Glossaries for the matching rules and how the hint block is formatted.

Throttling: fire_every and skip_first

Templated reminders default to firing on every matching event. That's the right behavior for short, ever-relevant guardrails, but it gets expensive — and noisy — for longer reminders that only matter once the session has gone on for a while. Two integer fields on a templated reminder let you control the cadence:

  • fire_every — fire on every Nth matching event. Default 1 (every event). 3 means turns 1, 4, 7, …
  • skip_first — skip the first N matching events at session start before firing for the first time. Default 0 (no warmup).

Both counters are tracked per hook: input_message and tool_output count independently, so a reminder subscribed to both hooks won't have its turn-count drift because the agent happened to call a lot of tools.

The late-conversation reminding pattern

The intended use case is context rot: in a short session the system prompt is still close enough that the model is honoring it, so re-injecting the same rule on every turn just burns tokens. As the conversation grows the system prompt fades from the model's effective attention, and that's when the reminder starts earning its keep. Combine skip_first with a non-trivial fire_every to delay the reminder until it actually matters and then keep it infrequent enough to stay cheap.

LATE-FIRING REMINDER FOR A LONG SUPPORT SESSION

Code example with json syntax.
1

A good starting point for crucial rules in long agentic sessions is skip_first: 4–8, fire_every: 3–5. Tune from there: if the model is still drifting, lower fire_every; if you're paying for reminders nobody needed, raise skip_first.

Counters reset on compaction

When a session is compacted, the post-compaction event stream is treated as a fresh sequence — index 0 — and both counters restart. skip_first warmup applies again from the first new matching event after the compaction boundary.

This is deliberate: compaction is the platform's signal that the earlier history has rotted out of the model's working context, so the reminder schedule for keeping things fresh should restart along with it. If you've tuned skip_first to delay a reminder until "the session has gone on for a while," that condition is true again immediately after a compaction — but the counter recount means you get the same warmup ramp on the new stretch of conversation rather than firing on the very first post-compaction turn.

fire_every and skip_first only apply to templated reminders. Glossary expansion reminders always fire when their input matches.

How the LLM sees a reminder

input_message hooks

Use this hook when the constraint is about how the agent should interpret or respond to user input — tone, guardrails keyed to who is speaking, per-user context expansion. The reminder text is appended to the user's content, so from the model's point of view it looks like the user said it:

how do I deploy?

Reminder: the user is on the enterprise tier. Never quote a discount unless the tier is 'enterprise'.

The Reminder: prefix is part of the author's template — the platform doesn't prepend anything. Write the template however reads best.

tool_output hooks

Use this hook when the constraint is about how to handle results coming back from a tool — citation requirements on retrieval, redaction rules on PII-bearing fetches, "don't invent fields not present in this payload." The reminder rides along with the tool output so the model sees it immediately after the result, exactly where it's deciding what to do with that data.

If multiple reminders fire on the same tool output, their texts are joined and delivered together.


The original user message and tool output stay unchanged in the session event log — the reminder is only layered on for the LLM.

Reminders vs instructions

  • Instructions are the system prompt. They're compiled once at the start of a step, and the LLM reads them as the framing for the entire conversation. Best for persona, rules, output format, and static guidance.
  • Reminders fire on every matching event. They land near the end of the prompt, where the model's attention is strongest. Best for constraints the model tends to forget over time, per-message context expansion, and recency-sensitive guardrails.

Use instructions for things that need to be true from turn 1. Use reminders for things that need to remain true at turn 50.

When to reach for a reminder

Typical uses:

  • Guardrails that matter per turn. "Do not share any document with confidentiality=internal without a managerial approval note."
  • Dynamic user context. "The user is in the EU. Respond in the language indicated by session.metadata.language."
  • Output format that drifts. "Every response must end with a one-line next-action summary."
  • Glossary expansion — see Glossaries.

Reminders cost tokens on every turn they fire, so keep each one short and specific. A 30-token reminder is doing useful work; a 300-token one is an instruction in disguise and probably belongs in the system prompt.

Trade-offs

  • Compiled at session start. Templated reminders bind to session.metadata once, so mid-session metadata changes won't be reflected. If you need truly live values, put them in the tool output instead.
  • Every matching turn pays tokens. A reminder that fires on every user message adds its tokens to every user message. That's the whole point — but it means long reminders are expensive, and a reminder that's really an instruction belongs in the system prompt. Use fire_every and skip_first to back off the cadence when the reminder only needs to surface once context has had a chance to rot.
  • Scoped to a single step. Reminders don't cross step transitions. Either repeat the reminder on each step where it applies, or lean into the scoping and let different steps carry different reminders.
  • Fire-or-don't at the hook level. A reminder cannot conditionally skip a hook; use the Velocity template to emit an empty string when it shouldn't apply.