Tools
Tools provide agents with capabilities to interact with data and external systems. An agent uses the conversational context and its instructions to decide which tools to call, and how to use the tools' responses to answer the user's query.
Vectara offers a number of useful tools out-of-the-box, but you can also build your own. For a complete list of available tools, refer to the Tools API docs.
Tools represent external or internal capabilities that agents can invoke dynamically. They are defined by:
- A unique ID (tol_abcd) and name.
- A description of their function.
- An input schema describing accepted parameters (in JSON Schema format).
- Metadata for categorization.
- Runtime availability (enabled or disabled).
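Putting those pieces together, a tool definition might look like the following sketch. The field names and values here are illustrative, not a verbatim API payload; refer to the Tools API docs for the exact shape.

```json
{
  "id": "tol_abcd",
  "name": "weather_lookup",
  "description": "Returns current weather conditions for a city.",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name to look up." }
    },
    "required": ["city"]
  },
  "metadata": { "category": "utilities" },
  "enabled": true
}
```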
Searching corpora with tools
You configure corpus search behavior for Vectara agents using the query_configuration parameter within the corpora_search tool. This parameter uses the same search and generation object formatting as shown in Advanced Single Corpus Query. Before using this tool, ensure that you have at least one indexed corpus with data. The LLM cannot modify these predefined search parameters during conversation.
For more details about the different corpus objects, see Configure Query Parameters.
Agent configuration examples
This example demonstrates a basic configuration.
BASIC QUERY CONFIGURATION EXAMPLE
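A minimal sketch of a corpora_search configuration, assuming an illustrative corpus key ("my-docs") and the search/generation parameter names from Vectara's query object format:

```json
{
  "type": "corpora_search",
  "query_configuration": {
    "search": {
      "corpora": [{ "corpus_key": "my-docs" }],
      "limit": 10
    },
    "generation": {
      "max_used_search_results": 5
    }
  }
}
```

Any parameter accepted by the query API's search and generation objects can be pinned here; the agent cannot override them at runtime.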
Transforming tool output before the LLM sees it
Tool responses are often much larger than the agent actually needs. Search APIs in particular tend to return raw scoring fields, provider metadata, and deeply nested envelopes — useful for a human inspecting the call, but mostly noise for the model, and expensive on every turn that result hangs around in context.
Every tool configuration accepts an optional output_transform: a jq expression applied to the tool's JSON response before it is handed to the LLM. The result of the expression replaces the original output. If the expression fails to compile or evaluate at runtime, the failure is surfaced to the agent as a tool error so it can react rather than silently ingesting bad data.
Trimming web_search
The built-in web_search tool returns a WebSearchResponse that
looks like:
```json
{
  "query": "site:vectara.com agents",
  "results": [
    { "title": "...", "url": "...", "snippet": "...", "score": 0.873 }
  ],
  "metadata": { "...provider-specific..." },
  "results_count": 10
}
```
For most agents, the score, the top-level metadata, and the
echoed query are noise. A jq projection drops them:
WEB_SEARCH TRIMMED TO THE FIELDS THE AGENT ACTUALLY USES
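A sketch of the trimmed configuration. The jq expression projects each result down to the three fields the agent uses; the tool "type" key is illustrative:

```json
{
  "type": "web_search",
  "output_transform": ".results | map({title, url, snippet})"
}
```

With this transform in place, the agent sees only an array of `{title, url, snippet}` objects instead of the full WebSearchResponse envelope.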
Pairing web_search with web_get
output_transform shines when paired with web_get, the built-in
fetcher. Configure search to return only enough metadata for the
agent to choose a result, and let the agent call web_get on the
URLs it actually wants to read. Two benefits:
- The search response that lingers in context stays small, so it doesn't crowd out the conversation as the session grows.
- The agent only pays for a full page payload on the results it actually needs, not the entire search hit list.
A WebGetResponse itself is mostly the page content, but it carries
some plumbing fields (response_headers, total_lines, truncated,
the redirected url) that the agent rarely needs to reason over. A
matching transform keeps the body and the status code:
SEARCH → FETCH WITH BOTH TOOLS TRIMMED
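A sketch of the paired configuration. The web_get field names (status_code, content) are assumed here for illustration; jq's shorthand object construction `{status_code, content}` picks exactly those two keys from the response:

```json
{
  "tools": [
    {
      "type": "web_search",
      "output_transform": ".results | map({title, url})"
    },
    {
      "type": "web_get",
      "output_transform": "{status_code, content}"
    }
  ]
}
```

Search results carry just enough for the agent to pick a URL; the fetched page keeps its body and status code while the plumbing fields are dropped.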
Other shapes worth knowing:
- `.results[0:5] | map({title, url})` — cap to the first N entries while projecting fields.
- `del(.metadata, .results_count)` — strip noisy fields without restructuring.
- `.results | map(select(.score > 0.5) | {title, url, snippet})` — filter on a field before dropping it.
Iterating on a transform
A wrinkle worth knowing: the original tool response schema is visible in the API and in tool definitions, but the post-transform shape that the LLM actually sees is not yet surfaced anywhere. We plan to expose the effective output schema on the tool configuration so authors can preview exactly what the model receives, but until then expect some trial and error — call the tool, inspect what the agent reasoned over (or what it complained was missing), adjust the jq expression, and try again. Keeping the transform short and field-projecting (rather than restructuring) makes this loop much faster, and keeps you out of jq edge cases like null-vs-missing semantics.
Working with artifact-based tools
Some agent tools work with files uploaded to a session's workspace. Rather than embedding file contents in every request, these tools use artifact references.
Document conversion tool
The document conversion tool extracts content from uploaded files and converts them to markdown format. It accepts an artifact reference as input and creates a new artifact containing the markdown output.
Supported file types include:
- PDF documents (.pdf)
- Microsoft Word (.doc, .docx)
- Microsoft PowerPoint (.ppt, .pptx)
- Images with OCR capability (.jpg, .png)
For example, the tool reads a PDF artifact, converts it to markdown, stores the result as a new artifact, and returns the new artifact reference to the agent.
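A sketch of that round trip, with illustrative field names and artifact IDs (the actual request/response shape is defined by the Tools API docs):

```json
{
  "input": { "artifact_id": "art_source_pdf" },
  "output": {
    "artifact_id": "art_converted_md",
    "content_type": "text/markdown"
  }
}
```

Because only the reference travels through the conversation, the full document content never has to occupy the LLM's context window.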
Structured document indexing tool
The structured document indexing tool adds content from artifacts to corpora. It references pre-converted markdown artifacts instead of requiring inline document content, enabling efficient indexing workflows. When the agent calls this tool, it references the artifact to index.