Version: 2.0

Chats

This guide covers the chat methods of the Vectara Python SDK, which combine Retrieval Augmented Generation (RAG) with stored chat history. Use them to create chats, hold multi-turn conversations, and manage chat history when building interactive applications such as support chatbots or customer service platforms.

Prerequisites

This guide assumes you have a corpus called my-docs with indexed documents. If you haven't created a corpus yet, follow the Quick Start guide to set up your first corpus and add some documents.

Create a chat session

Create a chat session that can maintain conversation context across multiple exchanges. The session handles RAG integration automatically, providing contextual responses based on your corpus content.

The create_chat_session method corresponds to the HTTP POST /v2/chats endpoint. For more details on request and response parameters, see the Create Chat REST API.

Key Parameters:

  • SearchCorporaParameters: Defines which corpora to search and filtering options
  • GenerationParameters: Controls response generation quality and style
  • ChatParameters: Enables conversation history storage for multi-turn interactions
  • store=True: Essential for maintaining context across conversation turns

Returns:

  • chat_id: Unique identifier for the conversation session
  • answer: AI-generated response based on corpus content
  • factual_consistency_score: Score indicating how well the answer is supported by the retrieved corpus content
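The parameter groups above map onto sections of the request body sent to POST /v2/chats. The following is a minimal sketch of that body as a plain dictionary, not the SDK's own implementation. The helper build_create_chat_request is hypothetical, and the field names (search.corpora[].corpus_key, generation.max_used_search_results, chat.store) are assumptions that should be checked against the Create Chat REST API reference.

```python
import json


def build_create_chat_request(query, corpus_key="my-docs",
                              max_used_search_results=25, store=True):
    """Build a JSON-serializable body for POST /v2/chats.

    Hypothetical helper: the field names below are assumptions, not a
    confirmed schema.
    """
    return {
        "query": query,
        # Which corpora to search (SearchCorporaParameters in the SDK).
        "search": {"corpora": [{"corpus_key": corpus_key}]},
        # How the answer is generated (GenerationParameters in the SDK).
        "generation": {"max_used_search_results": max_used_search_results},
        # store=True keeps the history so later turns have context.
        "chat": {"store": store},
    }


body = build_create_chat_request("What does the onboarding guide cover?")
print(json.dumps(body, indent=2))
```

Keeping the body in one helper makes it easy to see that store is enabled before the first turn is sent.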

Multi-turn conversation

In a multi-turn conversation, the AI maintains context across exchanges. Each subsequent message builds on the previous conversation history without requiring explicit context management.

The chat turn method corresponds to the HTTP POST /v2/chats/{chat_id}/turns endpoint. For more details on request and response parameters, see the Create Chat Turn REST API.

Conversation Flow:

  1. Initial Question: Establishes the topic and context
  2. Follow-up Questions: Reference previous answers using pronouns and implicit context
  3. Automatic Context: The session maintains conversation history transparently

Benefits:

  • Natural conversation flow without manual context passing
  • Each response considers the full conversation history
  • Factual consistency can be checked on every turn, not only the first
  • Easy to implement - just call session.chat() for each turn
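The flow above can be sketched as a loop that sends every question through the same session object. This is an illustration under stated assumptions: it assumes session.chat accepts a query keyword and that the response exposes an answer attribute. FakeSession is a hypothetical stand-in so the sketch runs without network access.

```python
from types import SimpleNamespace


def run_conversation(session, questions):
    """Send each question through one session and collect (question, answer)."""
    transcript = []
    for question in questions:
        # Assumption: `query` keyword and `.answer` attribute.
        response = session.chat(query=question)
        transcript.append((question, response.answer))
    return transcript


class FakeSession:
    """Hypothetical offline stand-in for a real chat session."""

    def __init__(self):
        self.turns = 0

    def chat(self, query):
        self.turns += 1
        return SimpleNamespace(answer=f"answer {self.turns} to: {query}")


transcript = run_conversation(
    FakeSession(),
    ["What is in the onboarding guide?", "How long does it take?"],
)
for question, answer in transcript:
    print(f"Q: {question}\nA: {answer}")
```

With a real session returned by create_chat_session, pass that object in place of FakeSession. The second question relies on the session to resolve "it" from the stored history.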

List chat conversations

Retrieve the list of chat conversations for monitoring, analytics, or display in a user interface. This is useful for building chat interfaces that show conversation lists.

The chats.list method corresponds to the HTTP GET /v2/chats endpoint. For more details on request and response parameters, see the List Chats REST API.

Chat Metadata Includes:

  • id: Unique chat identifier
  • first_query: Opening message of the conversation
  • created_at: Timestamp of chat creation
  • enabled: Whether the chat is active
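A conversation list for a user interface can be built from the metadata fields above. The sketch below assumes each item returned by chats.list exposes id, first_query, created_at, and enabled as attributes. The helper summarize_chats and the sample records are hypothetical.

```python
from types import SimpleNamespace


def summarize_chats(chats, active_only=False):
    """Return one display line per chat, optionally skipping disabled chats."""
    rows = []
    for chat in chats:
        if active_only and not chat.enabled:
            continue
        rows.append(f"{chat.id} | {chat.created_at} | {chat.first_query}")
    return rows


# Hypothetical sample data standing in for the result of chats.list.
sample_chats = [
    SimpleNamespace(id="chat-1", first_query="How do I reset my password?",
                    created_at="2024-05-01T09:00:00Z", enabled=True),
    SimpleNamespace(id="chat-2", first_query="Where is my invoice?",
                    created_at="2024-05-02T14:30:00Z", enabled=False),
]

for row in summarize_chats(sample_chats, active_only=True):
    print(row)
```

Because the helper accepts any iterable, the result of chats.list can be passed in directly if it is iterable, which is an assumption here.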

Streaming chat responses

Stream chat responses in real time so that users see text as it is generated. This suits responsive chat interfaces, especially when answers are long.

The chat stream method corresponds to the HTTP POST /v2/chats/{chat_id}/turns/stream endpoint.

Streaming Benefits:

  • Immediate feedback as the response generates
  • Better perceived performance for longer responses
  • Natural conversation feel in interactive applications
  • Can be stopped early if needed
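Consuming a stream, including stopping early, might look like the sketch below. It assumes each streamed item has a type attribute and that text-bearing items have type "generation_chunk" with the text in a generation_chunk attribute. Both names are assumptions to verify against the SDK. The helper collect_stream and the sample chunks are hypothetical.

```python
from types import SimpleNamespace


def collect_stream(chunks, max_chars=None, on_text=None):
    """Accumulate streamed text, optionally stopping once max_chars is reached."""
    parts = []
    total = 0
    for chunk in chunks:
        # Skip non-text events such as search results or end markers.
        if getattr(chunk, "type", None) != "generation_chunk":
            continue
        text = chunk.generation_chunk
        if on_text is not None:
            on_text(text)  # e.g. push the fragment to the UI immediately
        parts.append(text)
        total += len(text)
        if max_chars is not None and total >= max_chars:
            break  # stop consuming the stream early
    return "".join(parts)


# Hypothetical chunks standing in for a streamed chat response.
sample_stream = [
    SimpleNamespace(type="search_results"),
    SimpleNamespace(type="generation_chunk", generation_chunk="Vectara "),
    SimpleNamespace(type="generation_chunk", generation_chunk="streams "),
    SimpleNamespace(type="generation_chunk", generation_chunk="text."),
    SimpleNamespace(type="end"),
]

full_text = collect_stream(sample_stream, on_text=lambda t: print(t, end=""))
print()
```

The on_text callback gives immediate feedback, while the returned string is the full answer for storage or logging.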

Chat history management

Retrieve the complete conversation history of a specific chat session for audit, analysis, or display.

The chats.get method corresponds to the HTTP GET /v2/chats/{chat_id} endpoint. For more details on request and response parameters, see the Get Chat REST API.

The chats.turns.list method corresponds to the HTTP GET /v2/chats/{chat_id}/turns endpoint. For more details on request and response parameters, see the List Chat Turns REST API.

History Components:

  • Chat Metadata: Overall conversation information
  • Turns: Individual message exchanges between user and assistant
  • Turn Details: Each turn includes query, answer, and timestamp
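The history components above can be combined into a readable transcript. This sketch assumes the chat object exposes id and created_at, and that each turn exposes query, answer, and a timestamp named created_at. The timestamp attribute name is an assumption. The helper format_history and the sample objects are hypothetical.

```python
from types import SimpleNamespace


def format_history(chat, turns):
    """Render chat metadata and its turns as a plain-text transcript."""
    lines = [f"Chat {chat.id} (started {chat.created_at})"]
    for number, turn in enumerate(turns, start=1):
        lines.append(f"[{number}] {turn.created_at}")
        lines.append(f"  user: {turn.query}")
        lines.append(f"  assistant: {turn.answer}")
    return "\n".join(lines)


# Hypothetical objects standing in for chats.get and chats.turns.list results.
sample_chat = SimpleNamespace(id="chat-1", created_at="2024-05-01T09:00:00Z")
sample_turns = [
    SimpleNamespace(query="What plans exist?", answer="Basic and Pro.",
                    created_at="2024-05-01T09:00:05Z"),
    SimpleNamespace(query="Which is cheaper?", answer="Basic.",
                    created_at="2024-05-01T09:01:10Z"),
]

history_text = format_history(sample_chat, sample_turns)
print(history_text)
```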

Best Practices

  • Monitor factual consistency scores for quality control
  • Use appropriate max_used_search_results (25-50 for most cases)
  • Enable chat storage (store=True) for multi-turn conversations
  • Implement session management for user conversations
  • Consider streaming for better user experience
  • Limit conversations to 50-100 turns to maintain context quality
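Two of these practices, monitoring scores and capping conversation length, can be enforced with a small check after each turn. In the sketch below, the 0.5 score threshold is an arbitrary illustrative value rather than a Vectara recommendation, the turn cap uses the lower bound of the range above, and review_turn is a hypothetical helper.

```python
MIN_SCORE = 0.5  # illustrative threshold only, tune for your application
MAX_TURNS = 50   # lower bound of the 50-100 turn guideline


def review_turn(turn_number, score, min_score=MIN_SCORE, max_turns=MAX_TURNS):
    """Return a list of warnings for a completed turn (empty if none)."""
    warnings = []
    # The score may be absent, so only compare when a value is present.
    if score is not None and score < min_score:
        warnings.append("low factual consistency score")
    if turn_number >= max_turns:
        warnings.append("turn limit reached; consider starting a new chat")
    return warnings


print(review_turn(3, 0.92))
print(review_turn(50, 0.31))
```

Returning warnings instead of raising lets the application decide whether to log, flag the answer for review, or start a new chat.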

Error Handling

  • 400 Bad Request: Check query parameters and corpus configuration
  • 403 Forbidden: Verify API key has chat permissions
  • 404 Not Found: Ensure corpus exists and is accessible
  • Rate Limiting: Implement retry logic with exponential backoff
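Retry logic with exponential backoff can be sketched as a wrapper around any SDK call. The wrapper below assumes that rate-limit errors surface as exceptions carrying a status_code attribute equal to 429. That attribute name is an assumption about the SDK's error type. The helper call_with_backoff is hypothetical.

```python
import random
import time

RETRYABLE_STATUS = {429}  # assumption: rate limiting is reported as HTTP 429


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying rate-limited failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            last_attempt = attempt == max_attempts - 1
            # Errors such as 400, 403, and 404 will not succeed on retry.
            if status not in RETRYABLE_STATUS or last_attempt:
                raise
            # Double the wait each attempt and add jitter to spread retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

A call would then be wrapped as call_with_backoff(lambda: session.chat(query="...")), where session is an existing chat session. The sleep parameter exists so that tests can substitute a recorder for time.sleep.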

Next steps

After understanding chat functionality:

  • Integration: Combine with document indexing for dynamic knowledge bases
  • Customization: Experiment with different generation presets and prompts
  • Analytics: Track conversation patterns and user satisfaction
  • Scaling: Implement session management for multiple concurrent users