Version: 2.0

Use OpenAI SDK with the Vectara Chat Completions API

This tutorial demonstrates how to use Vectara's Chat Completions API through OpenAI-compatible interfaces. You will learn how to integrate Vectara's generative AI capabilities into your applications using either direct HTTP requests or the OpenAI Python SDK, which enables seamless migration from OpenAI and integration with OpenAI-compatible tools.

This tutorial contains the following steps:

  * Prerequisites and setup
  * Step 1. Install the required packages
  * Step 2. Implement the VectaraChat client
  * Step 3. Enter your API key
  * Step 4. Initialize the Vectara chat client
  * Step 5. Perform tests
Note

We recommend that you complete this tutorial in Google Colab.

Prerequisites and setup

Step 1. Install the required packages

Install the required Python packages. The requests library handles direct HTTP calls, while openai provides the official OpenAI SDK for simplified integration.

INSTALL REQUIRED PACKAGES
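A minimal install command might look like the following (package names are the standard PyPI names for the two libraries the tutorial uses):

```shell
pip install requests openai
```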

Step 2. Implement the VectaraChat client

The following code contains the implementation of the VectaraChat client, which provides methods for interacting with Vectara's Chat Completions API.

VECTARACHAT CLIENT IMPLEMENTATION
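A minimal sketch of such a client is shown below. The endpoint URL, the default model name, and the exact method names are assumptions; adapt them to your Vectara account and the actual API reference.

```python
import json
import requests

# Assumed base URL for Vectara's OpenAI-compatible API; check your
# Vectara console or the API reference for the exact endpoint.
VECTARA_BASE_URL = "https://api.vectara.io/v1"


class VectaraChat:
    """Minimal client for Vectara's Chat Completions endpoint.

    auth_mode: "bearer" sends an Authorization: Bearer header (recommended);
    "x-api-key" sends the key in an x-api-key header instead.
    """

    def __init__(self, api_key, auth_mode="bearer", verbose=False):
        self.api_key = api_key
        self.auth_mode = auth_mode
        self.verbose = verbose
        self.url = f"{VECTARA_BASE_URL}/chat/completions"

    def _headers(self):
        headers = {"Content-Type": "application/json"}
        if self.auth_mode == "bearer":
            headers["Authorization"] = f"Bearer {self.api_key}"
        else:
            headers["x-api-key"] = self.api_key
        return headers

    def chat(self, messages, model="gpt-4o", stream=False, **params):
        """Send a chat completion request; returns parsed JSON, or the raw
        response object when streaming so the caller can iterate over it."""
        payload = {"model": model, "messages": messages, "stream": stream, **params}
        if self.verbose:
            print("POST", self.url)
            print(json.dumps(payload, indent=2))
        response = requests.post(self.url, headers=self._headers(),
                                 json=payload, stream=stream, timeout=60)
        response.raise_for_status()
        if stream:
            return response  # caller iterates over the SSE lines
        return response.json()
```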
Tip

Enable verbose=True during development to see detailed request/response logging for debugging.

Step 3. Enter your API key

API KEY CONFIGURATION
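One common pattern is to read the key from an environment variable so it is never hard-coded in the notebook (the variable name `VECTARA_API_KEY` is an assumption; in Colab you could instead prompt with `getpass`):

```python
import os

# Set VECTARA_API_KEY beforehand, e.g. `export VECTARA_API_KEY=...`
api_key = os.environ.get("VECTARA_API_KEY", "")
if not api_key:
    print("Warning: VECTARA_API_KEY is not set")
```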

Step 4. Initialize the Vectara chat client

Create the VectaraChat instance and choose between Bearer token authentication (recommended) and x-api-key header authentication.

INITIALIZE CLIENT
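The two authentication options differ only in which header carries the key. A sketch of both, assuming the endpoint accepts either header:

```python
api_key = "your-vectara-api-key"  # placeholder

# Option 1: Bearer token authentication (recommended)
bearer_headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# Option 2: x-api-key header authentication
x_api_key_headers = {
    "x-api-key": api_key,
    "Content-Type": "application/json",
}
```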

Step 5. Perform tests

Now that you've set up the VectaraChat client and initialized it with your API key, let's test both implementation approaches. The following tests demonstrate four different scenarios: direct HTTP requests (streaming and non-streaming) and OpenAI SDK integration (streaming and non-streaming). Each test shows you how to make requests and handle responses in different ways.

Test 1: Direct API (non-streaming)

Let's test the direct API approach without streaming:

Direct HTTP Request

NON-STREAMING DIRECT API CALL
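A sketch of a non-streaming request with `requests` might look like this. The endpoint URL and model name are assumptions; the response is parsed OpenAI-style, with the reply text in the first choice:

```python
import requests

# Assumed endpoint; check the Vectara API reference for the exact URL.
VECTARA_CHAT_URL = "https://api.vectara.io/v1/chat/completions"

def ask(api_key, question, model="gpt-4o"):
    """Send a single non-streaming chat completion request."""
    response = requests.post(
        VECTARA_CHAT_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": question}],
            "stream": False,
        },
        timeout=60,
    )
    response.raise_for_status()
    # OpenAI-compatible responses put the reply in the first choice.
    return response.json()["choices"][0]["message"]["content"]
```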
NON-STREAMING RESPONSE EXAMPLE

OpenAI SDK Request

NON-STREAMING WITH OPENAI SDK

Test 2: Direct API (streaming)

Now let's test with streaming enabled:

STREAMING DIRECT API CALL
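With `stream=True`, the response arrives as server-sent events: each `data:` line carries a JSON chunk with a partial delta. A sketch of parsing that stream (endpoint URL and model name are assumptions):

```python
import json
import requests

VECTARA_CHAT_URL = "https://api.vectara.io/v1/chat/completions"  # assumed endpoint

def stream_answer(api_key, question, model="gpt-4o"):
    """Yield reply fragments as they arrive over server-sent events."""
    response = requests.post(
        VECTARA_CHAT_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": question}],
            "stream": True,
        },
        stream=True,
        timeout=60,
    )
    response.raise_for_status()
    for line in response.iter_lines():
        # SSE payload lines look like: data: {...json chunk...}
        if not line or not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta
```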
STREAMING OUTPUT EXAMPLE

Test 3: OpenAI SDK (non-streaming)

Now let's test using the OpenAI SDK without streaming:

OPENAI SDK NON-STREAMING CALL
OPENAI SDK OUTPUT EXAMPLE

Test 4: OpenAI SDK (streaming)

Finally, let's test the OpenAI SDK with streaming:

OPENAI SDK STREAMING CALL
STREAMING OUTPUT EXAMPLE

Advanced usage examples

Beyond the basic tests, explore these advanced usage patterns to build production-ready applications:

Multi-turn conversations

The previous tests showed single-question interactions. Real conversational applications need to maintain context across multiple exchanges. The Chat Completions API supports multi-turn conversations by including the conversation history in each request. Here's how to build a contextual conversation:

MULTI-TURN CONVERSATION EXAMPLE
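The mechanics are plain list management: append each user message and each assistant reply to a shared history, and send the whole list with every request. A sketch (the system prompt and sample replies are illustrative):

```python
# Start the history with an optional system prompt.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def add_turn(history, user_text, reply_text):
    """Record one exchange so the next request carries full context."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": reply_text})
    return history

# Each request would send the whole history, e.g.:
#   client.chat.completions.create(model="gpt-4o", messages=history)
add_turn(history, "What is RAG?", "Retrieval-augmented generation ...")
add_turn(history, "How does Vectara use it?", "Vectara retrieves relevant ...")
```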
MULTI-TURN CONVERSATION OUTPUT

Use different models

Vectara supports a variety of LLMs. Let's try a different model:

USING DIFFERENT MODELS
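Switching models is just a matter of changing the `model` field in the request. The names below are placeholders; check Vectara's documentation for the models available to your account:

```python
# Model names here are illustrative assumptions.
for model in ["gpt-4o", "gpt-4o-mini"]:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Summarize RAG in one sentence."}],
    }
    # The request itself would be, e.g.:
    # requests.post(VECTARA_CHAT_URL, headers=..., json=payload)
```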
DIFFERENT MODEL OUTPUT

Customize generation parameters

You can customize generation parameters to control the output:

CUSTOMIZING PARAMETERS
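Generation parameters follow the OpenAI request schema; for example, `temperature` and `max_tokens` can be added alongside `model` and `messages` (whether Vectara honors every OpenAI parameter is an assumption to verify against its API reference):

```python
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a haiku about search."}],
    "temperature": 0.2,   # lower values make output more deterministic
    "max_tokens": 100,    # cap the length of the reply
}
```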
CUSTOMIZED OUTPUT

This tutorial demonstrated how to use the Vectara Chat Completions API, both directly and with the OpenAI SDK. You can use this API to add powerful generative AI capabilities to your applications with OpenAI-compatible interfaces.

For integration examples with external applications, see Use with External Applications.