Vertex AI

llmock supports Google Vertex AI endpoints, which use the same wire format as the consumer Gemini API but a different URL routing pattern. Internally, Vertex AI requests are handled by the same Gemini handler.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | `/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent` | Non-streaming content generation |
| POST | `/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:streamGenerateContent` | Streaming content generation (SSE) |

URL Pattern Difference

The key difference between consumer Gemini and Vertex AI is the URL routing. Consumer Gemini uses:

/v1beta/models/{model}:generateContent

While Vertex AI uses the fully qualified GCP resource path:

/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent

llmock matches Vertex AI paths using this regex pattern:

Vertex AI route matching:

```typescript
const VERTEX_AI_RE =
  /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;
```
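For illustration, the two capture groups pull the model name and method out of a matching path (the sample project and location values here are arbitrary, as the next section explains):

```typescript
// The route regex from above, exercised against a sample Vertex AI path.
const VERTEX_AI_RE =
  /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;

const m = VERTEX_AI_RE.exec(
  "/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent",
);
console.log(m?.[1]); // "gemini-pro"            (model name, group 1)
console.log(m?.[2]); // "streamGenerateContent" (method, group 2)
```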

Quick Start

vertex-ai-quick-start.ts:

```typescript
import { LLMock } from "@copilotkit/llmock";

const mock = new LLMock();
mock.onMessage("hello", { content: "Hi from Vertex AI!" });
await mock.start();

// Raw fetch against the Vertex AI route (no SDK required)
const res = await fetch(
  `${mock.url}/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-pro:generateContent`,
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [
        { role: "user", parts: [{ text: "hello" }] },
      ],
    }),
  },
);
```

Same Wire Format as Gemini

Vertex AI uses the exact same request and response wire format as the consumer Gemini API. The request body uses contents with parts, and responses use candidates with content.parts. See the Gemini documentation for full details on the wire format, streaming events, and fixture examples.

Internally, both consumer Gemini and Vertex AI routes are handled by the same handleGemini() function. The only difference is the provider key used for recording and metrics: consumer Gemini uses "gemini" while Vertex AI uses "vertexai".
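That dispatch can be pictured as the following sketch. This is not llmock source: the `route()` helper is hypothetical, and the assumption that the consumer route accepts the same two method suffixes is inferred from the Gemini documentation referenced above.

```typescript
// Sketch: both route families resolve to the same handler; only the
// provider key used for recording and metrics differs.
type Provider = "gemini" | "vertexai";

const GEMINI_RE =
  /^\/v1beta\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;
const VERTEX_AI_RE =
  /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;

function route(
  path: string,
): { provider: Provider; model: string; method: string } | null {
  let m = GEMINI_RE.exec(path);
  if (m) return { provider: "gemini", model: m[1], method: m[2] };
  m = VERTEX_AI_RE.exec(path);
  if (m) return { provider: "vertexai", model: m[1], method: m[2] };
  return null; // neither family matched
}
```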

SDK Configuration

To use llmock with the Vertex AI SDK, point the SDK's API endpoint at your llmock instance. The project, location, and model segments in the URL are matched structurally, but their values are arbitrary — llmock extracts only the model name for fixture matching.

Vertex AI SDK setup:

```typescript
import { VertexAI } from "@google-cloud/vertexai";

const vertexAI = new VertexAI({
  project: "my-project",
  location: "us-central1",
  apiEndpoint: "localhost:PORT", // llmock URL
});

const model = vertexAI.getGenerativeModel({
  model: "gemini-pro",
});
```

Fixture Examples

Fixtures for Vertex AI are identical to Gemini fixtures. The same match/response format works for both:

vertex-ai-fixtures.json:

```json
{
  "fixtures": [
    {
      "match": { "userMessage": "hello" },
      "response": { "content": "Hi from Vertex AI!" }
    },
    {
      "match": { "userMessage": "analyze" },
      "response": {
        "toolCalls": [
          {
            "name": "analyze_data",
            "arguments": "{\"dataset\":\"sales_q4\"}"
          }
        ]
      }
    }
  ]
}
```
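Conceptually, fixture selection pairs an incoming user message with the first fixture whose `match` applies. The sketch below illustrates this with exact-string matching — an assumption for illustration, not a statement of llmock's actual matching semantics — and hypothetical `Fixture`/`selectFixture` names:

```typescript
// Sketch only: exact-string matching on userMessage is an assumption here.
interface Fixture {
  match: { userMessage: string };
  response: {
    content?: string;
    toolCalls?: { name: string; arguments: string }[];
  };
}

const fixtures: Fixture[] = [
  { match: { userMessage: "hello" }, response: { content: "Hi from Vertex AI!" } },
  {
    match: { userMessage: "analyze" },
    response: {
      toolCalls: [{ name: "analyze_data", arguments: '{"dataset":"sales_q4"}' }],
    },
  },
];

function selectFixture(all: Fixture[], userMessage: string): Fixture | undefined {
  return all.find((f) => f.match.userMessage === userMessage);
}

console.log(selectFixture(fixtures, "hello")?.response.content); // "Hi from Vertex AI!"
```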

Metrics Path Normalization

Vertex AI paths are normalized for Prometheus metric labels. The dynamic segments (project, location, model) are replaced with placeholders:

/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:generateContent
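A minimal sketch of that normalization, reusing the route regex's structure (the `normalizeVertexPath` name is hypothetical):

```typescript
// Collapse the dynamic project/location/model segments into the
// {p}/{l}/{m} placeholders used for Prometheus metric labels.
function normalizeVertexPath(path: string): string {
  return path.replace(
    /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/[^/:]+:/,
    "/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:",
  );
}

console.log(
  normalizeVertexPath(
    "/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-pro:generateContent",
  ),
);
// "/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:generateContent"
```

This keeps metric cardinality bounded: every project/location/model combination maps to a single label value per method.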