# Vertex AI
llmock supports Google Vertex AI endpoints using the same Gemini wire format behind a different URL routing pattern; internally, Vertex AI requests are handled by the same Gemini handler.
## Endpoints
| Method | Path | Description |
|---|---|---|
| POST | `/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent` | Non-streaming content generation |
| POST | `/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:streamGenerateContent` | Streaming content generation (SSE) |
## URL Pattern Difference
The key difference between consumer Gemini and Vertex AI is the URL routing. Consumer Gemini uses:
```
/v1beta/models/{model}:generateContent
```
While Vertex AI uses the fully qualified GCP resource path:
```
/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent
```
llmock matches Vertex AI paths using this regex pattern:
```typescript
const VERTEX_AI_RE =
  /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;
```
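A quick illustration of what this regex captures: the first group is the model name and the second is the action. The path below is an example value, not one taken from llmock's test suite.

```typescript
// The route regex from above; group 1 captures the model, group 2 the action.
const VERTEX_AI_RE =
  /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;

const path =
  "/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent";
const match = path.match(VERTEX_AI_RE);
if (match) {
  const [, model, action] = match;
  console.log(model); // "gemini-pro"
  console.log(action); // "streamGenerateContent"
}
```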
## Quick Start
```typescript
import { LLMock } from "@copilotkit/llmock";

const mock = new LLMock();
mock.onMessage("hello", { content: "Hi from Vertex AI!" });
await mock.start();

// Call the Vertex AI-style endpoint directly with fetch
const res = await fetch(
  `${mock.url}/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-pro:generateContent`,
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{ role: "user", parts: [{ text: "hello" }] }],
    }),
  },
);
```
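Since the reply comes back in the Gemini wire format, the text lives at `candidates[0].content.parts[0].text`. A minimal helper for pulling it out might look like this (`firstText` and the interface are illustrative, not part of llmock):

```typescript
// Minimal shape of a Gemini-format generateContent response.
interface GeminiResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Return the first text part of the first candidate, if any.
function firstText(body: GeminiResponse): string | undefined {
  return body.candidates?.[0]?.content?.parts?.[0]?.text;
}

// Example with a response shaped like the one the mock returns:
const sample: GeminiResponse = {
  candidates: [{ content: { parts: [{ text: "Hi from Vertex AI!" }] } }],
};
console.log(firstText(sample)); // "Hi from Vertex AI!"
```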
## Same Wire Format as Gemini
Vertex AI uses the exact same request and response wire format as the consumer Gemini API. The request body uses `contents` with `parts`, and responses use `candidates` with `content.parts`. See the Gemini documentation for full details on the wire format, streaming events, and fixture examples.

Internally, both consumer Gemini and Vertex AI routes are handled by the same `handleGemini()` function. The only difference is the provider key used for recording and metrics: consumer Gemini uses `"gemini"` while Vertex AI uses `"vertexai"`.
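As an illustrative sketch (not llmock's actual internals), the shared-handler dispatch amounts to deriving a provider label from the path prefix before delegating:

```typescript
// Hypothetical sketch of the route-to-provider mapping. The prefixes
// mirror the two URL patterns described above.
type Provider = "gemini" | "vertexai";

function providerFor(path: string): Provider | null {
  if (path.startsWith("/v1beta/models/")) return "gemini";
  if (path.startsWith("/v1/projects/")) return "vertexai";
  return null;
}

console.log(providerFor("/v1beta/models/gemini-pro:generateContent")); // "gemini"
```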
## SDK Configuration
To use llmock with the Vertex AI SDK, point the SDK's API endpoint at your llmock instance. The project, location, and model segments must be present in the URL but may be any value; llmock extracts only the model name for fixture matching.
```typescript
import { VertexAI } from "@google-cloud/vertexai";

const vertexAI = new VertexAI({
  project: "my-project",
  location: "us-central1",
  apiEndpoint: "localhost:PORT", // llmock URL
});

const model = vertexAI.getGenerativeModel({
  model: "gemini-pro",
});
```
## Fixture Examples
Fixtures for Vertex AI are identical to Gemini fixtures. The same `match`/`response` format works for both:
```json
{
  "fixtures": [
    {
      "match": { "userMessage": "hello" },
      "response": { "content": "Hi from Vertex AI!" }
    },
    {
      "match": { "userMessage": "analyze" },
      "response": {
        "toolCalls": [
          {
            "name": "analyze_data",
            "arguments": "{\"dataset\":\"sales_q4\"}"
          }
        ]
      }
    }
  ]
}
```
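In the Gemini wire format, a tool call is rendered as a `functionCall` part with `args` as a decoded object rather than a JSON string. The transform below is a hedged sketch of that mapping, not llmock's actual implementation:

```typescript
// Shape of a toolCalls entry as written in the fixture file above.
interface FixtureToolCall {
  name: string;
  arguments: string; // JSON-encoded argument object
}

// Map a fixture tool call to a Gemini-format functionCall part,
// decoding the argument string into an object.
function toFunctionCallPart(call: FixtureToolCall) {
  return {
    functionCall: { name: call.name, args: JSON.parse(call.arguments) },
  };
}

const part = toFunctionCallPart({
  name: "analyze_data",
  arguments: '{"dataset":"sales_q4"}',
});
console.log(part.functionCall.args.dataset); // "sales_q4"
```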
## Metrics Path Normalization
Vertex AI paths are normalized for Prometheus metric labels. The dynamic segments (project, location, model) are replaced with placeholders:
```
/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:generateContent
```
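This keeps Prometheus label cardinality bounded: every project, location, and model collapses into one label per action. A minimal sketch of such a normalizer (illustrative, not llmock's code):

```typescript
// Replace the dynamic project, location, and model segments with fixed
// placeholders so each route yields a single metric label.
function normalizeVertexPath(path: string): string {
  return path.replace(
    /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/[^/:]+:/,
    "/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:",
  );
}

console.log(
  normalizeVertexPath(
    "/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-pro:generateContent",
  ),
);
// "/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:generateContent"
```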