# llmock Documentation

llmock is a deterministic mock LLM server for testing. It runs a real HTTP server that any process on the machine can reach, serving fixture-driven responses in the authentic SSE format for OpenAI, Anthropic Claude, and Google Gemini APIs.

## Quick Start

### Install

```shell
# npm
npm install @copilotkit/llmock

# pnpm
pnpm add @copilotkit/llmock
```

### Programmatic usage (vitest)

```ts
import { LLMock } from "@copilotkit/llmock";
import { it, expect, beforeAll, afterAll } from "vitest";

let mock: LLMock;

beforeAll(async () => {
  mock = new LLMock();
  await mock.start();
});

afterAll(async () => {
  await mock.stop();
});

it("returns a text response", async () => {
  mock.on({ userMessage: "hello" }, { content: "Hi there!" });

  const res = await fetch(`${mock.url}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [{ role: "user", content: "hello" }],
      stream: false,
    }),
  });
  const body = await res.json();
  expect(body.choices[0].message.content).toBe("Hi there!");
});
```

### CLI usage

```shell
# Start the server with fixture files
npx llmock --fixtures ./fixtures --port 5555

# Point your app at it
export OPENAI_BASE_URL=http://localhost:5555/v1
export OPENAI_API_KEY=mock-key
```

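The CLI loads fixture files from the directory passed to `--fixtures`. The exact file schema is not documented in this section; assuming each entry mirrors the `match`/`response` arguments of `on()` from the API reference, a hypothetical `./fixtures/greeting.json` might look like the sketch below (field names are illustrative, not authoritative):

```json
{
  "fixtures": [
    {
      "match": { "userMessage": "hello" },
      "response": { "content": "Hi there!" }
    }
  ]
}
```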
## Supported Endpoints

| Endpoint | Provider | Transport |
| --- | --- | --- |
| `POST /v1/chat/completions` | OpenAI | HTTP SSE / JSON |
| `POST /v1/responses` | OpenAI | HTTP SSE |
| `WS /v1/responses` | OpenAI | WebSocket |
| `WS /v1/realtime` | OpenAI | WebSocket |
| `POST /v1/messages` | Anthropic | HTTP SSE / JSON |
| `POST /v1beta/models/:model:*` | Google Gemini | HTTP SSE / JSON |
| `WS /ws/google.ai.generativelanguage.*` | Google Gemini Live | WebSocket |
| `POST /v1/embeddings` | OpenAI | JSON |
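For the streaming endpoints, llmock emits the same wire format as the real providers. As a reference point, the public OpenAI chat-completions stream delivers `data:`-prefixed JSON chunks and ends with `data: [DONE]`; a minimal sketch of reassembling the text from such a stream (independent of llmock itself):

```typescript
// Collect the assistant text from an OpenAI-style SSE body.
// Each event line is `data: <json chunk>`; the stream ends with `data: [DONE]`.
function collectSseText(raw: string): string {
  let text = "";
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}

// Example stream body in the standard chunk shape.
const stream = [
  'data: {"choices":[{"index":0,"delta":{"content":"Hi "}}]}',
  "",
  'data: {"choices":[{"index":0,"delta":{"content":"there!"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(collectSseText(stream)); // "Hi there!"
```

In a test, the same helper can be pointed at the body returned when `stream: true` is set on the request.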

## API Reference

### `LLMock` class

| Method | Description |
| --- | --- |
| `new LLMock(opts?)` | Create an instance. Options: `port`, `host`, `latency`, `chunkSize`, `logLevel`. |
| `start()` | Start the HTTP server. Returns the base URL. |
| `stop()` | Stop the server. |
| `on(match, response, opts?)` | Add a fixture with match criteria and a response. |
| `onMessage(pattern, response)` | Shorthand: match on `userMessage`. |
| `onToolCall(name, response)` | Shorthand: match on `toolName`. |
| `onEmbedding(pattern, response)` | Shorthand: match on `inputText` (embeddings). |
| `onJsonOutput(pattern, json)` | Shorthand: match `userMessage` + `responseFormat=json_object`. |
| `onToolResult(id, response)` | Shorthand: match on `toolCallId`. |
| `nextRequestError(status, body?)` | Queue a one-shot error for the next request. |
| `addFixture(fixture)` | Add a raw `Fixture` object. |
| `loadFixtureFile(path)` | Load fixtures from a JSON file. |
| `loadFixtureDir(path)` | Load all fixture JSON files from a directory. |
| `reset()` | Clear all fixtures and journal entries. |
| `getRequests()` | Get all journal entries. |
| `getLastRequest()` | Get the most recent journal entry. |
| `.url` / `.port` | Access the server URL and port. |
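To make the `on(match, response)` contract concrete, here is a self-contained sketch of first-match fixture selection. The actual matching rules llmock applies (ordering, pattern types, fallbacks) are not specified above, so treat this as an illustrative assumption, not the library's implementation:

```typescript
// Hypothetical fixture shape and first-match selection, mirroring the
// { userMessage } match criteria shown in the Quick Start example.
interface Fixture {
  match: { userMessage?: string };
  response: { content: string };
}

function selectFixture(fixtures: Fixture[], userMessage: string): Fixture | undefined {
  // A fixture with no userMessage criterion acts as a catch-all.
  return fixtures.find(
    (f) => f.match.userMessage === undefined || f.match.userMessage === userMessage
  );
}

const fixtures: Fixture[] = [
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } },
  { match: {}, response: { content: "fallback" } },
];

console.log(selectFixture(fixtures, "hello")?.response.content);    // "Hi there!"
console.log(selectFixture(fixtures, "anything")?.response.content); // "fallback"
```

Registering a specific fixture before a catch-all, as above, is the usual pattern for first-match registries.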