# Record & Replay

VCR-style record-and-replay support. When a request doesn't match any fixture, llmock proxies it to the real upstream provider, records the response as a fixture on disk and in memory, then replays it on subsequent identical requests.

## How It Works

  1. Client sends a request to llmock
  2. llmock attempts fixture matching as usual
  3. On miss: the request is forwarded to the configured upstream provider
  4. The upstream response is relayed back to the client immediately
  5. The response is collapsed (if streaming) and saved as a fixture to disk and memory
  6. Subsequent identical requests match the newly recorded fixture
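The steps above can be sketched as a single handler. This is an illustrative reconstruction, not llmock's actual internals: `Fixture`, `handleRequest`, and `proxyUpstream` are hypothetical names, matching is reduced to exact user-message equality, and the disk write in step 5 is omitted.

```typescript
// Sketch of the proxy-on-miss flow (hypothetical internals, simplified matching).
type Fixture = { match: { userMessage: string }; response: { content: string } };

async function handleRequest(
  userMessage: string,
  fixtures: Fixture[],
  proxyUpstream: (msg: string) => Promise<string>,
): Promise<string> {
  // Step 2: attempt fixture matching as usual
  const hit = fixtures.find((f) => f.match.userMessage === userMessage);
  if (hit) return hit.response.content;

  // Steps 3-4: on miss, forward to the upstream and relay its response
  const content = await proxyUpstream(userMessage);

  // Step 5: record the response as an in-memory fixture (disk write omitted)
  fixtures.push({ match: { userMessage }, response: { content } });

  // Step 6: subsequent identical requests now match the recorded fixture
  return content;
}
```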

## Quick Start

CLI usage:

```bash
npx llmock --fixtures ./fixtures \
  --record \
  --provider-openai https://api.openai.com \
  --provider-anthropic https://api.anthropic.com
```

## CLI Flags

| Flag | Description |
| --- | --- |
| `--record` | Enable record mode (proxy-on-miss) |
| `--strict` | Strict mode: return 503 (not 404) on unmatched requests |
| `--provider-openai <url>` | Upstream URL for OpenAI |
| `--provider-anthropic <url>` | Upstream URL for Anthropic |
| `--provider-gemini <url>` | Upstream URL for Gemini |
| `--provider-vertexai <url>` | Upstream URL for Vertex AI |
| `--provider-bedrock <url>` | Upstream URL for Bedrock |
| `--provider-azure <url>` | Upstream URL for Azure OpenAI |
| `--provider-ollama <url>` | Upstream URL for Ollama |
| `--provider-cohere <url>` | Upstream URL for Cohere |

## Programmatic API

Programmatic recording:

```ts
import { LLMock } from "@copilotkit/llmock";

const mock = new LLMock();
await mock.start();

// Enable recording with upstream providers
mock.enableRecording({
  providers: {
    openai: "https://api.openai.com",
    anthropic: "https://api.anthropic.com",
  },
  fixturePath: "./fixtures/recorded",
});

// Make requests — unmatched ones are proxied and recorded
// ...

// Disable recording — recorded fixtures persist on disk
mock.disableRecording();
```

## Stream Collapsing

When the upstream provider returns a streaming response, llmock collapses it into a non-streaming fixture. Six streaming formats are supported:

| Format | Provider | Content-Type |
| --- | --- | --- |
| OpenAI SSE | OpenAI, Azure | `text/event-stream` |
| Anthropic SSE | Anthropic | `text/event-stream` |
| Gemini SSE | Gemini, Vertex AI | `text/event-stream` |
| Cohere SSE | Cohere | `text/event-stream` |
| Ollama NDJSON | Ollama | `application/x-ndjson` |
| Bedrock EventStream | AWS Bedrock | `application/vnd.amazon.eventstream` |

The collapse extracts text content and tool calls from the streaming chunks and produces a simple `{ content }` or `{ toolCalls }` fixture response.
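As an illustration of what collapsing means for one of the six formats, the sketch below folds an OpenAI-style SSE body into a `{ content }` object. It is a simplified reconstruction (real chunks carry many more fields, and tool calls are ignored here), and `collapseOpenAISSE` is a name chosen for this example:

```typescript
// Illustrative collapse of an OpenAI-style SSE stream into a non-streaming
// fixture response. Simplified: only text deltas are extracted.
function collapseOpenAISSE(sseBody: string): { content: string } {
  let content = "";
  for (const line of sseBody.split("\n")) {
    // SSE data lines look like `data: {...}`; the stream ends with `data: [DONE]`
    if (!line.startsWith("data: ") || line === "data: [DONE]") continue;
    const chunk = JSON.parse(line.slice("data: ".length));
    content += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return { content };
}
```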

## Auth Header Forwarding

When proxying to upstream providers, llmock forwards the auth headers from the original request to the upstream.

Auth headers are never saved in recorded fixtures. The fixture only contains the match criteria (derived from the last user message) and the response content.
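One way to picture the "never saved" guarantee is a sanitization step before persisting. This is an illustration only: `stripAuthHeaders` is a hypothetical helper, and the header names in the pattern are examples, not llmock's actual forwarding list.

```typescript
// Illustration: drop credential-bearing headers before persisting a fixture.
// The header names here are examples, not an exhaustive or authoritative list.
function stripAuthHeaders(headers: Record<string, string>): Record<string, string> {
  const sensitive = /^(authorization|x-api-key|api-key)$/i;
  return Object.fromEntries(
    Object.entries(headers).filter(([name]) => !sensitive.test(name)),
  );
}
```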

## Strict Mode

When --strict is enabled, unmatched requests that cannot be proxied (no upstream configured for that provider) return 503 Service Unavailable instead of the default 404. This is useful for CI environments where you want to catch unexpected API calls.
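The status decision can be summarized in a few lines. This is a sketch of the rule as described above, not llmock's code; `missStatusFor` is a name invented for this example:

```typescript
// Sketch of the response status for an unmatched request that cannot be
// proxied: strict mode returns 503 Service Unavailable, default mode 404.
function missStatusFor(opts: { strict: boolean }): number {
  return opts.strict ? 503 : 404;
}
```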

## Fixture Auto-Generation

Recorded fixtures are saved to disk with timestamped filenames:

Recorded fixture file:

```json
// fixtures/recorded/openai-2025-01-15T10-30-00-000Z-0.json
{
  "fixtures": [
    {
      "match": { "userMessage": "What is the weather?" },
      "response": { "content": "I don't have real-time weather data..." }
    }
  ]
}
```
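The filename pattern shown above (provider, filesystem-safe ISO timestamp, index) could be reproduced like this. `fixtureFilename` is a hypothetical helper, not llmock's actual function:

```typescript
// Hypothetical reconstruction of the timestamped fixture filename: an ISO
// timestamp with ":" and "." replaced by "-" so it is filesystem-safe.
function fixtureFilename(provider: string, index: number, now = new Date()): string {
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `${provider}-${stamp}-${index}.json`;
}
```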

Match criteria are derived from the original request: the last user message becomes `userMessage`, or for embedding requests, the input becomes `inputText`. If no match criteria can be derived (e.g., empty messages), the fixture is saved to disk with a warning but not registered in memory.
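The derivation rule reads like a small function. The sketch below is a simplified interpretation of that rule under assumed request shapes (a chat body with `messages`, an embedding body with a string `input`); `deriveMatch` is not llmock's actual API:

```typescript
// Sketch of deriving match criteria from a request body (simplified shapes).
type Match = { userMessage: string } | { inputText: string } | null;

function deriveMatch(body: {
  messages?: { role: string; content: string }[];
  input?: string;
}): Match {
  // Chat request: the last user message becomes the match key
  const lastUser = body.messages?.filter((m) => m.role === "user").at(-1);
  if (lastUser?.content) return { userMessage: lastUser.content };
  // Embedding request: the input text becomes the match key
  if (body.input) return { inputText: body.input };
  // Nothing derivable: saved to disk with a warning, not registered in memory
  return null;
}
```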

## Fixture Lifecycle