# Record & Replay

VCR-style record-and-replay support. When a request doesn't match any fixture, llmock proxies it to the real upstream provider, records the response as a fixture on disk and in memory, then replays it on subsequent identical requests.

## How It Works

  1. Client sends a request to llmock
  2. llmock attempts fixture matching as usual
  3. On miss: the request is forwarded to the configured upstream provider
  4. The upstream response is relayed back to the client immediately
  5. The response is collapsed (if streaming) and saved as a fixture to disk and memory
  6. Subsequent identical requests match the newly recorded fixture
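The steps above can be sketched as a single handler. This is an illustrative reconstruction, not llmock's actual internals: `Fixture`, `handleRequest`, and `proxyUpstream` are hypothetical names, matching is reduced to exact user-message equality, and the disk write in step 5 is omitted.

```typescript
// Sketch of the proxy-on-miss flow (hypothetical internals, simplified matching).
type Fixture = { match: { userMessage: string }; response: { content: string } };

async function handleRequest(
  userMessage: string,
  fixtures: Fixture[],
  proxyUpstream: (msg: string) => Promise<string>,
): Promise<string> {
  // Step 2: attempt fixture matching as usual
  const hit = fixtures.find((f) => f.match.userMessage === userMessage);
  if (hit) return hit.response.content;

  // Steps 3-4: on miss, forward to the upstream and relay its response
  const content = await proxyUpstream(userMessage);

  // Step 5: record the response as an in-memory fixture (disk write omitted)
  fixtures.push({ match: { userMessage }, response: { content } });

  // Step 6: subsequent identical requests now match the recorded fixture
  return content;
}
```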

## Quick Start

CLI usage:

```bash
npx llmock --fixtures ./fixtures \
  --record \
  --provider-openai https://api.openai.com \
  --provider-anthropic https://api.anthropic.com
```

## CLI Flags

| Flag | Description |
| --- | --- |
| `--record` | Enable record mode (proxy-on-miss) |
| `--strict` | Strict mode: return 503 (not 404) on unmatched requests |
| `--provider-openai <url>` | Upstream URL for OpenAI |
| `--provider-anthropic <url>` | Upstream URL for Anthropic |
| `--provider-gemini <url>` | Upstream URL for Gemini |
| `--provider-vertexai <url>` | Upstream URL for Vertex AI |
| `--provider-bedrock <url>` | Upstream URL for Bedrock |
| `--provider-azure <url>` | Upstream URL for Azure OpenAI |
| `--provider-ollama <url>` | Upstream URL for Ollama |
| `--provider-cohere <url>` | Upstream URL for Cohere |

## Programmatic API

Programmatic recording:

```ts
import { LLMock } from "@copilotkit/llmock";

const mock = new LLMock();
await mock.start();

// Enable recording with upstream providers
mock.enableRecording({
  providers: {
    openai: "https://api.openai.com",
    anthropic: "https://api.anthropic.com",
  },
  fixturePath: "./fixtures/recorded",
});

// Make requests — unmatched ones are proxied and recorded
// ...

// Disable recording — recorded fixtures persist on disk
mock.disableRecording();
```

## Stream Collapsing

When the upstream provider returns a streaming response, llmock collapses it into a non-streaming fixture. Six streaming formats are supported:

| Format | Provider | Content-Type |
| --- | --- | --- |
| OpenAI SSE | OpenAI, Azure | `text/event-stream` |
| Anthropic SSE | Anthropic | `text/event-stream` |
| Gemini SSE | Gemini, Vertex AI | `text/event-stream` |
| Cohere SSE | Cohere | `text/event-stream` |
| Ollama NDJSON | Ollama | `application/x-ndjson` |
| Bedrock EventStream | AWS Bedrock | `application/vnd.amazon.eventstream` |

The collapse extracts text content and tool calls from the streaming chunks and produces a simple `{ content }` or `{ toolCalls }` fixture response.
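As an illustration of what collapsing means for one of the six formats, the sketch below folds an OpenAI-style SSE body into a `{ content }` object. It is a simplified reconstruction (real chunks carry many more fields, and tool calls are ignored here), and `collapseOpenAISSE` is a name chosen for this example:

```typescript
// Illustrative collapse of an OpenAI-style SSE stream into a non-streaming
// fixture response. Simplified: only text deltas are extracted.
function collapseOpenAISSE(sseBody: string): { content: string } {
  let content = "";
  for (const line of sseBody.split("\n")) {
    // SSE data lines look like `data: {...}`; the stream ends with `data: [DONE]`
    if (!line.startsWith("data: ") || line === "data: [DONE]") continue;
    const chunk = JSON.parse(line.slice("data: ".length));
    content += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return { content };
}
```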

## Auth Header Forwarding

When proxying to upstream providers, llmock forwards the auth headers from the original request to the upstream.

Auth headers are never saved in recorded fixtures. The fixture only contains the match criteria (derived from the last user message) and the response content.
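One way to picture the "never saved" guarantee is a sanitization step before persisting. This is an illustration only: `stripAuthHeaders` is a hypothetical helper, and the header names in the pattern are examples, not llmock's actual forwarding list.

```typescript
// Illustration: drop credential-bearing headers before persisting a fixture.
// The header names here are examples, not an exhaustive or authoritative list.
function stripAuthHeaders(headers: Record<string, string>): Record<string, string> {
  const sensitive = /^(authorization|x-api-key|api-key)$/i;
  return Object.fromEntries(
    Object.entries(headers).filter(([name]) => !sensitive.test(name)),
  );
}
```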

## Strict Mode

When --strict is enabled, unmatched requests that cannot be proxied (no upstream configured for that provider) return 503 Service Unavailable instead of the default 404. This is useful for CI environments where you want to catch unexpected API calls.
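The status decision can be summarized in a few lines. This is a sketch of the rule as described above, not llmock's code; `missStatusFor` is a name invented for this example:

```typescript
// Sketch of the response status for an unmatched request that cannot be
// proxied: strict mode returns 503 Service Unavailable, default mode 404.
function missStatusFor(opts: { strict: boolean }): number {
  return opts.strict ? 503 : 404;
}
```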

## Fixture Auto-Generation

Recorded fixtures are saved to disk with timestamped filenames:

Recorded fixture file:

```json
// fixtures/recorded/openai-2025-01-15T10-30-00-000Z-0.json
{
  "fixtures": [
    {
      "match": { "userMessage": "What is the weather?" },
      "response": { "content": "I don't have real-time weather data..." }
    }
  ]
}
```
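The filename pattern shown above (provider, filesystem-safe ISO timestamp, index) could be reproduced like this. `fixtureFilename` is a hypothetical helper, not llmock's actual function:

```typescript
// Hypothetical reconstruction of the timestamped fixture filename: an ISO
// timestamp with ":" and "." replaced by "-" so it is filesystem-safe.
function fixtureFilename(provider: string, index: number, now = new Date()): string {
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `${provider}-${stamp}-${index}.json`;
}
```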

Match criteria are derived from the original request: the last user message becomes `userMessage`, or for embedding requests, the input becomes `inputText`. If no match criteria can be derived (e.g., empty messages), the fixture is saved to disk with a warning but not registered in memory.
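The derivation rule reads like a small function. The sketch below is a simplified interpretation of that rule under assumed request shapes (a chat body with `messages`, an embedding body with a string `input`); `deriveMatch` is not llmock's actual API:

```typescript
// Sketch of deriving match criteria from a request body (simplified shapes).
type Match = { userMessage: string } | { inputText: string } | null;

function deriveMatch(body: {
  messages?: { role: string; content: string }[];
  input?: string;
}): Match {
  // Chat request: the last user message becomes the match key
  const lastUser = body.messages?.filter((m) => m.role === "user").at(-1);
  if (lastUser?.content) return { userMessage: lastUser.content };
  // Embedding request: the input text becomes the match key
  if (body.input) return { inputText: body.input };
  // Nothing derivable: saved to disk with a warning, not registered in memory
  return null;
}
```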

## Fixture Lifecycle