# WebSocket APIs
llmock implements three WebSocket APIs with zero dependencies: the RFC 6455 framing is built from scratch. The same fixtures drive both the HTTP and WebSocket transports.
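To illustrate what that framing entails, here is a minimal sketch (not llmock's actual code) of encoding an unmasked server-to-client text frame per RFC 6455:

```javascript
// Encode an unmasked server-to-client text frame per RFC 6455.
// First byte 0x81 = FIN bit set + opcode 0x1 (text). The payload length
// uses the 7-bit / 16-bit / 64-bit extended-length scheme from the RFC.
function encodeTextFrame(text) {
  const payload = Buffer.from(text, "utf8");
  const len = payload.length;
  let header;
  if (len < 126) {
    header = Buffer.from([0x81, len]);
  } else if (len < 65536) {
    header = Buffer.alloc(4);
    header[0] = 0x81;
    header[1] = 126;             // 16-bit extended length follows
    header.writeUInt16BE(len, 2);
  } else {
    header = Buffer.alloc(10);
    header[0] = 0x81;
    header[1] = 127;             // 64-bit extended length follows
    header.writeBigUInt64BE(BigInt(len), 2);
  }
  return Buffer.concat([header, payload]);
}
```

A real server also has to handle masked client frames, fragmentation, and control frames (ping/pong/close); this covers only the happy-path text frame.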
## Endpoints
| Path | API | Protocol |
|---|---|---|
| /v1/responses | OpenAI Responses API | WebSocket JSON messages |
| /v1/realtime | OpenAI Realtime API | WebSocket JSON messages |
| /ws/google.ai.generativelanguage.* | Gemini Live | WebSocket JSON messages |
## OpenAI Responses (WebSocket)
```typescript
const instance = await createServer([
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } },
]);

const ws = await connectWebSocket(instance.url, "/v1/responses");

// Send a response.create message
ws.send(JSON.stringify({
  type: "response.create",
  model: "gpt-4",
  input: [{ role: "user", content: "hello" }],
}));

const messages = await ws.waitForMessages(9);
const events = messages.map(m => JSON.parse(m));
const types = events.map(e => e.type);

expect(types[0]).toBe("response.created");
expect(types).toContain("response.output_text.delta");
expect(types).toContain("response.completed");
```
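The delta events can be reassembled into the final text. A small helper sketch, assuming each `response.output_text.delta` event carries its chunk in a `delta` field (per the OpenAI Responses streaming format):

```javascript
// Concatenate the text chunks from streamed output_text delta events.
function collectOutputText(events) {
  return events
    .filter(e => e.type === "response.output_text.delta")
    .map(e => e.delta)
    .join("");
}
```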
## OpenAI Realtime

The Realtime API uses a conversational protocol with session management.
```typescript
const ws = await connectWebSocket(instance.url, "/v1/realtime");

// Server sends session.created on connect
const [sessionMsg] = await ws.waitForMessages(1);
expect(JSON.parse(sessionMsg).type).toBe("session.created");

// Configure session
ws.send(JSON.stringify({
  type: "session.update",
  session: { modalities: ["text"] },
}));

// Add a user message
ws.send(JSON.stringify({
  type: "conversation.item.create",
  item: {
    type: "message",
    role: "user",
    content: [{ type: "input_text", text: "hello" }],
  },
}));

// Request a response
ws.send(JSON.stringify({ type: "response.create" }));

// Wait for response events
const msgs = await ws.waitForMessages(8);
const events = msgs.map(m => JSON.parse(m));
expect(events.some(e => e.type === "response.text.delta")).toBe(true);
```
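Fixed message counts like `waitForMessages(8)` can be brittle. An alternative sketch is to read until a terminal event type arrives; `recv` here is a hypothetical async helper that yields one raw message at a time, not part of llmock's API:

```javascript
// Read JSON messages until an event with the given terminal type arrives.
// Returns every event seen, including the terminal one.
async function readUntil(recv, terminalType) {
  const events = [];
  for (;;) {
    const event = JSON.parse(await recv());
    events.push(event);
    if (event.type === terminalType) return events;
  }
}
```

With a real socket, `recv` would wrap the underlying message queue; the loop shape stays the same.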
## Gemini Live

Bidirectional streaming for the Google Gemini Live API.
```typescript
const ws = await connectWebSocket(
  instance.url,
  "/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
);

// Send setup message
ws.send(JSON.stringify({
  setup: { model: "models/gemini-2.0-flash-live" },
}));

// Send client content
ws.send(JSON.stringify({
  clientContent: {
    turns: [{ role: "user", parts: [{ text: "hello" }] }],
    turnComplete: true,
  },
}));
```
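On the receiving side, text can be pulled out of a server message. A sketch assuming the published BidiGenerateContent response shape (`serverContent.modelTurn.parts[].text`):

```javascript
// Extract the concatenated text parts from one Gemini Live server message.
// Messages without a model turn (e.g. setupComplete) yield an empty string.
function extractLiveText(msg) {
  const parts = msg.serverContent?.modelTurn?.parts ?? [];
  return parts.map(p => p.text ?? "").join("");
}
```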
## Implementation Details
- Built on raw RFC 6455 WebSocket framing; no external dependencies
- Text messages only (no binary/audio/video)
- Same fixture matching as HTTP endpoints
- All WebSocket connections are logged in the journal
Gemini Live text support is unverified, as no text-capable Gemini Live model existed at the time of implementation. The WebSocket framing and protocol messages follow the published API spec.
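One concrete detail of that framing: client-to-server frames are always masked, so a server must XOR the payload with the frame's 4-byte masking key (RFC 6455, section 5.3). A minimal sketch of the unmasking step:

```javascript
// Unmask a client frame payload by XORing each byte with the 4-byte
// masking key, cycling through the key (RFC 6455 §5.3).
function unmask(payload, maskKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskKey[i % 4];
  }
  return out;
}
```

Because XOR is its own inverse, masking and unmasking are the same operation.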
## Provider WebSocket Support
Not all LLM providers offer WebSocket APIs. Here's the current landscape:
| Provider | WebSocket API | llmock Status |
|---|---|---|
| OpenAI Realtime | wss://api.openai.com/v1/realtime | Supported ✓ |
| OpenAI Responses | wss://api.openai.com/v1/responses | Supported ✓ |
| Gemini Live | wss://...BidiGenerateContent | Implemented, awaiting text model |
| Anthropic Claude | None | N/A |
| Azure OpenAI | Uses OpenAI Realtime | Covered by OpenAI |
| Mistral / Groq / Cohere | None | N/A |
| AWS Bedrock | EventStream (not WebSocket) | N/A |
llmock includes drift canary tests that automatically detect when providers add new WebSocket capabilities. When a canary fires, it signals that llmock should be updated to support the new API.