WebSocket APIs

llmock implements three WebSocket APIs with zero dependencies: the RFC 6455 framing is built from scratch. The same fixtures drive both the HTTP and WebSocket transports.

Endpoints

| Path | API | Protocol |
|------|-----|----------|
| `/v1/responses` | OpenAI Responses API | WebSocket JSON messages |
| `/v1/realtime` | OpenAI Realtime API | WebSocket JSON messages |
| `/ws/google.ai.generativelanguage.*` | Gemini Live | WebSocket JSON messages |

OpenAI Responses (WebSocket)

ws-responses.test.ts

```ts
const instance = await createServer([
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } },
]);

const ws = await connectWebSocket(instance.url, "/v1/responses");

// Send a response.create message
ws.send(JSON.stringify({
  type: "response.create",
  model: "gpt-4",
  input: [{ role: "user", content: "hello" }],
}));

const messages = await ws.waitForMessages(9);
const events = messages.map(m => JSON.parse(m));
const types = events.map(e => e.type);

expect(types[0]).toBe("response.created");
expect(types).toContain("response.output_text.delta");
expect(types).toContain("response.completed");
```
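The `connectWebSocket` and `waitForMessages` helpers are test utilities and are not defined in this document. As a rough sketch (an assumption, not llmock's actual helper), the buffering behind `waitForMessages` could look like this:

```ts
// Hypothetical sketch of the message-buffering half of a WebSocket
// test helper. A real helper would attach push() to a WebSocket's
// "message" event and expose send()/close() alongside it.
class MessageBuffer {
  private messages: string[] = [];
  private waiters: { count: number; resolve: (msgs: string[]) => void }[] = [];

  // Record an incoming message and wake any satisfied waiters.
  push(msg: string): void {
    this.messages.push(msg);
    this.flush();
  }

  // Resolves once `count` messages (counted from the start of the
  // connection) have arrived.
  waitForMessages(count: number): Promise<string[]> {
    return new Promise((resolve) => {
      this.waiters.push({ count, resolve });
      this.flush();
    });
  }

  private flush(): void {
    this.waiters = this.waiters.filter((w) => {
      if (this.messages.length >= w.count) {
        w.resolve(this.messages.slice(0, w.count));
        return false; // drop the satisfied waiter
      }
      return true;
    });
  }
}
```

Buffering matters because the mock starts streaming events immediately; a helper that only listens after the test calls `waitForMessages` would drop the first events.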

OpenAI Realtime

The Realtime API uses a conversational protocol with session management.

ws-realtime.test.ts

```ts
const ws = await connectWebSocket(instance.url, "/v1/realtime");

// Server sends session.created on connect
const [sessionMsg] = await ws.waitForMessages(1);
expect(JSON.parse(sessionMsg).type).toBe("session.created");

// Configure session
ws.send(JSON.stringify({
  type: "session.update",
  session: { modalities: ["text"] }
}));

// Add a user message
ws.send(JSON.stringify({
  type: "conversation.item.create",
  item: {
    type: "message",
    role: "user",
    content: [{ type: "input_text", text: "hello" }]
  }
}));

// Request a response
ws.send(JSON.stringify({ type: "response.create" }));

// Wait for response events
const msgs = await ws.waitForMessages(8);
const events = msgs.map(m => JSON.parse(m));
expect(events.some(e => e.type === "response.text.delta")).toBe(true);
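The delta events can be reassembled into the full response text. A small helper for that (the `delta` field name follows the Realtime event shape; treat it as an assumption here):

```ts
// Minimal shape of a Realtime server event for this helper's purposes.
interface RealtimeEvent {
  type: string;
  delta?: string;
}

// Concatenate the delta payloads of response.text.delta events,
// in arrival order, into the final response text.
function assembleText(events: RealtimeEvent[]): string {
  return events
    .filter((e) => e.type === "response.text.delta" && typeof e.delta === "string")
    .map((e) => e.delta as string)
    .join("");
}
```

Running `assembleText(events)` on the parsed messages above would recover the fixture's full response text rather than individual chunks.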

Gemini Live

Bidirectional streaming for Google Gemini Live API.

ws-gemini-live.test.ts

```ts
const ws = await connectWebSocket(
  instance.url,
  "/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
);

// Send setup message
ws.send(JSON.stringify({
  setup: { model: "models/gemini-2.0-flash-live" }
}));

// Send client content
ws.send(JSON.stringify({
  clientContent: {
    turns: [{ role: "user", parts: [{ text: "hello" }] }],
    turnComplete: true,
  }
}));
```
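On the reply side, the server streams messages such as `setupComplete` and `serverContent`. Extracting text from them might look like the sketch below (the `modelTurn`/`parts` shape follows the published Gemini Live spec, but treat the exact field names here as an assumption):

```ts
// Simplified shape of a Gemini Live server message carrying model output.
interface GeminiServerMessage {
  setupComplete?: Record<string, never>;
  serverContent?: {
    modelTurn?: { parts?: { text?: string }[] };
    turnComplete?: boolean;
  };
}

// Collect all text parts, in order, from a stream of server messages.
function collectText(messages: GeminiServerMessage[]): string {
  let text = "";
  for (const msg of messages) {
    for (const part of msg.serverContent?.modelTurn?.parts ?? []) {
      if (part.text) text += part.text;
    }
  }
  return text;
}
```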

Implementation Details

Gemini Live text support is unverified — no text-capable Gemini Live model existed at time of implementation. The WebSocket framing and protocol messages follow the published API spec.
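To illustrate what "RFC 6455 framing from scratch" involves, here is a simplified sketch of parsing one frame. This is not llmock's actual code; it handles only single unfragmented, masked frames with payloads under 126 bytes:

```ts
// Parse one short, masked WebSocket frame (RFC 6455, section 5.2).
// Client-to-server frames are always masked, and a payload length
// below 126 keeps this to the 2-byte header + 4-byte mask case.
function parseShortFrame(buf: Uint8Array): { opcode: number; payload: string } {
  const fin = (buf[0] & 0x80) !== 0;    // FIN bit: final fragment
  const opcode = buf[0] & 0x0f;         // 0x1 = text, 0x8 = close, ...
  const masked = (buf[1] & 0x80) !== 0; // MASK bit
  const len = buf[1] & 0x7f;            // 7-bit payload length
  if (!fin || !masked || len >= 126) throw new Error("unsupported frame");
  const mask = buf.subarray(2, 6);
  const payload = new Uint8Array(len);
  for (let i = 0; i < len; i++) {
    payload[i] = buf[6 + i] ^ mask[i % 4]; // unmask (section 5.3)
  }
  return { opcode, payload: new TextDecoder().decode(payload) };
}
```

A full implementation additionally handles the 16-bit and 64-bit extended length forms, fragmentation, and control frames (ping/pong/close).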

Provider WebSocket Support

Not all LLM providers offer WebSocket APIs. Here's the current landscape:

| Provider | WebSocket API | llmock Status |
|----------|---------------|---------------|
| OpenAI Realtime | `wss://api.openai.com/v1/realtime` | Supported ✓ |
| OpenAI Responses | `wss://api.openai.com/v1/responses` | Supported ✓ |
| Gemini Live | `wss://...BidiGenerateContent` | Implemented, awaiting text model |
| Anthropic Claude | None | N/A |
| Azure OpenAI | Uses OpenAI Realtime | Covered by OpenAI |
| Mistral / Groq / Cohere | None | N/A |
| AWS Bedrock | EventStream (not WebSocket) | N/A |

llmock includes drift canary tests that automatically detect when providers add new WebSocket capabilities. When a canary fires, it signals that llmock should be updated to support the new API.
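The canary implementation itself is not shown in this document. The core comparison could be as simple as diffing the endpoints llmock covers against what a probe of a provider observes (a hypothetical sketch; `knownEndpoints` and the probe's output are assumptions, not llmock's actual code):

```ts
// Hypothetical: WebSocket endpoints llmock currently knows how to mock.
const knownEndpoints = new Set(["/v1/responses", "/v1/realtime"]);

// A canary fires when a probe observes provider WebSocket endpoints
// that llmock does not yet cover.
function findNewEndpoints(observed: string[]): string[] {
  return observed.filter((path) => !knownEndpoints.has(path));
}
```

A canary test would fail (fire) whenever this returns a non-empty list, prompting an llmock update.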