# WebSocket APIs
llmock implements three WebSocket APIs with zero dependencies: the RFC 6455 framing is built from scratch. The same fixtures drive both the HTTP and WebSocket transports.
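To illustrate what that framing entails, here is a minimal sketch (not llmock's actual code) of encoding an unmasked server-to-client text frame per RFC 6455:

```javascript
// Encode an unmasked server-to-client text frame per RFC 6455.
// First byte 0x81 = FIN bit set + opcode 0x1 (text). The payload length
// uses the 7-bit / 16-bit / 64-bit extended-length scheme from the RFC.
function encodeTextFrame(text) {
  const payload = Buffer.from(text, "utf8");
  const len = payload.length;
  let header;
  if (len < 126) {
    header = Buffer.from([0x81, len]);
  } else if (len < 65536) {
    header = Buffer.alloc(4);
    header[0] = 0x81;
    header[1] = 126;             // 16-bit extended length follows
    header.writeUInt16BE(len, 2);
  } else {
    header = Buffer.alloc(10);
    header[0] = 0x81;
    header[1] = 127;             // 64-bit extended length follows
    header.writeBigUInt64BE(BigInt(len), 2);
  }
  return Buffer.concat([header, payload]);
}
```

A real server also has to handle masked client frames, fragmentation, and control frames (ping/pong/close); this covers only the happy-path text frame.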
## Endpoints
| Path | API | Protocol |
|---|---|---|
| /v1/responses | OpenAI Responses API | WebSocket JSON messages |
| /v1/realtime | OpenAI Realtime API | WebSocket JSON messages |
| /ws/google.ai.generativelanguage.* | Gemini Live | WebSocket JSON messages |
## OpenAI Responses (WebSocket)
```typescript
const instance = await createServer([
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } },
]);

const ws = await connectWebSocket(instance.url, "/v1/responses");

// Send a response.create message
ws.send(JSON.stringify({
  type: "response.create",
  model: "gpt-4",
  input: [{ role: "user", content: "hello" }],
}));

const messages = await ws.waitForMessages(9);
const events = messages.map(m => JSON.parse(m));
const types = events.map(e => e.type);

expect(types[0]).toBe("response.created");
expect(types).toContain("response.output_text.delta");
expect(types).toContain("response.completed");
```
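The delta events can be reassembled into the final text. A small helper sketch, assuming each `response.output_text.delta` event carries its chunk in a `delta` field (per the OpenAI Responses streaming format):

```javascript
// Concatenate the text chunks from streamed output_text delta events.
function collectOutputText(events) {
  return events
    .filter(e => e.type === "response.output_text.delta")
    .map(e => e.delta)
    .join("");
}
```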
## OpenAI Realtime

The Realtime API uses a conversational protocol with session management.
```typescript
const ws = await connectWebSocket(instance.url, "/v1/realtime");

// Server sends session.created on connect
const [sessionMsg] = await ws.waitForMessages(1);
expect(JSON.parse(sessionMsg).type).toBe("session.created");

// Configure session
ws.send(JSON.stringify({
  type: "session.update",
  session: { modalities: ["text"] },
}));

// Add a user message
ws.send(JSON.stringify({
  type: "conversation.item.create",
  item: {
    type: "message",
    role: "user",
    content: [{ type: "input_text", text: "hello" }],
  },
}));

// Request a response
ws.send(JSON.stringify({ type: "response.create" }));

// Wait for response events
const msgs = await ws.waitForMessages(8);
const events = msgs.map(m => JSON.parse(m));
expect(events.some(e => e.type === "response.text.delta")).toBe(true);
```
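Fixed message counts like `waitForMessages(8)` can be brittle. An alternative sketch is to read until a terminal event type arrives; `recv` here is a hypothetical async helper that yields one raw message at a time, not part of llmock's API:

```javascript
// Read JSON messages until an event with the given terminal type arrives.
// Returns every event seen, including the terminal one.
async function readUntil(recv, terminalType) {
  const events = [];
  for (;;) {
    const event = JSON.parse(await recv());
    events.push(event);
    if (event.type === terminalType) return events;
  }
}
```

With a real socket, `recv` would wrap the underlying message queue; the loop shape stays the same.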
## Gemini Live

Bidirectional streaming for the Google Gemini Live API.
```typescript
const ws = await connectWebSocket(
  instance.url,
  "/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
);

// Send setup message
ws.send(JSON.stringify({
  setup: { model: "models/gemini-2.0-flash-live" },
}));

// Send client content
ws.send(JSON.stringify({
  clientContent: {
    turns: [{ role: "user", parts: [{ text: "hello" }] }],
    turnComplete: true,
  },
}));
```
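On the receiving side, text can be pulled out of a server message. A sketch assuming the published BidiGenerateContent response shape (`serverContent.modelTurn.parts[].text`):

```javascript
// Extract the concatenated text parts from one Gemini Live server message.
// Messages without a model turn (e.g. setupComplete) yield an empty string.
function extractLiveText(msg) {
  const parts = msg.serverContent?.modelTurn?.parts ?? [];
  return parts.map(p => p.text ?? "").join("");
}
```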
## Implementation Details
- Built on raw RFC 6455 WebSocket framing; no external dependencies
- Text messages only (no binary/audio/video)
- Same fixture matching as HTTP endpoints
- All WebSocket connections are logged in the journal
Gemini Live text support is unverified, as no text-capable Gemini Live model existed at the time of implementation. The WebSocket framing and protocol messages follow the published API spec.
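One concrete detail of that framing: client-to-server frames are always masked, so a server must XOR the payload with the frame's 4-byte masking key (RFC 6455, section 5.3). A minimal sketch of the unmasking step:

```javascript
// Unmask a client frame payload by XORing each byte with the 4-byte
// masking key, cycling through the key (RFC 6455 §5.3).
function unmask(payload, maskKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskKey[i % 4];
  }
  return out;
}
```

Because XOR is its own inverse, masking and unmasking are the same operation.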
## Provider WebSocket Support
Not all LLM providers offer WebSocket APIs. Here's the current landscape:
| Provider | WebSocket API | llmock Status |
|---|---|---|
| OpenAI Realtime | wss://api.openai.com/v1/realtime | Supported ✓ |
| OpenAI Responses | wss://api.openai.com/v1/responses | Supported ✓ |
| Gemini Live | wss://...BidiGenerateContent | Implemented, awaiting text model |
| Anthropic Claude | None | N/A |
| Azure OpenAI | Uses OpenAI Realtime | Covered by OpenAI |
| Mistral / Groq / Cohere | None | N/A |
| AWS Bedrock | EventStream (not WebSocket) | N/A |
llmock includes drift canary tests that automatically detect when providers add new WebSocket capabilities. When a canary fires, it signals that llmock should be updated to support the new API.