Streaming Physics

Simulate realistic LLM streaming timing with a configurable time-to-first-token (TTFT), tokens-per-second rate (TPS), and random jitter. This is useful for testing loading states, progress indicators, and streaming UX under production-like conditions.

StreamingProfile

The streamingProfile option can be set on any fixture to control the timing of streamed chunks.

Property | Type | Description
--- | --- | ---
ttft | number | Time to first token in milliseconds. Delay before the first chunk is sent.
tps | number | Tokens per second. Each chunk after the first is delayed by 1000 / tps ms.
jitter | number | Random variance factor (0–1). Each delay is multiplied by 1 + random(-1,1) * jitter. Default 0 (no variance).
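
The delay rules in the table can be sketched as a small function. This is an illustration of the documented formula, not the library's internal code: the `StreamingProfile` interface and `chunkDelayMs` helper are hypothetical names, and the sketch assumes jitter is applied to the first-chunk delay as well as to later chunks.

```typescript
// Hypothetical shape mirroring the table above; not imported from the library.
interface StreamingProfile {
  ttft: number;    // ms before the first chunk
  tps: number;     // tokens per second after the first chunk
  jitter?: number; // variance factor in [0, 1], defaults to 0
}

// Returns the delay in ms before chunk `index` (0-based).
// `random` must return a value in [0, 1), like Math.random; it is
// injectable so tests can make the jitter deterministic.
function chunkDelayMs(
  profile: StreamingProfile,
  index: number,
  random: () => number = Math.random,
): number {
  const base = index === 0 ? profile.ttft : 1000 / profile.tps;
  const jitter = profile.jitter ?? 0;
  // Map [0, 1) onto [-1, 1) to get the random(-1,1) term from the table.
  const variance = (random() * 2 - 1) * jitter;
  return base * (1 + variance);
}
```

With `jitter` left at 0, a profile of `{ ttft: 800, tps: 50 }` yields 800 ms for the first chunk and 20 ms for every chunk after it.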

Programmatic Usage

streaming-physics.test.ts ts
const mock = new LLMock();
await mock.start();

// Simulate GPT-4 streaming timing
mock.on(
  { userMessage: "hello" },
  { content: "Hello! How can I help you today?" },
  {
    streamingProfile: {
      ttft: 800,    // 800ms before first token
      tps: 50,      // 50 tokens/sec after that
      jitter: 0.2,  // +/-20% variance on each delay
    },
  },
);
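
When a test asserts on timing, it helps to know the range each delay can fall in once jitter is applied. The sketch below derives those bounds from the documented multiplier `1 + random(-1,1) * jitter`. `delayBounds` and both interfaces are hypothetical helpers written for this example, not part of the library.

```typescript
// Hypothetical profile shape for this example only.
interface StreamingProfile {
  ttft: number;
  tps: number;
  jitter?: number;
}

interface DelayRange {
  min: number; // ms
  max: number; // ms
}

// Computes the smallest and largest delay the documented formula can
// produce, for the first chunk and for every later chunk.
function delayBounds(profile: StreamingProfile): {
  firstChunk: DelayRange;
  laterChunks: DelayRange;
} {
  const jitter = profile.jitter ?? 0;
  const range = (base: number): DelayRange => ({
    min: base * (1 - jitter),
    max: base * (1 + jitter),
  });
  return {
    firstChunk: range(profile.ttft),
    laterChunks: range(1000 / profile.tps),
  };
}

// For the profile used above: the first chunk arrives after roughly
// 640 to 960 ms, and later chunks are roughly 16 to 24 ms apart.
const bounds = delayBounds({ ttft: 800, tps: 50, jitter: 0.2 });
```

Real timers add scheduling overhead on top of these values, so assertions usually need some extra slack above `max`.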

JSON Fixture File

fixtures/slow-model.json json
{
  "fixtures": [
    {
      "match": { "userMessage": "think carefully" },
      "response": { "content": "Let me think about this..." },
      "streamingProfile": {
        "ttft": 2000,
        "tps": 30,
        "jitter": 0.1
      }
    }
  ]
}
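
Because fixture files are plain JSON, a typo such as a negative `ttft` or a `jitter` above 1 only shows up at runtime. The sketch below checks a parsed profile before use. `parseStreamingProfile` is a hypothetical helper; it assumes `ttft` and `tps` are required and `tps` must be positive, and the library's own validation, if any, may differ.

```typescript
// Hypothetical validated shape; jitter is filled in with its documented default.
interface StreamingProfile {
  ttft: number;
  tps: number;
  jitter: number;
}

// Validates an untyped value (for example from JSON.parse) against the
// ranges described in the property table.
function parseStreamingProfile(value: unknown): StreamingProfile {
  if (typeof value !== "object" || value === null) {
    throw new TypeError("streamingProfile must be an object");
  }
  const { ttft, tps, jitter = 0 } = value as Record<string, unknown>;
  if (typeof ttft !== "number" || !Number.isFinite(ttft) || ttft < 0) {
    throw new RangeError("ttft must be a non-negative number of milliseconds");
  }
  // tps is a divisor in 1000 / tps, so zero is rejected too.
  if (typeof tps !== "number" || !Number.isFinite(tps) || tps <= 0) {
    throw new RangeError("tps must be a positive number");
  }
  if (typeof jitter !== "number" || !(jitter >= 0 && jitter <= 1)) {
    throw new RangeError("jitter must be between 0 and 1");
  }
  return { ttft, tps, jitter };
}

// Usage with the profile from the fixture above.
const profile = parseStreamingProfile(
  JSON.parse('{ "ttft": 2000, "tps": 30, "jitter": 0.1 }'),
);
```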

Interaction with latency

Realistic Profiles

Here are some example profiles that approximate real-world LLM behavior:

profiles.ts ts
// Fast model (GPT-4o-mini, Claude 3 Haiku)
const fastModel = { ttft: 200, tps: 100, jitter: 0.15 };

// Standard model (GPT-4o, Claude 3.5 Sonnet)
const standardModel = { ttft: 500, tps: 60, jitter: 0.2 };

// Reasoning model (o1, o3, Claude with extended thinking)
const reasoningModel = { ttft: 5000, tps: 80, jitter: 0.1 };

// Slow/overloaded (rate-limited or cold start)
const slowModel = { ttft: 3000, tps: 15, jitter: 0.4 };
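
To get a feel for how these profiles differ end to end, the total streaming time can be estimated as `ttft + (chunks - 1) * 1000 / tps`. The sketch below is a back-of-the-envelope helper, not a library function. `expectedDurationMs` is a hypothetical name, and it ignores jitter on the assumption that the variance is symmetric around zero and averages out.

```typescript
// Hypothetical profile shape for this example only.
interface StreamingProfile {
  ttft: number;
  tps: number;
  jitter?: number;
}

// Estimates the average time in ms until the last of `chunkCount`
// chunks is sent: one ttft wait, then one 1000 / tps wait per later chunk.
function expectedDurationMs(profile: StreamingProfile, chunkCount: number): number {
  if (chunkCount <= 0) return 0;
  return profile.ttft + (chunkCount - 1) * (1000 / profile.tps);
}

// Estimates for a 100-chunk response with the profiles above:
const fastMs = expectedDurationMs({ ttft: 200, tps: 100, jitter: 0.15 }, 100);     // about 1190
const standardMs = expectedDurationMs({ ttft: 500, tps: 60, jitter: 0.2 }, 100);   // about 2150
const reasoningMs = expectedDurationMs({ ttft: 5000, tps: 80, jitter: 0.1 }, 100); // about 6237.5
const slowMs = expectedDurationMs({ ttft: 3000, tps: 15, jitter: 0.4 }, 100);      // about 9600
```

In these estimates the slow profile takes roughly eight times as long as the fast one for the same response, which is the kind of gap a loading state needs to handle.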

Streaming physics applies to all provider APIs: the OpenAI Chat Completions and Responses APIs, Claude Messages, and Gemini. The same streamingProfile field works across all of them.