Streaming Physics
Simulate realistic LLM streaming timing with configurable time-to-first-token (TTFT), tokens-per-second (TPS), and random jitter. Use it to test loading states, progress indicators, and streaming UX under realistic conditions.
StreamingProfile
The streamingProfile option can be set on any fixture to control the timing
of streamed chunks.
| Property | Type | Description |
|---|---|---|
| `ttft` | number | Time to first token in milliseconds. Delay before the first chunk is sent. |
| `tps` | number | Tokens per second. Each chunk after the first is delayed by `1000 / tps` ms. |
| `jitter` | number | Random variance factor (0–1). Each delay is multiplied by `1 + random(-1,1) * jitter`. Default 0 (no variance). |
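The delay rules in the table can be sketched as a small function. This is an illustration of the arithmetic only, not the library's implementation: the `StreamingProfile` interface and the `chunkDelayMs` helper are hypothetical names, and the sketch assumes jitter is applied to the TTFT delay as well as to the per-chunk delays.

```typescript
// Hypothetical shape mirroring the properties table; not the library's exported type.
interface StreamingProfile {
  ttft?: number; // ms before the first chunk
  tps?: number; // tokens (chunks) per second after the first chunk
  jitter?: number; // variance factor in the range 0..1
}

// Delay in ms before chunk `index` (0-based).
// `rand` returns a value in [0, 1); it is injectable so tests can be deterministic.
function chunkDelayMs(
  profile: StreamingProfile,
  index: number,
  rand: () => number = Math.random,
): number {
  // First chunk waits for ttft; every later chunk waits 1000 / tps ms.
  const base =
    index === 0
      ? profile.ttft ?? 0
      : profile.tps
        ? 1000 / profile.tps
        : 0;
  const jitter = profile.jitter ?? 0;
  // Map rand() from [0, 1) to [-1, 1), then scale by the jitter factor.
  const factor = 1 + (rand() * 2 - 1) * jitter;
  return base * factor;
}
```

Under these assumptions, a profile of `{ ttft: 800, tps: 50, jitter: 0.2 }` delays the first chunk by roughly 640 to 960 ms and each later chunk by roughly 16 to 24 ms.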
Programmatic Usage
streaming-physics.test.ts
```ts
const mock = new LLMock();
await mock.start();

// Simulate GPT-4 streaming timing
mock.on(
  { userMessage: "hello" },
  { content: "Hello! How can I help you today?" },
  {
    streamingProfile: {
      ttft: 800, // 800ms before first token
      tps: 50, // 50 tokens/sec after that
      jitter: 0.2, // +/-20% variance on each delay
    },
  },
);
```
JSON Fixture File
fixtures/slow-model.json
```json
{
  "fixtures": [
    {
      "match": { "userMessage": "think carefully" },
      "response": { "content": "Let me think about this..." },
      "streamingProfile": {
        "ttft": 2000,
        "tps": 30,
        "jitter": 0.1
      }
    }
  ]
}
```
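Because JSON fixture files get no type checking, it can help to sanity-check a profile before relying on it. The following is a minimal sketch; `profileProblems` is a hypothetical helper, and the ranges it enforces are assumptions drawn from the property descriptions (non-negative `ttft`, positive `tps` since the delay is `1000 / tps`, and `jitter` between 0 and 1). Whether the library performs any validation of its own is not stated here.

```typescript
// Hypothetical helper: returns a list of problems found in a parsed streamingProfile.
// An empty array means the profile looks usable under the assumed ranges.
function profileProblems(profile: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const { ttft, tps, jitter } = profile;

  if (ttft !== undefined && !(typeof ttft === "number" && ttft >= 0)) {
    problems.push("ttft must be a non-negative number of milliseconds");
  }
  // tps must be strictly positive, because the per-chunk delay is 1000 / tps.
  if (tps !== undefined && !(typeof tps === "number" && tps > 0)) {
    problems.push("tps must be a positive number");
  }
  if (
    jitter !== undefined &&
    !(typeof jitter === "number" && jitter >= 0 && jitter <= 1)
  ) {
    problems.push("jitter must be a number between 0 and 1");
  }
  return problems;
}
```

A test setup could call this on each `streamingProfile` after `JSON.parse` and fail fast if the returned array is non-empty.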
Interaction with latency
- When streamingProfile is set, it takes priority over the latency field.
- If streamingProfile is not set, the existing latency behavior applies (flat delay per chunk).
- If streamingProfile is set but has neither ttft nor tps, it falls back to latency.
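The three precedence rules above can be expressed as a single decision. This is a sketch of the described behavior, not the library's code; `FixtureTiming` and `timingSource` are hypothetical names introduced for illustration.

```typescript
// Hypothetical types for illustration only.
interface StreamingProfile {
  ttft?: number;
  tps?: number;
  jitter?: number;
}

interface FixtureTiming {
  latency?: number;
  streamingProfile?: StreamingProfile;
}

type TimingSource = "streamingProfile" | "latency";

// Decide which timing mechanism applies to a fixture.
function timingSource(opts: FixtureTiming): TimingSource {
  const profile = opts.streamingProfile;
  // A profile only takes over when it defines at least one of ttft or tps;
  // a profile carrying jitter alone falls back to the flat latency delay.
  const usable =
    profile !== undefined &&
    (profile.ttft !== undefined || profile.tps !== undefined);
  return usable ? "streamingProfile" : "latency";
}
```

For example, `{ latency: 50, streamingProfile: { tps: 30 } }` resolves to the profile, while `{ latency: 50, streamingProfile: { jitter: 0.3 } }` resolves to `latency`.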
Realistic Profiles
Here are some example profiles that approximate real-world LLM behavior:
profiles.ts
```ts
// Fast model (GPT-4o-mini, Claude 3 Haiku)
const fastModel = { ttft: 200, tps: 100, jitter: 0.15 };

// Standard model (GPT-4o, Claude 3.5 Sonnet)
const standardModel = { ttft: 500, tps: 60, jitter: 0.2 };

// Reasoning model (o1, o3, Claude with extended thinking)
const reasoningModel = { ttft: 5000, tps: 80, jitter: 0.1 };

// Slow/overloaded (rate-limited or cold start)
const slowModel = { ttft: 3000, tps: 15, jitter: 0.4 };
```
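When choosing test timeouts for these profiles, it helps to estimate how long a full stream should take. The sketch below is a back-of-the-envelope calculation, assuming one chunk per token and treating jitter as averaging out over many chunks (it is symmetric around 1); `expectedStreamMs` is a hypothetical helper, not part of the library.

```typescript
// Expected total stream duration in ms, ignoring jitter:
// ttft for the first chunk, then 1000 / tps for each remaining chunk.
function expectedStreamMs(
  profile: { ttft: number; tps: number },
  chunkCount: number,
): number {
  if (chunkCount <= 0) return 0;
  return profile.ttft + (chunkCount - 1) * (1000 / profile.tps);
}

// Worst-case duration if every delay hits the upper jitter bound (1 + jitter).
function worstCaseStreamMs(
  profile: { ttft: number; tps: number; jitter?: number },
  chunkCount: number,
): number {
  return expectedStreamMs(profile, chunkCount) * (1 + (profile.jitter ?? 0));
}
```

Under these assumptions, a 61-chunk response with the standard profile (`ttft: 500`, `tps: 60`) takes about 1500 ms, and the same response with the slow profile (`ttft: 3000`, `tps: 15`) takes about 7000 ms, or up to about 9800 ms at its 0.4 jitter bound.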
Streaming physics applies to all provider APIs: OpenAI Chat Completions, the Responses API, Claude Messages, and Gemini. The same streamingProfile field works across all of them.