Prometheus Metrics
llmock exposes Prometheus-compatible metrics via GET /metrics. The endpoint is
opt-in: enable it with the --metrics flag. It has zero external dependencies;
counters, histograms, and gauges are implemented in llmock itself, along with
serialization to the Prometheus text exposition format.
Endpoint
| Method | Path | Description |
|---|---|---|
| GET | /metrics | Prometheus text exposition format metrics |
Quick Start
Enable metrics:

    npx llmock --fixtures ./fixtures --metrics

Scrape metrics:

    curl http://localhost:4010/metrics
Available Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| llmock_requests_total | Counter | method, path, status | Total number of requests handled |
| llmock_request_duration_seconds | Histogram | method, path | Request duration in seconds |
| llmock_fixtures_loaded | Gauge | (none) | Number of fixtures currently loaded |
| llmock_chaos_triggered_total | Counter | action | Number of chaos events triggered (action: drop, malformed, disconnect) |
Path Normalization
Dynamic path segments are normalized to placeholders in metric labels to prevent high cardinality. The normalization rules:
| Provider | Raw Path | Normalized Label |
|---|---|---|
| Bedrock | /model/anthropic.claude-v2/invoke | /model/{modelId}/invoke |
| Gemini | /v1beta/models/gemini-pro:generateContent | /v1beta/models/{model}:generateContent |
| Azure | /openai/deployments/gpt4/chat/completions | /openai/deployments/{id}/chat/completions |
| Vertex AI | /v1/projects/my-proj/locations/us-c1/publishers/google/models/gemini-pro:generateContent | /v1/projects/{p}/locations/{l}/publishers/google/models/{m}:generateContent |
| Others | /v1/chat/completions | /v1/chat/completions (unchanged) |
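
The normalization table can be approximated with a small list of regex rules. This is a minimal sketch, not llmock's actual source: the `RULES` array, the `normalizePath` name, and the exact patterns are assumptions derived only from the example rows above, and the function expects a path with no query string.

```typescript
// Hypothetical rules inferred from the table above; the real patterns may differ.
// Each entry is [pattern, replacement]; $1 keeps the trailing action or method.
const RULES: Array<[RegExp, string]> = [
  // Bedrock: /model/<modelId>/<action>
  [/^\/model\/[^/]+\/([\w-]+)$/, "/model/{modelId}/$1"],
  // Gemini: /v1beta/models/<model>:<method>
  [/^\/v1beta\/models\/[^/:]+:(\w+)$/, "/v1beta/models/{model}:$1"],
  // Azure: /openai/deployments/<id>/<rest>
  [/^\/openai\/deployments\/[^/]+\/(.+)$/, "/openai/deployments/{id}/$1"],
  // Vertex AI: /v1/projects/<p>/locations/<l>/publishers/google/models/<m>:<method>
  [
    /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/[^/:]+:(\w+)$/,
    "/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:$1",
  ],
];

// Returns the label-safe form of a request path.
function normalizePath(path: string): string {
  for (const [pattern, replacement] of RULES) {
    if (pattern.test(path)) {
      return path.replace(pattern, replacement);
    }
  }
  // Paths without dynamic segments are used as labels unchanged.
  return path;
}
```

Because the rule prefixes are disjoint, the order of the entries does not matter in this sketch. Any path that matches no rule passes through as-is, which matches the "Others" row.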
Output Format
The GET /metrics endpoint returns the Prometheus text exposition format. Example
/metrics response:
# TYPE llmock_requests_total counter
llmock_requests_total{method="POST",path="/v1/chat/completions",status="200"} 42
llmock_requests_total{method="POST",path="/v1/messages",status="200"} 15
# TYPE llmock_request_duration_seconds histogram
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.005"} 0
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.01"} 5
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.025"} 20
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.05"} 35
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.1"} 40
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.25"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.5"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="1"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="2.5"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="5"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="10"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="+Inf"} 42
llmock_request_duration_seconds_sum{method="POST",path="/v1/chat/completions"} 1.234
llmock_request_duration_seconds_count{method="POST",path="/v1/chat/completions"} 42
# TYPE llmock_fixtures_loaded gauge
llmock_fixtures_loaded{} 12
# TYPE llmock_chaos_triggered_total counter
llmock_chaos_triggered_total{action="drop"} 3
llmock_chaos_triggered_total{action="malformed"} 1
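
When asserting on this output in a test, it can help to parse the sample lines back into structured values. The following is a minimal sketch that covers only the subset of the format shown above; `MetricSample` and `parseSamples` are hypothetical names, and the parser deliberately ignores escaped quotes inside label values and optional timestamps.

```typescript
interface MetricSample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Converts a sample value, including the special infinity spellings.
function parseValue(raw: string): number {
  if (raw === "+Inf") return Infinity;
  if (raw === "-Inf") return -Infinity;
  return Number(raw);
}

// Parses sample lines such as: name{label="value"} 42
function parseSamples(text: string): MetricSample[] {
  const samples: MetricSample[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines such as "# TYPE ...".
    if (line === "" || line.startsWith("#")) continue;
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$/.exec(line);
    if (match === null) continue;
    const labels: Record<string, string> = {};
    const labelText = match[2] ?? "";
    // Simplification: label values must not contain escaped double quotes.
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
    let pair: RegExpExecArray | null;
    while ((pair = labelPattern.exec(labelText)) !== null) {
      labels[pair[1]] = pair[2];
    }
    samples.push({ name: match[1], labels, value: parseValue(match[3]) });
  }
  return samples;
}
```

A test could then filter the result by `name` and `labels` instead of matching raw strings, which keeps assertions stable if new series are added.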
Histogram Buckets
Duration histograms use the same upper bounds as the default buckets in the official Prometheus client libraries (in seconds):
0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10
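
To illustrate how cumulative buckets behave with these boundaries, here is a minimal histogram sketch. `DurationHistogram` and its methods are hypothetical names chosen for illustration, not llmock's internal API; only the boundary values come from this page.

```typescript
// Upper bounds listed above, in seconds.
const BUCKETS: number[] = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10];

class DurationHistogram {
  // counts[i] holds observations <= BUCKETS[i]; the final slot is the +Inf bucket.
  private counts: number[] = new Array<number>(BUCKETS.length + 1).fill(0);
  private sum = 0;
  private count = 0;

  observe(seconds: number): void {
    this.sum += seconds;
    this.count += 1;
    // Cumulative semantics: increment every bucket whose bound covers the value.
    for (let i = 0; i < BUCKETS.length; i++) {
      if (seconds <= BUCKETS[i]) this.counts[i] += 1;
    }
    this.counts[BUCKETS.length] += 1; // +Inf matches every observation
  }

  // Returns [le, cumulativeCount] pairs plus the running sum and count.
  snapshot(): { buckets: Array<[string, number]>; sum: number; count: number } {
    const buckets = BUCKETS.map((le, i): [string, number] => [String(le), this.counts[i]]);
    buckets.push(["+Inf", this.counts[BUCKETS.length]]);
    return { buckets, sum: this.sum, count: this.count };
  }
}
```

Because the buckets are cumulative, each count is at least as large as the one before it, and the +Inf bucket always equals the `_count` series, as in the example output where both are 42.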
Implementation Details
- Zero dependencies. The metrics registry is implemented from scratch; no prom-client or other libraries are required.
- Three metric types: counters (monotonically increasing), histograms (cumulative buckets with sum and count), and gauges (arbitrary values).
- Label escaping. Label values are escaped per Prometheus text exposition format: backslashes, double quotes, and newlines.
- Stable output. Metrics are serialized in insertion order for deterministic output.
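
The label escaping and insertion-order points above can be sketched together in a small counter. This is an illustration under stated assumptions rather than llmock's registry code: `escapeLabelValue` and `LabeledCounter` are hypothetical names, and the series key simply follows the order in which label keys are passed.

```typescript
// Escapes a label value per the Prometheus text exposition format.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // backslashes first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
}

class LabeledCounter {
  private readonly metricName: string;
  // A Map iterates in insertion order, which keeps the output deterministic.
  private series = new Map<string, number>();

  constructor(metricName: string) {
    this.metricName = metricName;
  }

  // Simplification: callers must pass label keys in a consistent order.
  inc(labels: Record<string, string> = {}, amount = 1): void {
    const key = Object.keys(labels)
      .map((k) => `${k}="${escapeLabelValue(labels[k])}"`)
      .join(",");
    this.series.set(key, (this.series.get(key) ?? 0) + amount);
  }

  serialize(): string {
    const lines: string[] = [`# TYPE ${this.metricName} counter`];
    this.series.forEach((value, key) => {
      lines.push(`${this.metricName}{${key}} ${value}`);
    });
    return lines.join("\n") + "\n";
  }
}
```

Series appear in the order they were first incremented, so two runs that handle the same requests in the same order produce byte-identical output.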