Prometheus Metrics

llmock exposes Prometheus-compatible metrics via GET /metrics. The endpoint is opt-in: enable it with the --metrics flag. It has zero external dependencies; llmock implements counters, histograms, and gauges itself and serializes them in the Prometheus text exposition format.

Endpoint

| Method | Path | Description |
|--------|------|-------------|
| GET | /metrics | Metrics in Prometheus text exposition format |

Quick Start

Enable metrics (bash):

    npx llmock --fixtures ./fixtures --metrics

Scrape metrics (bash):

    curl http://localhost:4010/metrics

Available Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| llmock_requests_total | Counter | method, path, status | Total number of requests handled |
| llmock_request_duration_seconds | Histogram | method, path | Request duration in seconds |
| llmock_fixtures_loaded | Gauge | (none) | Number of fixtures currently loaded |
| llmock_chaos_triggered_total | Counter | action | Number of chaos events triggered (action: drop, malformed, disconnect) |

Path Normalization

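Dynamic path segments, such as model IDs, deployment names, and project or location IDs, are replaced with placeholders in the path label. Without this, every distinct model or deployment would create its own time series and inflate label cardinality. The normalization rules: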

| Provider | Raw Path | Normalized Label |
|----------|----------|------------------|
| Bedrock | /model/anthropic.claude-v2/invoke | /model/{modelId}/invoke |
| Gemini | /v1beta/models/gemini-pro:generateContent | /v1beta/models/{model}:generateContent |
| Azure | /openai/deployments/gpt4/chat/completions | /openai/deployments/{id}/chat/completions |
| Vertex AI | /v1/projects/my-proj/locations/us-c1/publishers/google/models/gemini-pro:generateContent | /v1/projects/{p}/locations/{l}/publishers/google/models/{m}:generateContent |
| Others | /v1/chat/completions | /v1/chat/completions (unchanged) |
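
The rules in the table can be expressed as an ordered list of regular-expression rewrites. The following is a minimal TypeScript sketch under that assumption. The names `normalizePath` and `RULES` are hypothetical and chosen for illustration; they are not taken from llmock's source, and the real implementation may cover more cases (for example query strings).

```typescript
// Hypothetical sketch: mirrors the normalization table, not llmock's actual code.
type Rule = { pattern: RegExp; replacement: string };

const RULES: Rule[] = [
  // Bedrock: /model/<modelId>/<operation>
  { pattern: /^\/model\/[^/]+\//, replacement: "/model/{modelId}/" },
  // Vertex AI: project, location, and model are all dynamic
  {
    pattern:
      /^\/v1\/projects\/[^/]+\/locations\/[^/]+\/publishers\/google\/models\/[^/:]+:/,
    replacement:
      "/v1/projects/{p}/locations/{l}/publishers/google/models/{m}:",
  },
  // Gemini: /v1beta/models/<model>:<method>
  { pattern: /^\/v1beta\/models\/[^/:]+:/, replacement: "/v1beta/models/{model}:" },
  // Azure: /openai/deployments/<id>/...
  { pattern: /^\/openai\/deployments\/[^/]+\//, replacement: "/openai/deployments/{id}/" },
];

// Returns the label value for a raw request path.
// Paths that match no rule are returned unchanged.
function normalizePath(rawPath: string): string {
  for (const { pattern, replacement } of RULES) {
    if (pattern.test(rawPath)) {
      return rawPath.replace(pattern, replacement);
    }
  }
  return rawPath;
}
```

Because only the dynamic prefix is rewritten, the trailing operation (such as invoke or generateContent) stays in the label, so different operations on the same provider remain distinguishable.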

Output Format

The GET /metrics endpoint returns the Prometheus text exposition format. Example /metrics response:

# TYPE llmock_requests_total counter
llmock_requests_total{method="POST",path="/v1/chat/completions",status="200"} 42
llmock_requests_total{method="POST",path="/v1/messages",status="200"} 15

# TYPE llmock_request_duration_seconds histogram
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.005"} 0
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.01"} 5
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.025"} 20
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.05"} 35
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.1"} 40
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.25"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="0.5"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="1"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="2.5"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="5"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="10"} 42
llmock_request_duration_seconds_bucket{method="POST",path="/v1/chat/completions",le="+Inf"} 42
llmock_request_duration_seconds_sum{method="POST",path="/v1/chat/completions"} 1.234
llmock_request_duration_seconds_count{method="POST",path="/v1/chat/completions"} 42

# TYPE llmock_fixtures_loaded gauge
llmock_fixtures_loaded{} 12

# TYPE llmock_chaos_triggered_total counter
llmock_chaos_triggered_total{action="drop"} 3
llmock_chaos_triggered_total{action="malformed"} 1
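
When asserting on metrics in a test, it can help to parse the response text instead of matching raw strings. The following is a rough TypeScript sketch of such a parser. `parseSampleLine` and `sumMetric` are hypothetical helper names, not part of llmock, and the parser is deliberately simplified: it assumes label values contain no escaped quotes, ignores optional timestamps, and does not handle special values such as +Inf or NaN.

```typescript
interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Parses one sample line such as: name{a="x",b="y"} 42
// Returns null for blank lines, comment/TYPE lines, and lines it cannot parse.
function parseSampleLine(line: string): Sample | null {
  const trimmed = line.trim();
  if (trimmed === "" || trimmed.startsWith("#")) return null;

  const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$/.exec(trimmed);
  if (!match) return null;

  const labels: Record<string, string> = {};
  const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
  const body = match[2] ?? "";
  let m: RegExpExecArray | null;
  while ((m = labelPattern.exec(body)) !== null) {
    labels[m[1]] = m[2];
  }
  return { name: match[1], labels, value: Number(match[3]) };
}

// Sums every sample of the given metric name across all label combinations.
function sumMetric(text: string, name: string): number {
  let total = 0;
  for (const line of text.split("\n")) {
    const sample = parseSampleLine(line);
    if (sample && sample.name === name) total += sample.value;
  }
  return total;
}
```

Applied to the example response above, `sumMetric(text, "llmock_requests_total")` would add the two request counters together (42 + 15).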

Histogram Buckets

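Duration histograms use the same bucket boundaries as the defaults in the official Prometheus client libraries. Each value is a bucket upper bound (le) in seconds, and an implicit +Inf bucket is always added: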

0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10
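
Prometheus histogram buckets are cumulative: an observation is counted in every bucket whose upper bound is greater than or equal to the observed value, which is why the counts in the example output never decrease as le grows. The TypeScript sketch below illustrates that bookkeeping. The `Histogram` class is a hypothetical illustration that omits labels; it is not llmock's implementation.

```typescript
// Bucket upper bounds in seconds, as listed above.
const BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10];

// Hypothetical, label-free histogram used only to show cumulative counting.
class Histogram {
  // counts[i] is the number of observations <= BUCKETS[i].
  private counts: number[] = BUCKETS.map(() => 0);
  private sum = 0;
  private count = 0;

  observe(seconds: number): void {
    BUCKETS.forEach((le, i) => {
      if (seconds <= le) this.counts[i] += 1;
    });
    this.sum += seconds;
    this.count += 1;
  }

  // Emits the _bucket, _sum, and _count lines in text exposition format.
  serialize(name: string): string[] {
    const lines = BUCKETS.map(
      (le, i) => `${name}_bucket{le="${le}"} ${this.counts[i]}`
    );
    // The +Inf bucket always equals the total observation count.
    lines.push(`${name}_bucket{le="+Inf"} ${this.count}`);
    lines.push(`${name}_sum ${this.sum}`);
    lines.push(`${name}_count ${this.count}`);
    return lines;
  }
}
```

A request that takes 0.3 seconds, for instance, would increment the 0.5, 1, 2.5, 5, 10, and +Inf buckets but none of the smaller ones.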

Implementation Details