Interactive challenge

LLM Observability and Cost Dashboard Lab

Create a dashboard model that joins user latency, queue wait, GPU pressure, token throughput, and cost signals.

Prerequisites

A metrics and logs stack that collects from the llm-serving namespace

Guided step

Build the signal model

Connect user-facing latency to its likely causes (runtime queueing, GPU pressure, and token throughput) and to the logs and traces that explain individual requests.
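
One way to picture the joined model is a table with one row per time window and one column per signal. The sketch below is only an illustration under assumptions: the input format ("timestamp signal value") and the signal names ttft, queue_wait, and gpu_util are hypothetical and do not come from any particular runtime.

```shell
# Hypothetical samples, one "timestamp signal value" row per observation.
sample_rows() {
  cat <<'EOF'
10:00 ttft 0.42
10:00 queue_wait 0.10
10:00 gpu_util 0.71
10:01 ttft 1.30
10:01 queue_wait 0.95
EOF
}

# Pivot the rows into one line per timestamp so the signals sit side by side.
# A signal with no sample in a window is printed as NA instead of being dropped.
pivot_signals() {
  awk '
    function get(t, s) { return ((t, s) in val) ? val[t, s] : "NA" }
    { ts[$1] = 1; val[$1, $2] = $3 }
    END {
      for (t in ts)
        printf "%s ttft=%s queue=%s gpu=%s\n", t,
          get(t, "ttft"), get(t, "queue_wait"), get(t, "gpu_util")
    }
  ' | sort
}

sample_rows | pivot_signals
```

Printing NA for a missing signal keeps gaps visible on the dashboard, which matters when a scrape target is down during the same window in which latency rises.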

Commands

kubectl -n llm-serving get pod --show-labels
curl -sS "$METRICS_ENDPOINT" | grep -E "ttft|queue|tokens|gpu"
kubectl -n llm-serving logs deploy/<runtime-deployment> --tail=80
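
The grep in the curl command matches all four signal families at once. To confirm that each family is present on its own, the output can be counted per family. This is a minimal sketch that assumes Prometheus-style text exposition; the metric names in the sample payload are invented for illustration and will differ in a real runtime.

```shell
# Hypothetical payload standing in for the output of: curl -sS "$METRICS_ENDPOINT"
sample_metrics() {
  cat <<'EOF'
# HELP llm_ttft_seconds Time to first token.
llm_ttft_seconds_sum{route="/v1/chat",model="m1"} 12.5
llm_queue_wait_seconds_sum{route="/v1/chat",model="m1"} 4.0
llm_generated_tokens_total{route="/v1/chat",model="m1"} 9000
gpu_memory_used_ratio{gpu="0"} 0.91
EOF
}

# Count sample lines whose metric name matches the given pattern.
# Comment lines and label sets are ignored, so a label such as gpu="0"
# cannot make an unrelated metric look like a GPU metric.
count_family() {
  awk -v fam="$1" '
    !/^#/ { name = $1; sub(/\{.*/, "", name); if (name ~ fam) n++ }
    END { print n + 0 }
  '
}

for family in ttft queue tokens gpu; do
  printf '%s: %s\n' "$family" "$(sample_metrics | count_family "$family")"
done
```

A count of 0 for any family means that signal is missing from the endpoint and cannot be placed on the dashboard yet.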

Expected signals

  • Metrics carry stable labels for route and model, and any tenant label is safe to expose.
  • Time to first token (TTFT) and queue wait are reported as separate metrics.
  • GPU pressure can be correlated with user latency.
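
A quick way to test the last signal is to compute a correlation coefficient over paired samples taken in the same time windows. The sketch below uses Pearson correlation as one possible measure; the sample pairs (GPU utilization, then TTFT in seconds) are made up, and a real check would export both series from the metrics stack first.

```shell
# Hypothetical paired samples: "gpu_utilization ttft_seconds", one pair per window.
sample_pairs() {
  cat <<'EOF'
0.50 0.40
0.70 0.55
0.90 1.20
0.95 1.60
EOF
}

# Pearson correlation over "x y" pairs read from stdin.
# Prints NA when there are fewer than two pairs or either series is constant.
pearson() {
  awk '
    NF >= 2 { n++; sx += $1; sy += $2; sxx += $1*$1; syy += $2*$2; sxy += $1*$2 }
    END {
      den = (n*sxx - sx*sx) * (n*syy - sy*sy)
      if (n < 2 || den <= 0) { print "NA"; exit }
      printf "%.3f\n", (n*sxy - sx*sy) / sqrt(den)
    }
  '
}

sample_pairs | pearson
```

A value near 1 suggests latency rises with GPU pressure, but it does not prove the cause; queue wait should be checked over the same windows before drawing a conclusion.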

Checks

Paste metric names or dashboard notes for LLM serving.

Confirm that prompt text is not used as a metric label.
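
The label check can be scripted against the metrics output. This is a rough sketch, assuming Prometheus-style text exposition and a hypothetical deny-list of label keys (prompt, prompt_text, user_input); adjust the list to whatever keys your runtime might emit.

```shell
# Hypothetical deny-list of label keys that could carry prompt text.
FORBIDDEN_KEYS='prompt|prompt_text|user_input'

# Hypothetical payload with one offending series.
sample_labels() {
  cat <<'EOF'
llm_requests_total{route="/v1/chat",model="m1"} 7
llm_requests_total{route="/v1/chat",prompt="tell me a story"} 3
llm_prompt_tokens_total{route="/v1/chat",model="m1"} 5120
EOF
}

# Print every sample line that uses a forbidden label key.
# The key must follow "{" or ",", so a metric name such as
# llm_prompt_tokens_total is not flagged.
find_forbidden_labels() {
  grep -v '^#' | grep -E "[{,](${FORBIDDEN_KEYS})="
}

sample_labels | find_forbidden_labels
```

An empty result is the passing state. The pattern is a plain text match, so a label value that itself contains text like ,prompt= would be flagged as well; treat hits as candidates to review.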
