Interactive challenge
LLM Observability and Cost Dashboard Lab
Create a dashboard model that joins user latency, queue wait, GPU pressure, token throughput, and cost signals.
Prerequisites
Metrics and logs stack
Guided step
Build the signal model
Connect user latency to runtime queueing, GPU pressure, token throughput, logs, and traces.
Commands
kubectl -n llm-serving get pod --show-labels
curl -sS "$METRICS_ENDPOINT" | grep -E "ttft|queue|tokens|gpu"
kubectl -n llm-serving logs deploy/<runtime-deployment> --tail=80
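To collect metric names for the checks below, the curl output can be reduced to a de-duplicated list. This is a minimal sketch, assuming the endpoint serves Prometheus text exposition format. The sample_metrics function and every metric name in it are hypothetical placeholders, since the real names depend on the serving runtime.

```shell
# Hypothetical sample of Prometheus text exposition output. Real metric
# names depend on the serving runtime and are not specified by this lab.
sample_metrics() {
  cat <<'EOF'
# HELP llm_ttft_seconds Time to first token.
# TYPE llm_ttft_seconds histogram
llm_ttft_seconds_sum{route="/v1/chat",model="model-a"} 12.5
llm_queue_wait_seconds_sum{route="/v1/chat",model="model-a"} 3.2
llm_tokens_generated_total{route="/v1/chat",model="model-a"} 48211
gpu_memory_used_bytes{gpu="0"} 6.1e+10
http_requests_total{code="200"} 901
EOF
}

# Print the distinct metric names that belong to the lab's signal families.
# Comment lines are dropped, then labels and values are stripped from each
# series so that only the metric name remains.
list_signal_metrics() {
  grep -v '^#' \
    | sed -E 's/[{ ].*$//' \
    | grep -E 'ttft|queue|tokens|gpu' \
    | LC_ALL=C sort -u
}

sample_metrics | list_signal_metrics
```

Against a live endpoint, the same filter would be used as: curl -sS "$METRICS_ENDPOINT" | list_signal_metrics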
Expected signals
- Metrics carry stable route and model labels, plus tenant labels that are safe to expose.
- TTFT and queue wait are visible separately.
- GPU pressure can be correlated with user latency.
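One way to confirm that TTFT and queue wait are reported separately is to compute a mean for each from its own histogram totals. The sketch below assumes Prometheus-style _sum and _count series with no trailing timestamps and no spaces inside label values. The metric names and the sample numbers are hypothetical.

```shell
# Hypothetical histogram totals in Prometheus text format (no timestamps).
# The metric names are placeholders, not names from a specific runtime.
sample_latency_metrics() {
  cat <<'EOF'
llm_ttft_seconds_sum{route="/v1/chat",model="model-a"} 12.5
llm_ttft_seconds_count{route="/v1/chat",model="model-a"} 50
llm_queue_wait_seconds_sum{route="/v1/chat",model="model-a"} 3.2
llm_queue_wait_seconds_count{route="/v1/chat",model="model-a"} 50
EOF
}

# Mean of a histogram: the total of its _sum series divided by the total
# of its _count series, summed across all label sets.
# Usage: mean_of <metric_base_name> < metrics.txt
mean_of() {
  awk -v m="$1" '
    index($0, m "_sum") == 1   { sum += $NF }
    index($0, m "_count") == 1 { count += $NF }
    END {
      if (count > 0) printf "%.3f\n", sum / count
      else print "n/a"
    }
  '
}

sample_latency_metrics | mean_of llm_ttft_seconds        # prints 0.250
sample_latency_metrics | mean_of llm_queue_wait_seconds  # prints 0.064
```

If the two means move together while GPU metrics stay flat, queueing is a likely contributor to user latency. If only one base name returns a value, the two signals are probably not exported separately.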
Checks
Paste metric names or dashboard notes for LLM serving.
Confirm that prompt text is not used as a metric label.
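The prompt-label check can be partly automated by scanning label names in the metrics output. This is a rough sketch, assuming Prometheus text format. The deny-list in UNSAFE_LABELS and the sample series are hypothetical and should be adjusted to the names your runtime might emit. The scan looks only at label names, so it will not catch prompt text stored under an innocuous name.

```shell
# Hypothetical deny-list of label names that would carry free text.
UNSAFE_LABELS='prompt|prompt_text|input_text|completion_text'

# Print every label name in the input that is on the deny-list.
# Reads Prometheus text format on stdin; exits non-zero when nothing matches.
find_unsafe_labels() {
  grep -v '^#' \
    | grep -oE '[a-zA-Z_][a-zA-Z0-9_]*="' \
    | sed 's/="$//' \
    | LC_ALL=C sort -u \
    | grep -E "^(${UNSAFE_LABELS})\$"
}

# Hypothetical sample in which one series leaks prompt text as a label.
unsafe_sample() {
  cat <<'EOF'
# TYPE llm_requests_total counter
llm_requests_total{route="/v1/chat",model="model-a"} 40
llm_requests_total{route="/v1/chat",prompt="tell me a joke"} 1
EOF
}

if unsafe_sample | find_unsafe_labels; then
  echo "FAIL: free-text label found"
else
  echo "OK: no free-text labels"
fi
```

Against a live endpoint, the same check would be used as: curl -sS "$METRICS_ENDPOINT" | find_unsafe_labels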