Progress dashboard
Track Kubernetes LLM lab progress locally.
This dashboard reads browser-local challenge status, completed steps, and blocked work. Accounts and saved cloud progress are intentionally deferred until premium lab packs exist.
Anonymous browser progress
In v1, progress is stored only in this browser. A paid product can later add accounts, saved cloud progress, team reporting, and premium lab packs without changing the challenge model.
vLLM Inference Challenge
Deploy a GPU-backed OpenAI-compatible endpoint and prove scheduling, health, time to first token (TTFT), queueing, and rollback readiness.
RAG Retrieval Challenge
Operate ingestion, metadata filters, vector retrieval, answer evaluation, and failure drills for production RAG.
Production Readiness Challenge
Run a launch review across security, quota, rollout, observability, cost, and ownership before live traffic.
LLM Observability Challenge
Build the signal model needed to debug user latency, runtime saturation, and GPU pressure using traces, logs, and alerts.
vLLM Kubernetes Deployment Lab
Design the deployment contract for vLLM with model cache, readiness, runtime flags, and service exposure.
KServe vs Ray Serve Decision Lab
Choose the serving abstraction by ownership model, CRDs, graph complexity, autoscaling, and rollout needs.
GPU Node Pool Scheduling Lab
Prove accelerator placement with labels, taints, tolerations, quotas, and unschedulable-pod debugging.
RAG Retrieval Quality Lab
Measure retrieval recall, citation accuracy, tenant filtering, and reranking latency before generation.
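As a rough illustration of two of the measurements named above, the sketch below computes recall at k and counts retrieved documents that belong to the wrong tenant. The function names, the (doc_id, tenant) tuple shape, and the sample IDs are hypothetical and are not part of the lab itself.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents found in the top-k retrieved results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


def tenant_leaks(retrieved, expected_tenant):
    """Return the doc IDs whose tenant differs from the caller's tenant.

    `retrieved` is a list of (doc_id, tenant) tuples (hypothetical shape).
    """
    return [doc_id for doc_id, tenant in retrieved if tenant != expected_tenant]


# Hypothetical sample data for one query.
retrieved = [("a", "acme"), ("b", "acme"), ("c", "globex")]
ids = [doc_id for doc_id, _ in retrieved]

recall = recall_at_k(ids, relevant_ids=["a", "c", "d"], k=2)
leaks = tenant_leaks(retrieved, expected_tenant="acme")
```

A nonzero leak list points at a tenant-filter failure even when recall looks healthy, which is one reason to track the two numbers separately.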
Inference Cost Model Lab
Calculate cost per request from input tokens, output tokens, GPU profile, utilization, and cache behavior.
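One possible way to combine these inputs is sketched below. It is a simplified model under stated assumptions, not the lab's own formula: the GPU profile is reduced to an hourly price plus aggregate prefill and decode throughput, cached input tokens are assumed to skip prefill entirely, and idle time is charged to requests by dividing by utilization. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GpuProfile:
    """Hypothetical GPU profile; values are illustrative, not benchmarks."""
    hourly_price_usd: float   # price of one GPU for one hour
    prefill_tokens_per_s: float  # aggregate input-token throughput
    decode_tokens_per_s: float   # aggregate output-token throughput


def cost_per_request(input_tokens, output_tokens, gpu, utilization, cache_hit_rate):
    """Estimate USD cost of one request under the simplified model above."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if not 0 <= cache_hit_rate <= 1:
        raise ValueError("cache_hit_rate must be in [0, 1]")
    # Assumption: cached input tokens cost no prefill time.
    prefill_tokens = input_tokens * (1 - cache_hit_rate)
    gpu_seconds = (prefill_tokens / gpu.prefill_tokens_per_s
                   + output_tokens / gpu.decode_tokens_per_s)
    # Idle capacity is still paid for, so scale busy time by 1 / utilization.
    billed_seconds = gpu_seconds / utilization
    return billed_seconds * gpu.hourly_price_usd / 3600


profile = GpuProfile(hourly_price_usd=3.60,
                     prefill_tokens_per_s=5000,
                     decode_tokens_per_s=1000)
cost = cost_per_request(input_tokens=1000, output_tokens=200, gpu=profile,
                        utilization=0.5, cache_hit_rate=0.5)
```

With these illustrative numbers the request uses 0.3 GPU-seconds of busy time, which doubles to 0.6 billed seconds at 50% utilization, or about $0.0006 per request.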
LLM Rollout and Rollback Lab
Design traffic shifting, readiness gates, rollback triggers, and model-version ownership for inference services.
Multi-Tenant LLM Security Lab
Review tenant routing, namespace boundaries, secrets, NetworkPolicy, prompt logging, and retrieval authorization.
LLM Observability and Cost Dashboard Lab
Create a dashboard model that joins user latency, queue wait, GPU pressure, token throughput, and cost signals.