# Kubernetes LLM Platform

## Docs

- [Architecture](https://ai.wmx.dev/architecture/index.md)
- [Cold start](https://ai.wmx.dev/architecture/lessons/cold-start.md)
- [GPU signals](https://ai.wmx.dev/architecture/lessons/gpu-signals-and-autoscaling.md)
- [Concepts](https://ai.wmx.dev/architecture/lessons/index.md)
- [Operational gotchas](https://ai.wmx.dev/architecture/lessons/operational-gotchas.md)
- [Portability](https://ai.wmx.dev/architecture/lessons/portability.md)
- [Serving layers compared](https://ai.wmx.dev/architecture/serving-layers.md)
- [Benchmark results](https://ai.wmx.dev/benchmarks.md)
- [Design rationale](https://ai.wmx.dev/decisions/index.md)
- [1. Configure](https://ai.wmx.dev/getting-started/configure.md)
- [Get started](https://ai.wmx.dev/getting-started/index.md)
- [3. Install the platform](https://ai.wmx.dev/getting-started/install-platform.md)
- [2. Provision infrastructure](https://ai.wmx.dev/getting-started/provision-infra.md)
- [Benchmarking](https://ai.wmx.dev/guides/benchmarking.md)
- [Change models](https://ai.wmx.dev/guides/change-or-add-a-model.md)
- [Coding assistant](https://ai.wmx.dev/guides/coder-stack.md)
- [GPU debugging](https://ai.wmx.dev/guides/gpu-debugging.md)
- [Guardrails](https://ai.wmx.dev/guides/guardrails.md)
- [Guides](https://ai.wmx.dev/guides/index.md)
- [Inference gateway](https://ai.wmx.dev/guides/inference-gateway.md)
- [KEDA autoscaling](https://ai.wmx.dev/guides/keda-autoscaling.md)
- [API key portal](https://ai.wmx.dev/guides/key-portal.md)
- [KServe](https://ai.wmx.dev/guides/kserve.md)
- [OCI modelcar](https://ai.wmx.dev/guides/kserve-modelcar.md)
- [LiteLLM tenant gateway](https://ai.wmx.dev/guides/litellm.md)
- [Spend dashboard](https://ai.wmx.dev/guides/litellm-spend-dashboard.md)
- [n8n automation](https://ai.wmx.dev/guides/n8n.md)
- [Pending GPU workloads](https://ai.wmx.dev/guides/pending-gpu-workloads.md)
- [Production HA](https://ai.wmx.dev/guides/prod-ha-validation.md)
- [Secret contract](https://ai.wmx.dev/guides/secret-contract.md)
- [Security enforcement](https://ai.wmx.dev/guides/security-enforcement.md)
- [SSO with Dex](https://ai.wmx.dev/guides/sso-dex.md)
- [Staged bring-up](https://ai.wmx.dev/guides/staged-bring-up.md)
- [Switch serving layer](https://ai.wmx.dev/guides/switch-serving-layer.md)
- [Validation and teardown](https://ai.wmx.dev/guides/teardown.md)
- [Raw vLLM](https://ai.wmx.dev/guides/vllm-serving.md)
- [Kubernetes LLM Platform](https://ai.wmx.dev/index.md)
- [agentgateway egress and MCP](https://ai.wmx.dev/reference/agentgateway.md)
- [Glossary](https://ai.wmx.dev/reference/glossary.md)
- [Reference](https://ai.wmx.dev/reference/index.md)
- [Make targets](https://ai.wmx.dev/reference/make-targets.md)
- [Model catalog](https://ai.wmx.dev/reference/model-catalog.md)
- [Repo layout](https://ai.wmx.dev/reference/repo-layout.md)
- [Future Roadmap](https://ai.wmx.dev/reference/roadmap.md)
- [Secrets](https://ai.wmx.dev/reference/secrets.md)
- [Security posture](https://ai.wmx.dev/reference/security.md)
- [Trust model](https://ai.wmx.dev/reference/trust-model.md)