Deployment profiles
A profile is a cumulative selection of layers, applied withmake root PROFILE=… (default
platform). Each profile is a superset of the one before: you widen, you don’t switch.
PROFILE | Layers applied | Adds |
|---|---|---|
platform | platform | GitOps base: GPU platform, Kueue, observability, secrets |
serving | platform + serving | raw vLLM / KServe serving |
llm-gateway | platform + serving + routing + llm-gateway | GIE routing + LiteLLM tenant gateway |
full | all of the above + demos | example tenants and demos |
make argocd-repo first while the repo is private (registers the Argo CD repo credential).
Make targets
make help lists every target with its one-line description; the tables below group the ones you
run most.
Cloud substrate (OpenTofu, paid resources)
| Target | What |
|---|---|
tf-init / tf-validate / tf-plan | initialize, validate, plan the GKE root |
tf-apply | apply the cloud substrate (costs money; AUTO_APPROVE=1 for non-interactive/CI) |
tf-destroy | destroy the substrate (read the teardown runbook first) |
tf-fmt | check OpenTofu/Terraform formatting (the CI format gate) |
tf-credentials | print the gcloud get-credentials command from TF output |
Fork & config
| Target | What |
|---|---|
fork-init | propagate environments/$(CLUSTER)/config.yaml (repo URL + GCP project) across the repo |
config-check | validate config propagation and tfvars drift |
resolve-groups | resolve config.yaml features: into groups.generated.yaml (chains the four resolvers below) |
resolve-profile / resolve-secret-store / resolve-gpu / resolve-guardrails | resolve the profile (LiteLLM + CNPG HA overlays), the ESO ClusterSecretStore, the GPU stack (gpu_stack → gpu-operator group + DCGM scrape target), and the guardrails overlay |
GitOps bring-up
| Target | What |
|---|---|
require-kube | validate + print the dedicated cluster target (./kubeconfig); the gate every cluster command runs first |
bootstrap | install Argo CD on the repo’s dedicated cluster (./kubeconfig; pinned chart) |
argocd-repo | create/update the Argo CD repo credential (private repo) |
seed-secrets | seed the internal random secrets (LiteLLM/vLLM/Dex/oauth2-proxy/DB) into the backend, idempotent; prints the external ones you must supply |
reset-dex-admin | rotate a lost Dex static-admin password, persist password+hash in GSM, force ESO refresh/restart Dex |
root | apply the platform AppProject + app-of-apps roots for PROFILE |
wait / smoke / doctor | wait for sync · run smoke checks · validate prerequisites |
verify | end-to-end platform check (GitOps + economics + budget-429 + serving + edge/SSO), plain pass/fail |
seed-experience | (optional) manually re-mint Open WebUI’s experience secret from live LiteLLM; normally the in-cluster litellm-keys Job mints both apps’ keys automatically |
argocd-password / argocd-ui | print admin password · port-forward the UI to localhost:8080 |
credentials | collate operator credentials into the gitignored secrets/credentials.local.md (SSO is the real path) |
Serving & GPU (cost-sensitive)
| Target | What |
|---|---|
vllm-up / vllm-down | scale raw vLLM to 1 (brings up an L4, costs) / to 0 ($0 idle) |
vllm-smoke | send an authenticated OpenAI chat request to raw vLLM |
gpu-smoke | run the nvidia-smi GPU smoke Job (verifies the GPU stack; triggers a GPU node, scales back to 0) |
keda-demo-up / keda-demo-down | un-pause / re-pause the raw-vLLM KEDA ScaledObject (load-test only) |
modelcar-build | build + push the OCI modelcar, print the @sha256 digest to pin |
bench | run the vLLM concurrency-sweep benchmark Job (after vllm-up) |
bench-guidellm | run the GuideLLM SLO-frontier sweep Job (standard serving benchmark; after vllm-up) |
guardrails-smoke | prove the LiteLLM guardrails: PII masked + prompt-injection blocked (needs features.guardrails: true synced) |
Cost control
| Target | What |
|---|---|
pause | pause to $0: release Gateway LBs, destroy the substrate with tofu, audit orphans (destructive; Secret Manager values survive) |
resume | resume from a paused (destroyed) cluster: tofu apply + bootstrap + platform base |
Docs
| Target | What |
|---|---|
docs-serve | serve the docs site locally with live reload (Mintlify, http://localhost:3000) |
docs-build | validate the docs site (broken-link check, same as CI) |
Fork configuration
environments/<env>/config.yaml is the single source of fork config: repo URL, cloud project,
domain, DNS provider. Edit it, then make fork-init rewrites the repo to match. Secret values
never live here; they come from a cloud secret manager via External Secrets Operator.