
With your cluster prerequisites in place (GPU Operator + DRA Driver), install the GreenThread engine via Helm.

This page covers the engine only. To layer the UI and customer portal on top, install LiquidCompute and AI Console afterwards.

Pull credentials

The chart and the engine images all live on licence.greenthread.ai; see Licensing › Pulling charts and images for the URL convention. Setup takes two steps, both of which use your licence token:

# 1. Helm registry login (one-time, per workstation)
helm registry login licence.greenthread.ai \
  --username licence \
  --password <your-licence-token>

# 2. In-cluster image-pull secret (the namespace must exist first)
kubectl create namespace greenthread-system
kubectl create secret docker-registry greenthread-registry \
  -n greenthread-system \
  --docker-server=licence.greenthread.ai \
  --docker-username=licence \
  --docker-password=<your-licence-token>
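
If you would rather not type the token inline, the two steps above can be wrapped roughly as follows. This is a sketch under assumptions, not part of the chart: GREENTHREAD_LICENCE_TOKEN and the two function names are hypothetical, and the create-then-apply pipeline is only there so that re-running the steps does not fail on an existing namespace or Secret.

```shell
# Sketch only: GREENTHREAD_LICENCE_TOKEN is a hypothetical variable name,
# export it before calling either function.

registry_login() {
  if [ -z "${GREENTHREAD_LICENCE_TOKEN:-}" ]; then
    echo "GREENTHREAD_LICENCE_TOKEN is not set" >&2
    return 1
  fi
  # --password-stdin keeps the token out of shell history and process listings.
  printf '%s' "$GREENTHREAD_LICENCE_TOKEN" |
    helm registry login licence.greenthread.ai \
      --username licence \
      --password-stdin
}

create_pull_secret() {
  if [ -z "${GREENTHREAD_LICENCE_TOKEN:-}" ]; then
    echo "GREENTHREAD_LICENCE_TOKEN is not set" >&2
    return 1
  fi
  # Render each object with --dry-run=client and pipe it to apply, so a
  # second run updates the namespace and Secret instead of erroring out.
  kubectl create namespace greenthread-system --dry-run=client -o yaml |
    kubectl apply -f - &&
  kubectl create secret docker-registry greenthread-registry \
    -n greenthread-system \
    --docker-server=licence.greenthread.ai \
    --docker-username=licence \
    --docker-password="$GREENTHREAD_LICENCE_TOKEN" \
    --dry-run=client -o yaml |
    kubectl apply -f -
}

# Usage:
#   export GREENTHREAD_LICENCE_TOKEN=<your-licence-token>
#   registry_login && create_pull_secret
```

The Secret name, namespace, server, and username are the same as in the commands above; only the token handling differs.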

Install

helm upgrade --install greenthread \
  oci://licence.greenthread.ai/greenthread/charts/greenthread \
  --namespace greenthread-system \
  --set licence.token=<your-licence-token> \
  --set controller.huggingFaceToken=<your-hf-token> \
  --set 'global.imagePullSecrets[0].name=greenthread-registry'

The chart's agent, controller, webhook, and sidecar images all default to licence.greenthread.ai/greenthread/*, so there are no image-repository overrides to pass — the pull secret above is what authenticates them. Image tags default to the chart's appVersion, and the licence server only serves tags you have access to. If you have direct GHCR access instead, point the repos back at ghcr.io/greenthread-ai/* with --set agent.image.repository=… (and the controller / webhook / controller.sidecarImage equivalents).

That's it. The chart installs:

| Workload | Type | What it does |
| --- | --- | --- |
| greenthread-controller | Deployment (2 replicas, leader-elected) | Reconciles Model, GPUShareClaim, GPU, StorageNode. One leader phones home to the licence server. |
| greenthread-webhook | Deployment (2 replicas) | Validating admission webhook for Model + GPUShareClaim. |
| greenthread-agent | DaemonSet (on every node with nvidia.com/gpu.present=true) | Discovers GPUs via NVML, runs per-GPU scheduler daemons, registers as a DRA kubelet plugin, publishes a ResourceSlice. |

Plus the licence Secret (greenthread-licence in the operator namespace) and supporting RBAC.

No more storage DaemonSet

The standalone storage server has been folded into the per-Model sidecar. There's no gthread-storage DaemonSet to operate, no hugepages to configure, no NVMe host-mount to plan around. Weight staging is handled inline.

Helm values reference

The full chart values.yaml is at greenthread/charts/greenthread/values.yaml. The values most installs care about:

| Value | Default | Purpose |
| --- | --- | --- |
| licence.enabled | true | Set false to skip the licence Secret + RBAC (dev clusters only). |
| licence.token | "" | Customer licence token. Required when licence.enabled=true. |
| licence.serverURL | https://licence.greenthread.ai | Licence server base URL. See Licensing. |
| slotsPerGPU | 10 | How many DRA slots the agent publishes per physical GPU. Higher values give finer-grained sharing for small models; lower values suit whole-GPU workloads. |
| controller.replicas | 2 | Controller replica count (leader-elected). |
| controller.vllmImage | vllm/vllm-openai:v0.19.0 | Default vLLM image stamped on every Model pod when Model.spec.vllmImage is empty. |
| controller.vllmOmniImage | vllm/vllm-omni:v0.20.0 | Same, but for modelType: tts (vllm-omni). |
| controller.huggingFaceToken | "" | Passed to conversion Jobs as HF_TOKEN for gated models (Llama, Gemma, etc.). |
| controller.imagePullSecret | "" | Image-pull Secret attached to every Model pod and copied into every Model namespace. |
| controller.runtimeClassName | "" | Optional runtimeClassName (e.g. nvidia) for GPU pods. |
| agent.nodeSelector | { nvidia.com/gpu.present: "true" } | Which nodes the agent DaemonSet runs on. |
| agent.cdiSpecDir | /var/run/cdi | Host path the DRA driver writes per-claim CDI specs to. Must be in containerd's cdi_spec_dirs. |
| webhook.failurePolicy | Ignore | Set Fail if an unreachable webhook should block all CR writes; keep Ignore if you would rather risk a brief admission gap during a rollout. |
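
Once you override more than a couple of values, a values file is usually easier to review than a long run of --set flags. The sketch below writes one from the shell. The nesting is inferred from the dotted value names in the table, and the settings marked illustrative (slotsPerGPU: 4, failurePolicy: Fail, runtimeClassName: nvidia) are examples, not recommendations; the file name is arbitrary.

```shell
# Write the overrides to a file. The quoted 'EOF' stops the shell from
# expanding anything, so replace the <...> placeholders by hand.
cat > greenthread-values.yaml <<'EOF'
licence:
  token: "<your-licence-token>"
controller:
  huggingFaceToken: "<your-hf-token>"
  runtimeClassName: nvidia      # illustrative; default is ""
global:
  imagePullSecrets:
    - name: greenthread-registry
slotsPerGPU: 4                  # illustrative; default is 10
webhook:
  failurePolicy: Fail           # illustrative; default is Ignore
EOF
chmod 600 greenthread-values.yaml  # the file holds secrets

# Then install with the file instead of the --set flags:
#   helm upgrade --install greenthread \
#     oci://licence.greenthread.ai/greenthread/charts/greenthread \
#     --namespace greenthread-system \
#     -f greenthread-values.yaml
```

Keep the file out of version control unless the tokens are injected some other way.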

Verify

# 1. All pods Running
kubectl get pods -n greenthread-system

# 2. The licence Secret has a cached JWT (within ~1 minute of install)
kubectl get secret greenthread-licence -n greenthread-system -o json \
  | jq '.data | keys'
# → ["cached-jwt", "token"]

# 3. GPUs auto-discovered (one GPU CR per physical device)
kubectl get gpus
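
Because the cached JWT can take up to about a minute to appear, an install script may want to poll for it instead of checking once. A minimal sketch, assuming the Secret name and namespace shown above; the function name, the 120-second default timeout, and the 5-second interval are arbitrary choices.

```shell
# wait_for_cached_jwt [timeout_seconds] [interval_seconds]
# Polls the licence Secret until the cached-jwt key has a value.
wait_for_cached_jwt() {
  timeout="${1:-120}"
  interval="${2:-5}"
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # jsonpath prints an empty string while the key (or the Secret) is missing.
    jwt=$(kubectl get secret greenthread-licence -n greenthread-system \
      -o jsonpath='{.data.cached-jwt}' 2>/dev/null)
    if [ -n "$jwt" ]; then
      echo "cached-jwt present after ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "cached-jwt still missing after ${timeout}s" >&2
  return 1
}

# Usage:
#   wait_for_cached_jwt 120 5 || echo "see Troubleshooting"
```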

Example from a 4-GPU dev cluster (1 dev node + 2 Blackwell workstation nodes):

$ kubectl get pods -n greenthread-system
NAME                                      READY   STATUS    AGE
greenthread-agent-6rxxs                   1/1     Running   4h
greenthread-agent-hhzgj                   1/1     Running   4h
greenthread-agent-xv88f                   1/1     Running   4h
greenthread-controller-85d5fc9d68-j8kw8   1/1     Running   4h
greenthread-controller-85d5fc9d68-vscpb   1/1     Running   4h
greenthread-webhook-c4cb6887b-5sfb5       1/1     Running   4h
greenthread-webhook-c4cb6887b-5vhhm       1/1     Running   4h

$ kubectl get gpus
NAME                  NODE            INDEX   PRODUCT                         MODE        HEALTH    AVAILABLE
gpu-blackwell-0-0     blackwell-0     0       NVIDIA-RTX-PRO-4000-Blackwell   inference   Healthy   4178575360
gpu-blackwell-1-0     blackwell-1     0       NVIDIA-RTX-PRO-4000-Blackwell   inference   Healthy   3740270592
gpu-gt-gpu-dev-01-0   gt-gpu-dev-01   0       NVIDIA-A40                      inference   Healthy   2856321024
gpu-gt-gpu-dev-01-1   gt-gpu-dev-01   1       NVIDIA-A40                      inference   Healthy   51527024640

The AVAILABLE column shows free VRAM in bytes; MODE is the current sharing class published by the DRA driver.
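
Raw byte counts are hard to compare at a glance. As a convenience, the following sketch converts the AVAILABLE column to GiB; it assumes the column layout shown in the example above (header on the first line, AVAILABLE as the last column), and the function name is made up for illustration.

```shell
# Reads `kubectl get gpus` output on stdin and prints each GPU's name
# with its free VRAM converted from bytes to GiB (1 GiB = 1024^3 bytes).
gpu_free_gib() {
  awk 'NR > 1 { printf "%s\t%.2f GiB\n", $1, $NF / (1024 * 1024 * 1024) }'
}

# Usage:
#   kubectl get gpus | gpu_free_gib
```

For the sample output above, gpu-gt-gpu-dev-01-1 (51527024640 bytes) comes out at 47.99 GiB and gpu-blackwell-0-0 (4178575360 bytes) at 3.89 GiB.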

Troubleshooting

licence.token is required

You ran helm install without --set licence.token=…. Either provide one or pass --set licence.enabled=false to install in unlicenced mode (dev clusters only).

kubectl get secret greenthread-licence shows only token, no cached-jwt

The controller couldn't reach the licence server. Check logs:

kubectl logs -n greenthread-system -l app.kubernetes.io/component=controller \
  --tail=50 | grep licence

Common causes: outbound HTTPS blocked by a NetworkPolicy, wrong licence.serverURL, or a revoked / mistyped token.

Webhook denies every Model create with "cluster is unlicenced"

Same as above — the cached JWT is missing or expired. The webhook fails closed when the licence reader has no valid cached JWT to verify.
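
To tell a missing JWT from an expired one, you can decode the token's payload and look at its claims. This is a sketch that assumes the cached-jwt value is a standard compact JWT (header.payload.signature) and that it carries the usual exp claim; neither is confirmed here, and the helper name is hypothetical. It only decodes, it does not verify the signature.

```shell
# jwt_payload <compact-jwt>: prints the decoded payload (claims) as JSON.
jwt_payload() {
  # Take the middle segment and map base64url characters back to base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the trailing '=' padding; restore it before decoding.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Usage against the cluster (Secret values are themselves base64-encoded):
#   jwt=$(kubectl get secret greenthread-licence -n greenthread-system \
#           -o jsonpath='{.data.cached-jwt}' | base64 -d)
#   jwt_payload "$jwt"; echo
#   date +%s   # compare with the "exp" claim, if the token carries one
```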

Agent stuck CrashLoopBackOff

Almost always a CDI spec dir mismatch. Verify agent.cdiSpecDir (default /var/run/cdi) is one of the dirs in your containerd config's cdi_spec_dirs (the GPU Operator usually configures /etc/cdi and /var/run/cdi).
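
A quick way to check this on a GPU node is to grep the containerd config for the directory. The sketch below assumes the config lives at /etc/containerd/config.toml (the conventional location, which your distribution may override) and that cdi_spec_dirs is written on a single line; the function name is hypothetical.

```shell
# cdi_dir_configured <dir> [config-file]
# Succeeds if <dir> appears, quoted, on a cdi_spec_dirs line of the config.
cdi_dir_configured() {
  dir="$1"
  config="${2:-/etc/containerd/config.toml}"
  grep -E '^[[:space:]]*cdi_spec_dirs[[:space:]]*=' "$config" |
    grep -qF "\"$dir\""
}

# Usage (run on the node, not from your workstation):
#   cdi_dir_configured /var/run/cdi && echo "ok" || echo "not in cdi_spec_dirs"
```

If the check fails, either add the directory to cdi_spec_dirs or set agent.cdiSpecDir to a directory that is already listed.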

DRA Driver controller Pending on a managed Kubernetes service

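Managed control planes expose no schedulable nodes for the DRA Driver controller, so its pod stays Pending. Clear the controller's node selector and affinity and add a toleration so it can be scheduled elsewhere: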

--set controller.nodeSelector=null \
--set "controller.affinity=null" \
--set "controller.tolerations[0].key=CriticalAddonsOnly" \
--set "controller.tolerations[0].operator=Exists"

(Pass these to the nvidia-dra-driver-gpu chart, not the GreenThread chart.)