The GreenThread platform is three independent components that snap together. You install only what you need — every layer below the one you stop at is optional.
| Component | Role | Audience |
|---|---|---|
| GreenThread | The engine. Kubernetes-native GPU-sharing platform; deploys models, schedules them across GPUs, sleeps idle ones to free VRAM. | Platform team |
| LiquidCompute | Self-hosted "Render.com for GPUs". Adds a Service abstraction, Projects, custom domains, a Fleet view, and an OpenAI-aggregator proxy on top of the engine. | Platform team / app teams |
| AI Console | Customer-facing AI portal. API keys, usage dashboards, audit log, and chat / TTS / STT playgrounds in front of LiquidCompute. | End users / customers |
How they fit together
Request flow for a chat completion
- A customer's app POSTs to
https://api.example.com/v1/chat/completionswith an API key. The DNS points at AI Console. - AI Console authenticates the key, records the request for usage, and proxies the payload to LiquidCompute's
lc-proxy. lc-proxyselects the right Model based on the request'smodelfield and forwards to the engine pod's sidecar.- The engine's sidecar:
- If the Model is serving, forwards straight to vLLM.
- If the Model is sleeping, restores weights to GPU (usually < 1 s), then forwards.
- If the GPU is full, the fairness policy sleeps an idle Model first.
- The streamed response flows back through
lc-proxy, AI Console, to the customer.
The customer never sees the engine, the GPU scheduling, or the sleep/wake — they get a standard OpenAI API.
Install order
Always install bottom-up. Each layer assumes the one below it exists.
- Stop at step 2 if you only want
kubectl apply -f model.yamland a per-Model URL. The engine is a complete, standalone platform. - Stop at step 3 if you want a self-service Projects/Services UI and a unified
/v1/*endpoint, but your end users are internal engineers, not customers. - Go to step 4 when you want a customer-facing portal with API keys, usage, and chat playgrounds — i.e. you're building "our own OpenAI".
Component reference
- GreenThread (engine) — the inference engine
- LiquidCompute — Render.com-for-GPUs platform
- AI Console — customer-facing portal
- Licensing — how the cluster authorises itself
