GreenThread Docs

Onyx is an open-source AI assistant and knowledge platform with RAG (retrieval-augmented generation), connectors for Slack, Google Drive, Confluence, and more, plus a chat UI. By connecting Onyx to GreenThread, all RAG and chat inference runs on your self-hosted models instead of cloud APIs.

Architecture

Users → Onyx (Chat UI / RAG) → Bifrost or GreenThread Ingress → Models

Onyx uses LiteLLM under the hood, so any OpenAI-compatible endpoint works as an LLM provider. You can point it at:

Bifrost — if you want auth, logging, and multi-provider routing
GreenThread ingress — for direct access to your models

Prerequisites

Onyx's Helm chart bundles PostgreSQL, Redis, and nginx as sub-chart dependencies. In most clusters you'll already have these (or prefer your own). Disable them and deploy standalone instances instead.

Create namespace

kubectl create namespace onyx

Deploy PostgreSQL

You can use any PostgreSQL instance — CloudNativePG, a managed service like RDS, or a simple deployment. Here's a minimal deployment for testing:

apiVersion: v1
kind: Secret
metadata:
  name: onyx-postgresql
  namespace: onyx
stringData:
  username: postgres
  password: postgres
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: onyx-postgres
  namespace: onyx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: onyx-postgres
  template:
    metadata:
      labels:
        app: onyx-postgres
    spec:
      containers:
      - name: postgres
        image: postgres:16-alpine
        env:
        - name: POSTGRES_USER
          value: postgres
        - name: POSTGRES_PASSWORD
          value: postgres
        - name: POSTGRES_DB
          value: postgres
        ports:
        - containerPort: 5432
---
apiVersion: v1
kind: Service
metadata:
  name: onyx-postgres
  namespace: onyx
spec:
  selector:
    app: onyx-postgres
  ports:
  - port: 5432

Production use

This minimal PostgreSQL deployment has no persistence or backups. For production, use CloudNativePG, Amazon RDS, or another managed PostgreSQL service.

Deploy Redis

apiVersion: apps/v1
kind: Deployment
metadata:
  name: onyx-redis
  namespace: onyx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: onyx-redis
  template:
    metadata:
      labels:
        app: onyx-redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        args: ["--requirepass", "password"]
        ports:
        - containerPort: 6379
---
apiVersion: v1
kind: Service
metadata:
  name: onyx-redis
  namespace: onyx
spec:
  selector:
    app: onyx-redis
  ports:
  - port: 6379

Apply both:

kubectl apply -f onyx-prereqs.yaml

Install Onyx

Add the Helm repo

helm repo add onyx https://onyx-dot-app.github.io/onyx/
helm repo update

Create a values file

Create onyx-values.yaml:

# Disable bundled operators — we deploy our own
redis:
  enabled: false
postgresql:
  enabled: false
nginx:
  enabled: false

# Disable Onyx's built-in inference model server
# (we use GreenThread for LLM inference)
# Keep indexing model server — Onyx requires it for embeddings at startup
inferenceCapability:
  replicaCount: 0

# Point at our standalone Redis + Postgres
configMap:
  DOMAIN: "onyx.example.com"
  WEB_DOMAIN: "https://onyx.example.com"
  POSTGRES_HOST: "onyx-postgres"
  REDIS_HOST: "onyx-redis"

# Ingress (assumes nginx ingress controller is installed)
ingress:
  enabled: true
  className: nginx
  api:
    host: onyx.example.com
  webserver:
    host: onyx.example.com

# TLS via cert-manager
letsencrypt:
  enabled: true
  email: admin@example.com

# OpenSearch password
auth:
  opensearch:
    values:
      opensearch_admin_password: "<your-opensearch-password>"

POSTGRES_HOST

Set POSTGRES_HOST to the Service name of your PostgreSQL instance. If using CloudNativePG, this is typically <cluster-name>-rw. The Onyx chart expects a secret named onyx-postgresql with username and password keys.

Install

helm upgrade --install onyx onyx/onyx \
  --namespace onyx \
  -f onyx-values.yaml

Wait for all pods to come up:

kubectl get pods -n onyx -w

Connect Onyx to GreenThread

The LLM provider is configured in the Onyx admin UI, not in Helm values.

Open your Onyx instance (e.g. https://onyx.example.com)
Create your admin account on first login
Go to Admin Panel → LLM → Add Custom LLM Provider

Provider settings

Field	Value
Display Name	`GreenThread`
Provider Name	`openai`
API Key	Your Bifrost Virtual Key, or `not-needed` if connecting to the ingress directly
API Base	`http://bifrost.bifrost.svc.cluster.local:8080/v1` or `http://gthread-ingress.greenthread-system.svc.cluster.local/v1`

Bifrost vs direct

Via Bifrost — use if you want auth, request logging, and cost tracking. Set the API Base to http://bifrost.bifrost.svc.cluster.local:8080/v1 and the API Key to a Bifrost Virtual Key. Prefix model names with greenthread/ (e.g. greenthread/gpt-oss-20b).

Direct to GreenThread — simpler setup. Set the API Base to http://gthread-ingress.greenthread-system.svc.cluster.local/v1 and the API Key to not-needed. Use the HuggingFace model ID as the model name (e.g. Qwen/Qwen3.5-9B).

Add models

Under Model Configurations, add each model you want Onyx to use:

Model Name	Max Input Tokens
`greenthread/gpt-oss-20b`	128000
`Qwen/Qwen3-4B-Thinking-2507`	128000
`Qwen/Qwen3.5-9B`	128000

Set the Default Model to your preferred model for chat.

Click Update to save.

Model names

The model names must match what the LLM backend expects. When routing through Bifrost, prefix with the provider name (e.g. greenthread/model-name). When connecting directly to the GreenThread ingress, use the HuggingFace model ID (e.g. Qwen/Qwen3.5-9B).

Verifying the integration

After saving the LLM provider config:

Go to the Onyx chat interface
Start a new conversation
Ask a question — Onyx will send the request to GreenThread via LiteLLM
If the model is sleeping, GreenThread wakes it automatically (first response may take a few seconds longer)

You can verify requests are flowing by checking the GreenThread ingress logs:

kubectl logs -n greenthread-system deployment/gthread-ingress -f | grep chat/completions

Troubleshooting

Onyx pods in CrashLoopBackOff

The most common cause is PostgreSQL or Redis not being reachable. Check:

# Verify Postgres is up
kubectl get pods -n onyx -l app=onyx-postgres

# Verify the secret exists
kubectl get secret -n onyx onyx-postgresql

# Check Onyx API server logs
kubectl logs -n onyx deployment/onyx-api-server --tail=50

"Model not found" errors in chat

Ensure the model names in the Onyx LLM config exactly match what GreenThread serves. Check available models:

# Direct to ingress
curl http://gthread-ingress.greenthread-system.svc.cluster.local/v1/models

# Or via Bifrost
curl http://bifrost.bifrost.svc.cluster.local:8080/v1/models \
  -H "x-provider: greenthread"

Embedding models

Onyx requires an embedding model for document indexing. By default, the Onyx Helm chart deploys its own indexing model server (inferenceCapability). If you set inferenceCapability.replicaCount: 0, Onyx will not be able to index documents unless you configure an embedding model in the LLM admin panel.

For most deployments, leave the default Onyx indexing model server running — it handles embeddings separately from the LLM used for chat.