Catalyst Helm Chart Reference
Configuration reference for the Catalyst Helm chart.
Prerequisites
- Kubernetes 1.20+
- Helm v3.12.0+
- Diagrid CLI
Chart Dependencies
This chart includes the following dependencies:
- OpenTelemetry Collector — Optional telemetry collection and export
Install
helm install catalyst oci://public.ecr.aws/diagrid/catalyst \
-n cra-agent \
--create-namespace \
-f catalyst-values.yaml \
--set join_token="${JOIN_TOKEN}"
JOIN_TOKEN must be obtained from Diagrid Cloud before installing. See the Getting Started guide for signup and token retrieval.
Uninstall
helm uninstall catalyst -n cra-agent
WARNING: The region resource is intended for a single installation. Once you uninstall Catalyst, the region is no longer valid. If you want to uninstall Catalyst but allow re-installation, remove the cleanup hook by setting the following values:
cleanup:
enabled: false
Configuration
Cluster Requirements
Catalyst should be installed in a dedicated Kubernetes cluster. It manages global resources and dynamically provisions workloads, which may conflict with other applications in a shared cluster.
Permissions
Catalyst components require broad permissions to dynamically manage resources. We are working to reduce this scope in future releases.
Images
The chart deploys multiple images. Below is a reference for users who need to mirror images to private registries.
Installation Images
Most images are hosted in the Diagrid public repository:
REPO=us-central1-docker.pkg.dev/prj-common-d-shared-89549/reg-d-common-docker-public
By default, a consolidated image is used:
| Component | Default Image | Description |
|---|---|---|
| Catalyst | $REPO/catalyst-all:<tag> | Catalyst services |
Alternatively, separate images can be used:
| Component | Default Image | Description |
|---|---|---|
| Catalyst Agent | $REPO/cra-agent:<tag> | Catalyst agent service |
| Catalyst Management | $REPO/catalyst-management:<tag> | Catalyst management service |
| Gateway Control Plane | $REPO/catalyst-gateway:<tag> | Gateway control plane service |
| Gateway Identity Injector | $REPO/identity-injector:<tag> | Identity injection service |
Dependencies:
| Component | Default Image | Description |
|---|---|---|
| Envoy Proxy | envoyproxy/envoy:<tag> | Envoy proxy for gateway |
| Piko | $REPO/diagrid-piko:<tag> | Piko reverse tunneling service |
Runtime Images
The Agent provisions these at runtime:
| Component | Default Image | Description |
|---|---|---|
| Dapr Server | $REPO/sidecar:<tag> | Catalyst Dapr server |
| OpenTelemetry Collector | $REPO/catalyst-otel-collector:<tag> | OTel collector for telemetry |
| Dapr Control Plane (Catalyst) | $REPO/dapr:<tag> | Catalyst Dapr control plane services |
| Dapr Control Plane (OSS) | daprio/dapr:<tag> | Dapr control plane services |
Optional Images
Used when OpenTelemetry addons are enabled:
| Component | Default Image | Description |
|---|---|---|
| OpenTelemetry Collector (OSS) | ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s:<tag> | Collector for traces, metrics, and logs |
Private Image Registry
Point the chart at a mirrored registry:
global:
image:
registry: my-registry.example.com
If using OpenTelemetry addons:
opentelemetry-deployment:
enabled: true
image:
repository: my-registry.example.com/opentelemetry-collector-k8s
tag: "0.112.0"
opentelemetry-daemonset:
enabled: true
image:
repository: my-registry.example.com/opentelemetry-collector-k8s
tag: "0.112.0"
For the full mirror procedure (including the mirror-images.sh script) see the Air-gapped installs guide.
Private Helm Registry
Configure chart registry authentication:
global:
charts:
registry: "oci://my-registry.example.com/diagrid/catalyst"
username: "my-username"
password: "my-password"
# Or use existingSecret, clientCert, clientKey, customCA
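As a sketch of the existingSecret variant (the secret name below is illustrative, and the key layout expected inside the secret is chart-specific, so confirm it against the chart's default values):
global:
  charts:
    registry: "oci://my-registry.example.com/diagrid/catalyst"
    existingSecret: "catalyst-chart-registry-creds"  # illustrative name; the secret holds the registry credentials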
See the Air-gapped installs guide for the steps to mirror the chart itself.
Dapr PKI
By default, Dapr Sentry generates a self-signed root CA. For production, integrate with your own PKI by providing an issuer CA and trust anchors:
agent:
config:
internal_dapr:
pki:
issuer:
secret:
name: dapr-trust-bundle
namespace: cert-manager
cert: tls.crt
key: tls.key
trust:
config_map:
name: dapr-trust-bundle
namespace: cert-manager
chain: ca.crt
For an end-to-end walkthrough using cert-manager and trust-manager to provision these certificates, see the Dapr PKI guide.
Pod Security and Seccomp Profiles
Catalyst components expose podSecurityContext and securityContext values so you can enforce enterprise pod security standards, including custom seccomp profiles.
Example using a node-local seccomp profile:
agent:
podSecurityContext:
seccompProfile:
type: Localhost
localhostProfile: profiles/catalyst-agent.json
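The same value shape applies to the other components. As a sketch (assuming the component accepts both blocks, which is worth verifying against the chart's values), the management service with the runtime default seccomp profile and a hardened container securityContext:
management:
  podSecurityContext:
    seccompProfile:
      type: RuntimeDefault   # use the container runtime's default seccomp profile
  securityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true
    capabilities:
      drop: ["ALL"]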
Pod Scheduling
Every Catalyst workload exposes standard Kubernetes scheduling primitives — nodeSelector, tolerations, and affinity — so you can pin pods to specific node pools or tolerate node taints.
One knob for everything: shared.scheduling
shared.scheduling applies to every pod Catalyst runs or provisions:
- Control-plane pods the chart renders directly: agent, management, gateway.envoy, gateway.controlplane, piko.
- Dapr sidecars the agent provisions per app.
- Per-project Dapr control plane (sentry, scheduler, operator, placement) the agent provisions per project.
- Per-project OTel collectors (metrics Deployment + logs DaemonSet) the agent provisions.
Pin everything to one pool with a single block:
shared:
scheduling:
nodeSelector:
workload: catalyst
tolerations:
- key: workload
operator: Equal
value: catalyst
effect: NoSchedule
Per-workload overrides
For cases where one workload needs different scheduling from the rest, set a per-workload block. Merge rules (consistent everywhere):
- nodeSelector and affinity (maps): per-workload wins over shared.scheduling on key collision.
- tolerations (list): per-workload is appended to shared.scheduling.tolerations.
Control-plane per-component overrides:
| Workload | Path |
|---|---|
| agent | agent.{nodeSelector,tolerations,affinity} |
| management | management.{nodeSelector,tolerations,affinity} |
| gateway envoy | gateway.envoy.{nodeSelector,tolerations,affinity} |
| gateway control plane | gateway.controlplane.{nodeSelector,tolerations,affinity} |
| piko | piko.{nodeSelector,tolerations,affinity} |
Agent-provisioned per-workload overrides:
| Workload | Path |
|---|---|
| Dapr sidecars | agent.config.sidecar.{node_selector,tolerations,affinity} |
| Per-project Dapr control plane | agent.config.internal_dapr.{node_selector,tolerations,affinity} |
| Per-project OTel collectors | agent.config.otel.{node_selector,tolerations,affinity} |
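For example, a sketch pinning only the agent-provisioned Dapr sidecars to a separate pool (note the snake_case keys under agent.config; the pool label is illustrative, and the toleration entries are assumed to follow the standard Kubernetes field shape):
agent:
  config:
    sidecar:
      node_selector:
        workload: catalyst-apps   # illustrative pool label
      tolerations:
        - key: workload
          operator: Equal
          value: catalyst-apps
          effect: NoSchedule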
Example — pin everything to one pool, but also spread the gateway across nodes:
shared:
  scheduling:
    nodeSelector: { workload: catalyst }
    tolerations:
      - { key: workload, operator: Equal, value: catalyst, effect: NoSchedule }
gateway:
envoy:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app: gateway-envoy
topologyKey: kubernetes.io/hostname
Caveats worth knowing
- Sidecar affinity override replaces the default pod anti-affinity. When agent.config.sidecar.affinity is unset, the cra chart spreads sidecar replicas across nodes via a built-in podAntiAffinity. Setting affinity replaces that block entirely — include an equivalent podAntiAffinity in your override if you want to keep the spread.
- Sidecar tolerations are always appended to platform-managed ones. The free-plan spot toleration (diagrid.dev/spot) is appended on top of shared + per-workload tolerations when the sidecar is scheduled on spot.
- Affinity map-merge is shallow. If shared.affinity has nodeAffinity and a per-workload block sets podAntiAffinity, both end up on the pod. If both set the same top-level key (e.g. both set nodeAffinity), the per-workload value replaces shared's.
- For agent-provisioned workloads (sidecar, internal_dapr, otel), prefer matchExpressions over matchLabels inside affinity when your label keys contain . (e.g. kubernetes.io/arch). The agent's config loader (viper) treats . as a path separator, so dotted keys inside matchLabels get split. Chart-rendered workloads (agent, management, gateway, piko) don't have this constraint.
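Combining the first and last caveats, a sketch of a sidecar affinity override that keeps a node-spread preference while using matchExpressions (the label selector is a placeholder; match whatever labels your sidecar pods actually carry):
agent:
  config:
    sidecar:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:                    # matchExpressions avoids the dotted-key issue in matchLabels
                    - key: app.kubernetes.io/part-of   # placeholder label key
                      operator: In
                      values: ["catalyst"]
                topologyKey: kubernetes.io/hostname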
Shared OTel subcharts (the optional opentelemetry-deployment and opentelemetry-daemonset subcharts at the chart root) accept the same keys and pass them through to the upstream open-telemetry Helm subchart:
opentelemetry-deployment:
enabled: true
nodeSelector: { workload: catalyst }
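The daemonset subchart takes the same keys; for example, tolerating the dedicated-pool taint used earlier:
opentelemetry-daemonset:
  enabled: true
  tolerations:
    - key: workload
      operator: Equal
      value: catalyst
      effect: NoSchedule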
See the Kubernetes scheduling docs for the full field reference.
Gateway TLS
To terminate TLS at the Catalyst Gateway, provide a certificate and key:
gateway:
tls:
enabled: true
existingSecret: "my-tls-secret"
# Or provide cert/key inline
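A sketch of the inline variant (the cert and key field names are assumed here; confirm them against the chart's default values):
gateway:
  tls:
    enabled: true
    cert: |   # assumed field name for the PEM certificate
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
    key: |    # assumed field name for the PEM private key
      -----BEGIN PRIVATE KEY-----
      ...
      -----END PRIVATE KEY-----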
For step-by-step instructions covering self-signed (dev), bring-your-own certificates, cert-manager integration, private CA trust for sidecars, and rotation, see the Gateway TLS guide.
Managed Domain
Set this when you want Diagrid Cloud to allocate the region's public wildcard hostname and TLS certificate for you, instead of bringing your own. Your --ingress endpoint then only needs to resolve locally (or privately) to the gateway — the controlplane allocates a public wildcard subdomain (under privatediagrid.net for Diagrid Cloud) and a matching wildcard certificate, then delivers them to the dataplane.
Use this if you do not want to own a public wildcard DNS zone for the region, or provision and rotate a wildcard TLS certificate for the gateway.
Enable it at region creation time:
diagrid region create <region-id> --enable-managed-domain --ingress <local-endpoint>
With managed domain, --ingress is a locally resolvable address (e.g. an internal hostname or an IP). Without it, --ingress must be a publicly resolvable wildcard FQDN that you control (e.g. *.my-region.company.com).
When managed domain is enabled, you can omit the gateway.tls block from your Helm values — the certificate is provisioned by the controlplane and delivered to the dataplane gateway. See Gateway TLS for the bring-your-own case.
Note: managed domains add a runtime dependency on the controlplane's DNS and certificate-issuance infrastructure. If you require strict isolation from external services, stick with a bring-your-own domain and certificate.
Workflows
Catalyst uses an external PostgreSQL database to store workflow state and provide rich visualizations. To enable this feature, configure the connection details as follows:
agent:
config:
project:
default_managed_state_store_type: postgresql-shared-external
external_postgresql:
enabled: true
auth_type: connectionString
namespace: postgres
connection_string_host: postgres-postgresql.postgres.svc.cluster.local
connection_string_port: 5432
connection_string_username: postgres
connection_string_password: postgres
connection_string_database: catalyst
If you wish to disable this feature, you must set:
agent:
config:
project:
default_managed_state_store_type: postgresql-shared-disabled
OpenTelemetry Collector (Optional)
Catalyst includes optional OpenTelemetry Collector addons for collecting and exporting telemetry. See the official documentation for configuration details.
To emit traces from Dapr apps into the collector (or any other OTLP backend), see the tracing guide — tracing is enabled per App ID via the Diagrid CLI, not through chart values.
Secrets
Catalyst supports three secret provider backends: Kubernetes Secrets (default), AWS Secrets Manager, and PostgreSQL.
To use AWS Secrets Manager:
global:
secrets:
provider: "aws.secretmanager"
aws:
region: "us-east-1"
PostgreSQL Secrets Provider
The PostgreSQL secrets provider stores Catalyst application secrets in a PostgreSQL database using envelope encryption (each secret is encrypted with a data encryption key, which is itself encrypted by a key encryption key).
Inline configuration (connection string and keys provided directly in values):
global:
secrets:
provider: postgresql
postgresql:
kek_provider: "local" # "local" (AES-256) or "awskms" (AWS KMS)
connection_string: "postgres://user:password@host:5432/dbname"
primary_encryption_key: "<64 hex characters>"
primary_key_version: 1
Using an existing Kubernetes secret (recommended for production — keeps all sensitive config out of values files):
First, create the Kubernetes secret in the same namespace as the Catalyst installation:
kubectl create secret generic catalyst-pg-secrets -n cra-agent \
--from-literal=connection_string="postgres://user:password@host:5432/dbname" \
--from-literal=kek_provider="local" \
--from-literal=primary_encryption_key="<64 hex characters>" \
--from-literal=primary_key_version="1"
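Equivalently, the same secret as a declarative manifest (a sketch using stringData; the key names mirror the config field names, as noted below):
apiVersion: v1
kind: Secret
metadata:
  name: catalyst-pg-secrets
  namespace: cra-agent
type: Opaque
stringData:
  connection_string: "postgres://user:password@host:5432/dbname"
  kek_provider: "local"
  primary_encryption_key: "<64 hex characters>"
  primary_key_version: "1"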
Then reference it in values:
global:
secrets:
provider: postgresql
postgresql:
existingSecret: "catalyst-pg-secrets"
NOTE: When existingSecret is set, all PostgreSQL secrets provider config is read from the referenced Kubernetes secret via environment variables — nothing is written to the ConfigMap. All keys are read with optional: true, so keys that are absent from the secret are simply not set and the application uses its built-in defaults (useful for optional fields like secondary keys or AWS KMS config). By default the secret key names match the config field names. Override individual key names using existingSecretKeys if your secret uses different naming:
global:
secrets:
postgresql:
existingSecret: "catalyst-pg-secrets"
existingSecretKeys:
connection_string: "pg_conn_str"
primary_encryption_key: "kek_primary"
primary_key_version: "kek_primary_version"
App Tunnels
App tunnels (via Piko) connect Catalyst to applications on private networks without needing to expose them. Tunnels are always secured with mTLS. To enable TLS for the proxy connection itself:
piko:
enabled: true
certificates:
proxy:
enabled: true
secretName: "piko-proxy-tls"
Management API Tunnel
Set this when the region was created with --enable-public-management-api (i.e. its spec has exposeTunnel: true). With that flag, the controlplane stands up a Piko upstream endpoint for the region; the management service in the dataplane must dial that endpoint and register itself as the upstream so customers can reach the region's management API over the tunnel instead of via direct ingress.
management:
config:
tunnel:
enabled: true
piko:
upstream_url: https://tunnel-upstream.r1.diagrid.io
audience: piko # optional, defaults to "piko"
upstream_url is the Piko upstream of the controlplane that issued the region — the operator of that controlplane provides this value.
The management service authenticates to Piko using a JWT-SVID issued by Dapr Sentry, so the Sentry remote endpoint configured during region join must already be in place (it is, by default).
Production Tuning
For production deployments, start from the values-production.yaml overlay shipped alongside this chart:
helm install catalyst ./catalyst \
-f values-production.yaml \
-f my-environment.yaml \
--set join_token="${JOIN_TOKEN}"
The overlay enables HPAs, drops Kubernetes resource limits on Go components (which also drops the auto-derived GOMEMLIMIT), lowers log verbosity, and raises the per-project Dapr scheduler memory floor. See the Production Tuning guide for the rationale behind each value, plus guidance on managed infrastructure (managed Kubernetes, managed PostgreSQL), PodDisruptionBudgets, securityContext hardening, and NetworkPolicies.
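As an illustration, my-environment.yaml might layer environment-specific values documented elsewhere in this reference on top of the production overlay:
# my-environment.yaml (illustrative): environment-specific values layered over values-production.yaml
global:
  image:
    registry: my-registry.example.com   # mirrored image registry
gateway:
  tls:
    enabled: true
    existingSecret: "my-tls-secret"
shared:
  scheduling:
    nodeSelector:
      workload: catalyst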
Networking
Catalyst Enterprise Self-Hosted requires outbound connectivity to Diagrid Cloud. Ensure your network allows access to:
| Domain | Description | Required |
|---|---|---|
| api.r1.diagrid.io | Region join (installation only). | Yes |
| catalyst-cloud.r1.diagrid.io | Resource configuration updates. | Yes |
| sentry.r1.diagrid.io | Workload identity (mTLS). | Yes |
| trust.r1.diagrid.io | Trust anchors (mTLS). | Yes |
| tunnels.trust.diagrid.io | OIDC provider for Piko tunnels. | No |
| tunnel-upstream.r1.diagrid.io | Management API tunnel upstream (Piko). | Only if exposeTunnel is set on the region. |
| catalyst-metrics.r1.diagrid.io | Dapr runtime metrics. | No |
| catalyst-logs.r1.diagrid.io | Dapr sidecar logs. | No |
Note: mTLS is used for secure communication. Ensure your proxy/firewall does not inspect this traffic.
Network Policies
Catalyst configures Kubernetes NetworkPolicy resources per project namespace using three symmetric lists:
| Key | What it does |
|---|---|
| agent.config.project.blocked_egress | Denies destinations in the sidecar 0.0.0.0/0 egress rule. CIDR-only (NetworkPolicy except limitation). |
| agent.config.project.allowed_egress | Additive egress allow rules. May target CIDRs and/or namespaces, with optional port scoping. |
| agent.config.project.allowed_ingress | Additive ingress allow rules into project namespaces. May target CIDRs and/or namespaces, with optional port scoping. |
Precedence: allow beats block. NetworkPolicy rules are additive — the API server OR's them together — so any
destination matched by an allowed_egress entry is reachable even if its CIDR also appears in blocked_egress. Use
this to punch narrow holes through the block list rather than weakening it.
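A minimal sketch of that precedence: the /32 below stays reachable even though it falls inside the blocked 10.0.0.0/8 range:
agent:
  config:
    project:
      blocked_egress:
        - name: rfc1918
          cidrs: ["10.0.0.0/8"]
      allowed_egress:
        - name: internal-db            # allow wins: 10.4.5.6 remains reachable despite the /8 block
          cidrs: ["10.4.5.6/32"]
          ports:
            - port: 5432
              protocol: TCP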
Ingress floor (non-configurable): the agent namespace (cra-agent) and the monitoring namespace are always
permitted as ingress sources. Management and Prometheus scraping therefore never break regardless of allowed_ingress.
Default block list
agent:
config:
project:
blocked_egress:
- name: rfc1918
cidrs:
- "10.0.0.0/8"
- "172.16.0.0/12"
- "192.168.0.0/16"
- name: link-local # covers cloud instance metadata endpoints
cidrs:
- "169.254.0.0/16"
Trim entries or replace the whole list to match your environment (e.g. on GKE/AKS the pod network lives inside
10.0.0.0/8; prefer punching allow holes through allowed_egress rather than removing the block). Set
blocked_egress: [] to permit all egress (not recommended for production usage).
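For example, a trimmed list that keeps only the link-local block (covering instance metadata) on a cluster whose pod network overlaps 10.0.0.0/8:
agent:
  config:
    project:
      blocked_egress:
        - name: link-local
          cidrs:
            - "169.254.0.0/16"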
Allowing specific destinations
Both cidrs and namespaces may be set on the same rule. Ports are optional and default to "all ports" when omitted:
agent:
config:
project:
allowed_egress:
- name: rds
cidrs: ["10.4.5.6/32"]
ports:
- port: 5432
protocol: TCP
- name: msk
cidrs: ["10.4.5.0/24"]
ports:
- port: 9094
protocol: TCP
- name: worker-namespace
namespaces: ["my-workers"]
Allowing ingress
agent:
config:
project:
allowed_ingress:
- name: extra-prom
namespaces: ["observability"]
ports:
- port: 9090
protocol: TCP
Disabling network policies entirely
agent:
config:
project:
disable_network_policies: true
When disabled, no NetworkPolicy resources are created for project namespaces, and any previously created policies are
removed on the next reconcile.
CNI requirement: NetworkPolicy enforcement requires a CNI that supports it (Calico, Cilium, Azure NPM, AWS VPC CNI with
ENABLE_NETWORK_POLICY=true, or kube-router). If none is detected at startup, the agent logs a warning and the policies are still created although not enforced.