OpenClaw Kubernetes Operator

Self-host OpenClaw AI agents on Kubernetes with production-grade security, observability, and lifecycle management.

OpenClaw is an AI agent platform that acts on your behalf across Telegram, Discord, WhatsApp, and Signal. It manages your inbox, calendar, smart home, and more through 50+ integrations. While OpenClaw.rocks offers fully managed hosting, this operator lets you run OpenClaw on your own infrastructure with the same operational rigor.

Why an Operator?

Deploying AI agents to Kubernetes involves more than a Deployment and a Service. You need network isolation, secret management, persistent storage, health monitoring, optional browser automation, and config rollouts, all wired correctly. This operator encodes those concerns into a single OpenClawInstance custom resource so you can go from zero to production in minutes:

apiVersion: openclaw.rocks/v1alpha1
kind: OpenClawInstance
metadata:
  name: my-agent
spec:
  envFrom:
    - secretRef:
        name: openclaw-api-keys
  storage:
    persistence:
      enabled: true
      size: 10Gi

The operator reconciles this into a fully managed stack of 9+ Kubernetes resources: secured, monitored, and self-healing.
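The manifest above pulls environment variables from a Secret named openclaw-api-keys, which has to exist in the same namespace as the instance. As a minimal sketch of one way to generate it, the script below builds a standard Kubernetes Secret manifest and prints it as JSON. The key name ANTHROPIC_API_KEY, the placeholder value, the "default" namespace, and the helper name build_secret_manifest are illustrative assumptions, not taken from the operator's documentation.

```python
import json


def build_secret_manifest(name: str, env: dict[str, str], namespace: str = "default") -> dict:
    """Return a Kubernetes Secret manifest whose keys become env vars via envFrom."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # stringData takes plain text; the API server stores it base64-encoded under data.
        "stringData": dict(env),
    }


manifest = build_secret_manifest(
    "openclaw-api-keys",
    # Hypothetical key and placeholder value: use whatever your AI provider expects.
    {"ANTHROPIC_API_KEY": "replace-me"},
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the output can be piped straight into `kubectl apply -f -`.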

Agents That Adapt Themselves

Agents can install skills, patch their config, add environment variables, and seed workspace files at runtime by submitting requests through the Kubernetes API. The operator validates every request.

# 1. Enable self-configure on the instance (fragment of the OpenClawInstance spec)
spec:
  selfConfigure:
    enabled: true
    allowedActions: [skills, config, envVars, workspaceFiles]
---
# 2. The agent creates this resource to install a skill at runtime
apiVersion: openclaw.rocks/v1alpha1
kind: OpenClawSelfConfig
metadata:
  name: add-fetch-skill
spec:
  instanceRef: my-agent
  addSkills:
    - "@anthropic/mcp-server-fetch"

Every request is validated against the instance’s allowlist policy. Protected config keys cannot be overwritten, and denied requests are logged with a reason.
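To make the allowlist check concrete, here is a small sketch of how such validation could work. It is not the operator's actual webhook code: the SelfConfigRequest shape, the flat dotted config keys, and the single protected key gateway.auth.token are all hypothetical. Only the four action names come from the allowedActions example above.

```python
from dataclasses import dataclass, field

# Hypothetical protected key; the source does not list the real ones.
PROTECTED_CONFIG_KEYS = frozenset({"gateway.auth.token"})


@dataclass
class SelfConfigRequest:
    """Illustrative stand-in for the spec of an OpenClawSelfConfig resource."""
    add_skills: list[str] = field(default_factory=list)
    config_patch: dict[str, object] = field(default_factory=dict)
    add_env_vars: dict[str, str] = field(default_factory=dict)
    workspace_files: dict[str, str] = field(default_factory=dict)

    def requested_actions(self) -> set[str]:
        """Map each non-empty field to its action name from allowedActions."""
        by_action = {
            "skills": self.add_skills,
            "config": self.config_patch,
            "envVars": self.add_env_vars,
            "workspaceFiles": self.workspace_files,
        }
        return {action for action, value in by_action.items() if value}


def validate(request: SelfConfigRequest, enabled: bool, allowed_actions: list[str]) -> tuple[bool, str]:
    """Return (allowed, reason); the reason is what a denial would be logged with."""
    if not enabled:
        return False, "self-configure is disabled on the instance"
    denied = sorted(request.requested_actions() - set(allowed_actions))
    if denied:
        return False, "not in allowedActions: " + ", ".join(denied)
    protected = sorted(PROTECTED_CONFIG_KEYS.intersection(request.config_patch))
    if protected:
        return False, "protected config keys: " + ", ".join(protected)
    return True, "ok"


skill_request = SelfConfigRequest(add_skills=["@anthropic/mcp-server-fetch"])
print(validate(skill_request, enabled=True, allowed_actions=["skills", "config"]))

env_request = SelfConfigRequest(add_env_vars={"DEBUG": "1"})
print(validate(env_request, enabled=True, allowed_actions=["skills"]))
```

The first request passes because it only uses the skills action. The second is denied with a reason naming envVars, which is the kind of message a denial log entry could carry.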

Features

| Category | Feature | Details |
| --- | --- | --- |
| Declarative | Single CRD | One resource defines the entire stack: StatefulSet, Service, RBAC, NetworkPolicy, PVC, PDB, Ingress, and more |
| Adaptive | Agent self-configure | Agents autonomously install skills, patch config, and adapt their environment via the Kubernetes API |
| Secure | Hardened by default | Non-root (UID 1000), read-only root filesystem, all capabilities dropped, seccomp RuntimeDefault, default-deny NetworkPolicy, validating webhook |
| Observable | Built-in metrics | Prometheus metrics, ServiceMonitor integration, structured JSON logging, Kubernetes events |
| Flexible | Provider-agnostic config | Use any AI provider (Anthropic, OpenAI, or others) via environment variables and inline or external config |
| Config Modes | Merge or overwrite | `overwrite` replaces config on restart; `merge` deep-merges with PVC config, preserving runtime changes |
| Skills | Declarative install | Install ClawHub skills, npm packages, or GitHub-hosted skill packs via `spec.skills` |
| Runtime Deps | pnpm and Python/uv | Built-in init containers install pnpm (via corepack) or Python 3.12 + uv for MCP servers and skills |
| Auto-Update | OCI registry polling | Opt-in version tracking: checks the registry for new semver releases, backs up first, rolls out, and auto-rolls back on failure |
| Scalable | Auto-scaling | HPA integration with CPU and memory metrics, min/max replica bounds, automatic StatefulSet replica management |
| Resilient | Self-healing lifecycle | PodDisruptionBudgets, health probes, automatic config rollouts via content hashing, 5-minute drift detection |
| Backup/Restore | S3-backed snapshots | Automatic backup on deletion, pre-update, and on a cron schedule; restore into a new instance from any snapshot |
| Gateway Auth | Auto-generated tokens | Automatic gateway token Secret per instance, bypassing mDNS pairing |
| Tailscale | Tailnet access | Expose via Tailscale Serve or Funnel with SSO auth |
| Extensible | Sidecars and init containers | Chromium for browser automation, Ollama for local LLMs, plus custom init containers and sidecars |
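Two rows in the table concern config handling: the `merge` and `overwrite` modes, and config rollouts triggered by content hashing. The sketch below illustrates both ideas under stated assumptions. It assumes that in `merge` mode declared values win on conflict while keys that exist only on the PVC survive, that lists are replaced rather than concatenated, and that the hash is SHA-256 over canonical JSON. The source specifies none of these details, and the function names are hypothetical.

```python
import hashlib
import json
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: scalars and lists are replaced wholesale.
            merged[key] = value
    return merged


def resolve_config(mode: str, declared: dict[str, Any], on_pvc: dict[str, Any]) -> dict[str, Any]:
    """Compute the effective config for a restart under the given mode."""
    if mode == "overwrite":
        return dict(declared)
    if mode == "merge":
        return deep_merge(on_pvc, declared)
    raise ValueError(f"unknown config mode: {mode}")


def config_hash(config: dict[str, Any]) -> str:
    """Hash a canonical JSON encoding so key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


declared = {"model": {"provider": "anthropic"}, "logLevel": "info"}
on_pvc = {"model": {"temperature": 0.2}, "skills": ["fetch"]}

merged = resolve_config("merge", declared, on_pvc)
overwritten = resolve_config("overwrite", declared, on_pvc)

# A differing hash is the signal that would trigger a rollout.
needs_rollout = config_hash(merged) != config_hash(overwritten)
print(merged, needs_rollout)
```

A common way to use such a hash is to store it as a pod template annotation, so that any config change alters the template and causes the StatefulSet to roll its pods. Whether this operator does exactly that is not stated in the source.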

Architecture Overview

+-----------------------------------------------------------------+
|  OpenClawInstance CR         OpenClawSelfConfig CR              |
|  (your declarative config)   (agent self-modification requests) |
+---------------+-------------------------------------------------+
                | watch
                v
+-----------------------------------------------------------------+
|  OpenClaw Operator                                              |
|  +-----------+  +-------------+  +----------------------------+ |
|  | Reconciler|  |   Webhooks  |  |   Prometheus Metrics       | |
|  |           |  |  (validate  |  |  (reconcile count,         | |
|  | creates ->|  |   & default)|  |   duration, phases)        | |
|  +-----------+  +-------------+  +----------------------------+ |
+---------------+-------------------------------------------------+
                | manages
                v
+-----------------------------------------------------------------+
|  Managed Resources (per instance)                               |
|                                                                 |
|  ServiceAccount -> Role -> RoleBinding    NetworkPolicy         |
|  ConfigMap        PVC      PDB            ServiceMonitor        |
|  GatewayToken Secret                                            |
|                                                                 |
|  StatefulSet                                                    |
|  +------------------------------------------------------------+ |
|  | Init: config -> pnpm* -> python* -> skills* -> custom      | |
|  |                                        (* = opt-in)        | |
|  +------------------------------------------------------------+ |
|  | OpenClaw Container  Gateway Proxy (nginx)                  | |
|  |                     Chromium (opt) / Ollama (opt)          | |
|  |                     Tailscale (opt) + custom sidecars      | |
|  +------------------------------------------------------------+ |
|                                                                 |
|  Service (default: 18789, 18793 or custom) -> Ingress (opt)     |
+-----------------------------------------------------------------+
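The Init row in the diagram fixes an order (config, pnpm, python, skills, custom) and marks three stages as opt-in. As a rough illustration of that ordering rule only, the sketch below plans the init container list from a set of enabled opt-in stages. The init- name prefix, the function name, and the way custom containers are passed in are assumptions made for the example, not the operator's real naming or API.

```python
# Stage order and opt-in markers as shown in the architecture diagram.
INIT_ORDER = ("config", "pnpm", "python", "skills", "custom")
OPT_IN = frozenset({"pnpm", "python", "skills"})


def plan_init_containers(enabled: set[str], custom: list[str]) -> list[str]:
    """Return init container names in run order, skipping opt-in stages not enabled."""
    plan: list[str] = []
    for stage in INIT_ORDER:
        if stage == "custom":
            # User-supplied init containers always run last, in the order given.
            plan.extend(custom)
        elif stage not in OPT_IN or stage in enabled:
            plan.append(f"init-{stage}")  # hypothetical naming scheme
    return plan


print(plan_init_containers({"skills", "pnpm"}, ["wait-for-db"]))
print(plan_init_containers(set(), []))
```

Running the runtime installers before the skills stage matters because skills may need pnpm or Python to be present when they are installed.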

Next Steps

Quick Start to get your first instance running
Deployment Guide for platform-specific instructions
Architecture for a deep dive into the operator internals
API Reference for the complete CRD specification