Troubleshooting
This guide covers common issues and how to diagnose them.
Checking Operator Logs
The operator logs are the first place to look for any issue.
kubectl get pods -n openclaw-system -l app.kubernetes.io/name=openclaw-operator
kubectl logs -n openclaw-system -l app.kubernetes.io/name=openclaw-operator -f
kubectl logs -n openclaw-system -l app.kubernetes.io/name=openclaw-operator --all-containers
Checking Events
Kubernetes events provide a timeline of what happened to your resources.
kubectl describe openclawinstance my-assistant -n openclaw
kubectl get events -n openclaw --sort-by='.lastTimestamp'
kubectl get events -n openclaw --watch
Checking Instance Status
kubectl get openclawinstance -n openclaw
kubectl get openclawinstance my-assistant -n openclaw -o yaml | grep -A 50 'status:'
kubectl get openclawinstance my-assistant -n openclaw \
-o jsonpath='{.status.conditions[?(@.type=="Ready")]}'
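Rather than polling the Ready condition by hand, `kubectl wait` can block until it becomes true. The following is a minimal sketch; `wait_for_ready` is a hypothetical helper name, and it assumes the operator sets the Ready condition under `.status.conditions` as shown above.

```shell
# Block until the instance reports Ready=True, or fail when the timeout expires.
# wait_for_ready is a hypothetical helper, not part of the operator.
# Arguments: instance name, namespace, timeout (all optional).
wait_for_ready() {
  kubectl wait --for=condition=Ready "openclawinstance/${1:-my-assistant}" \
    -n "${2:-openclaw}" --timeout="${3:-120s}"
}
```

Usage: `wait_for_ready my-assistant openclaw 300s`. A non-zero exit status means the condition was not reached in time.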
Common Issues
Instance Stuck in Pending
Symptoms: The instance stays in Pending phase and never transitions to Provisioning.
Possible causes and solutions:
- Operator is not running:
  kubectl get pods -n openclaw-system
  Verify the operator pod is Running and ready. If it is in CrashLoopBackOff, check its logs.
- CRD not installed or outdated:
  kubectl get crd openclawinstances.openclaw.rocks
  If the CRD is missing, install it. If you upgraded the operator but new fields are rejected as “field not declared in schema”, the CRD is outdated:
  kubectl apply --server-side -f config/crd/bases/
- RBAC issues with the operator:
  kubectl auth can-i get openclawinstances --as=system:serviceaccount:openclaw-system:openclaw-operator -n openclaw
- Webhook blocking the request: Check for admission webhook errors in the API server logs or the operator logs.
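The first three checks for a Pending instance can be chained so that the first failing one is easy to spot. This is a sketch only; `check_pending` is a hypothetical helper name, and the namespaces and service account are the ones used in this guide.

```shell
# Run the Pending-phase checks in order and stop at the first command
# that exits non-zero. check_pending is a hypothetical helper.
# Argument: namespace of the instance (optional).
check_pending() {
  ns="${1:-openclaw}"
  echo "== operator pods =="
  kubectl get pods -n openclaw-system || return 1
  echo "== CRD =="
  kubectl get crd openclawinstances.openclaw.rocks || return 1
  echo "== operator RBAC =="
  kubectl auth can-i get openclawinstances \
    --as=system:serviceaccount:openclaw-system:openclaw-operator -n "$ns" || return 1
}
```

The last section heading printed before the function returns identifies the check that failed.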
Instance Stuck in Provisioning
Symptoms: The instance transitions to Provisioning but never reaches Running.
- Resource creation failing silently: Check operator logs for errors:
  kubectl logs -n openclaw-system deploy/openclaw-operator | grep -i error
- Resource quota exceeded:
  kubectl describe resourcequota -n openclaw
- Deployment not becoming ready: Check the pod:
  kubectl get pods -n openclaw -l app.kubernetes.io/instance=my-assistant
  kubectl describe pod -n openclaw -l app.kubernetes.io/instance=my-assistant
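When `kubectl describe pod` produces too much output, a JSONPath query can print only the reason each container is waiting. A minimal sketch, assuming the pods carry the `app.kubernetes.io/instance` label used above; `pod_waiting_reasons` is a hypothetical helper name.

```shell
# Print one line per pod: the pod name, a tab, then the waiting reason of
# each container (empty when no container is in the waiting state).
# pod_waiting_reasons is a hypothetical helper.
# Arguments: instance name, namespace (both optional).
pod_waiting_reasons() {
  kubectl get pods -n "${2:-openclaw}" \
    -l "app.kubernetes.io/instance=${1:-my-assistant}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
}
```

Usage: `pod_waiting_reasons my-assistant openclaw`.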
Instance in Failed State
Symptoms: The instance phase is Failed. The Ready condition shows status: "False" with a reason.
kubectl get openclawinstance my-assistant -n openclaw \
-o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
kubectl describe openclawinstance my-assistant -n openclaw
Common failure reasons:
- Image pull errors: Look for ImagePullBackOff or ErrImagePull. Verify the image repository/tag and pull secrets.
- Insufficient resources: Look for FailedScheduling events. Reduce resource requests or add capacity.
- ConfigMap or Secret not found: Verify referenced ConfigMaps and Secrets exist.
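To narrow the event stream to one failure reason, events can be filtered with a field selector. The sketch below defaults to `FailedScheduling`; `events_by_reason` is a hypothetical helper name. Passing `FailedMount` is an assumption about how a missing ConfigMap or Secret volume typically surfaces; the exact reason depends on how the resource is referenced.

```shell
# List events with a given reason, oldest first.
# events_by_reason is a hypothetical helper.
# Arguments: event reason, namespace (both optional).
events_by_reason() {
  kubectl get events -n "${2:-openclaw}" \
    --field-selector "reason=${1:-FailedScheduling}" \
    --sort-by='.lastTimestamp'
}
```

Usage: `events_by_reason FailedScheduling openclaw`.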
NetworkPolicy Blocking Traffic
Symptoms: Instance is Running but cannot reach external APIs or other pods cannot reach the instance.
- Verify the NetworkPolicy exists:
  kubectl get networkpolicy -n openclaw
  kubectl describe networkpolicy my-assistant -n openclaw
- Instance cannot reach AI APIs: The default NetworkPolicy allows egress to ports 443 (HTTPS) and 53 (DNS). If the AI provider uses a non-standard port, add it to allowedEgressCIDRs.
- DNS resolution failing: Ensure allowDNS is true.
- Other pods cannot reach the instance: Add namespaces to allowedIngressNamespaces.
- Verify with a test pod:
  kubectl run -n openclaw test-curl --rm -it --image=curlimages/curl -- \
    curl -v http://my-assistant:18789
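To check whether allowedIngressNamespaces has the intended effect, the same curl test can be launched from a different namespace. This is a sketch under assumptions: it reuses the Service name `my-assistant` and port 18789 from the command above, relies on the standard `<service>.<namespace>.svc` DNS name, and `test_ingress_from` is a hypothetical helper name.

```shell
# Run a throwaway curl pod in the given source namespace and try to reach
# the instance Service in the openclaw namespace.
# test_ingress_from is a hypothetical helper.
# Argument: source namespace (required).
test_ingress_from() {
  if [ -z "$1" ]; then
    echo "usage: test_ingress_from <source-namespace>" >&2
    return 2
  fi
  kubectl run -n "$1" test-curl --rm -it --restart=Never \
    --image=curlimages/curl -- \
    curl -sv --max-time 5 "http://my-assistant.openclaw.svc:18789"
}
```

A timeout from a namespace that is not listed, combined with a successful connection from one that is, suggests the policy is being enforced as configured.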
PVC Not Binding
Symptoms: Pod stuck in Pending or PVC shows Pending.
kubectl get pvc -n openclaw
kubectl describe pvc my-assistant-data -n openclaw
- StorageClass does not exist: Verify the storageClass exists.
- No capacity available: Check provisioner logs.
- Access mode incompatibility: Use ReadWriteOnce (the default).
- Zone mismatch: Ensure nodes and storage are in the same zone.
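The StorageClass check can be scripted by reading the class name from the PVC and looking it up. A minimal sketch; `pvc_storage_class_check` is a hypothetical helper name, and the PVC name `my-assistant-data` is taken from the command above.

```shell
# Read the StorageClass requested by a PVC and report whether it exists.
# pvc_storage_class_check is a hypothetical helper.
# Arguments: PVC name, namespace (both optional).
pvc_storage_class_check() {
  pvc="${1:-my-assistant-data}"
  ns="${2:-openclaw}"
  sc=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.storageClassName}') || return 1
  if [ -z "$sc" ]; then
    echo "PVC $pvc has no storageClassName; check whether a default StorageClass exists"
    return 0
  fi
  if kubectl get storageclass "$sc" >/dev/null 2>&1; then
    echo "StorageClass $sc exists"
  else
    echo "StorageClass $sc not found"
    return 1
  fi
}
```

Usage: `pvc_storage_class_check my-assistant-data openclaw`.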
Webhook Errors
Symptoms: Creating or updating an OpenClawInstance fails with a webhook error.
- Webhook not enabled: Check webhook configurations exist.
- cert-manager not installed: Check certificate status.
- Webhook Service not reachable: Check service and endpoints.
- Bypass temporarily (last resort): Delete the webhook configuration.
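The checks above can be run with a handful of read-only commands. This is a sketch under assumptions: the names of the webhook configurations are not given in this guide, so the helper searches for any whose name contains "openclaw"; it also assumes the webhook Service and its certificate live in `openclaw-system`. `check_webhooks` is a hypothetical helper name.

```shell
# Show webhook configurations, cert-manager certificates, and the
# Services/Endpoints in the operator namespace.
# check_webhooks is a hypothetical helper.
# Argument: operator namespace (optional).
check_webhooks() {
  ns="${1:-openclaw-system}"
  echo "== webhook configurations mentioning openclaw =="
  kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -o name \
    | grep -i openclaw || echo "none found"
  echo "== cert-manager certificates =="
  kubectl get certificates.cert-manager.io -n "$ns" \
    || echo "no certificates listed (is cert-manager installed?)"
  echo "== services and endpoints =="
  kubectl get svc,endpoints -n "$ns"
}
```

An Endpoints object with no addresses usually means the operator pod behind the webhook Service is not ready.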
Ingress Not Working
Symptoms: Ingress is created but traffic does not reach the instance.
- IngressClass not found: Verify the className matches an installed IngressClass.
- Ingress controller not installed: Verify the controller is running.
- TLS Secret missing: Check the referenced Secret exists.
- DNS not pointing to Ingress: Verify DNS resolution.
- NetworkPolicy blocking the Ingress controller: Add the controller’s namespace to allowedIngressNamespaces.
Chromium Sidecar Issues
Symptoms: Chromium sidecar not starting, crashing, or browser automation fails.
- Check sidecar status and logs
- Insufficient shared memory: Increase memory limits
- Insufficient resources: Increase CPU/memory limits
- Security context restrictions: Check for SCC violations
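The sidecar's container name is not stated in this guide, so a reasonable first step is to list the containers in each instance pod and then fetch logs for the one that runs Chromium. A minimal sketch; `list_containers` is a hypothetical helper name.

```shell
# Print one line per pod: the pod name followed by its container names.
# list_containers is a hypothetical helper.
# Arguments: instance name, namespace (both optional).
list_containers() {
  kubectl get pods -n "${2:-openclaw}" \
    -l "app.kubernetes.io/instance=${1:-my-assistant}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].name}{"\n"}{end}'
}
```

With the pod and container names known, `kubectl logs -n openclaw <pod> -c <container> --previous` shows the output of the last crashed run of that container.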
Operator CrashLoopBackOff
Symptoms: The operator pod itself is restarting.
kubectl logs -n openclaw-system deploy/openclaw-operator --previous
- Leader election failure: Check for stale leases.
- Missing CRD: Verify CRD is installed.
- Insufficient RBAC: Verify ClusterRole and ClusterRoleBinding.
- Webhook certificate issues: Check certificate provisioning.
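Leases and cluster-wide RBAC objects can be listed without knowing their exact names. This is a sketch under assumptions: it expects the leader election Lease to live in `openclaw-system` and the RBAC object names to contain "openclaw"; neither is confirmed by this guide. `check_operator_prereqs` is a hypothetical helper name.

```shell
# List leader election leases and RBAC objects related to the operator.
# check_operator_prereqs is a hypothetical helper.
check_operator_prereqs() {
  echo "== leases in openclaw-system =="
  kubectl get lease -n openclaw-system
  echo "== ClusterRoles and ClusterRoleBindings mentioning openclaw =="
  kubectl get clusterrole,clusterrolebinding -o name \
    | grep -i openclaw || echo "none found"
}
```

A Lease whose holder is a pod that no longer exists is a candidate for the stale lease case above.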
Useful Commands Reference
kubectl get openclawinstance -A
kubectl get openclawinstance my-assistant -n openclaw \
-o jsonpath='{.status.managedResources}' | jq .
kubectl annotate openclawinstance my-assistant -n openclaw \
force-reconcile=$(date +%s) --overwrite
kubectl get openclawinstance my-assistant -n openclaw -o yaml