Runbook: Reconcile Errors
Alert: OpenClawReconcileErrors
Meaning
The operator is failing to reconcile an OpenClawInstance. The openclaw_reconcile_total{result="error"} counter is increasing.
Impact
Managed resources (StatefulSet, Service, ConfigMap, etc.) may be out of sync with the desired state defined in the CR. The instance may not be updated or may have stale configuration.
Diagnosis
# Check operator logs for errors
kubectl logs -n openclaw-operator-system deploy/openclaw-operator-controller-manager -c manager --tail=100
# Check the instance status and conditions
kubectl get openclawinstance <name> -n <namespace> -o yaml
# Check events on the instance
kubectl describe openclawinstance <name> -n <namespace>
# Check if dependent resources exist (Secrets, ConfigMaps)
kubectl get secrets,configmaps -n <namespace>
Mitigation
- Missing secrets - Ensure all secrets referenced in
spec.envFromexist in the namespace - RBAC issues - Verify the operator has the required ClusterRole permissions
- Resource conflicts - Check if another controller or process is modifying the same resources
- API server issues - Check kube-apiserver health
- If the error is transient, the operator will retry with exponential backoff