Building Production-Ready Kubernetes Operators: From Reconciliation Loops to Day-2 Operations
Your StatefulSet handles the basics—stable network identities, ordered deployment, persistent storage. But when your database needs leader election, automated backups, or zero-downtime upgrades, you’re stuck writing custom scripts and hoping they don’t break at 3 AM. This is exactly the gap Kubernetes Operators were designed to fill.
Operators encode operational knowledge into software. They watch your custom resources, compare desired state against reality, and take corrective action automatically. The pattern has become the standard approach for running stateful workloads in production Kubernetes environments, from databases like PostgreSQL and MongoDB to message queues, caches, and custom distributed systems.
The Operator pattern emerged from CoreOS in 2016 as a way to extend Kubernetes beyond its original focus on stateless workloads. The insight was profound: instead of documenting runbooks and training operators to follow them manually, encode that operational expertise directly into controllers that run continuously alongside your applications. The result is infrastructure that heals itself, scales intelligently, and handles failure scenarios that would otherwise require human intervention.
This guide walks through building production-ready Operators using Kubebuilder and controller-runtime. We’ll cover the architectural foundations, implement a working MongoDB replica set manager, and explore the patterns that separate toy operators from systems you can trust at scale.
Why StatefulSets Aren’t Enough: The Operator Imperative
StatefulSets solve a specific set of problems: stable network identities through predictable pod names, ordered deployment and scaling, and persistent volume claim management. These primitives are necessary for running stateful applications, but they’re not sufficient for operating them.
The distinction between deploying and operating is crucial. Deploying a database means getting the processes running with the right configuration. Operating a database means keeping it healthy over time—handling failures, performing maintenance, scaling capacity, and upgrading versions without losing data or availability. StatefulSets handle deployment; Operators handle operations.
Consider what happens when a MongoDB replica set member fails. The StatefulSet will restart the pod with the same identity and reattach the persistent volume. But it won’t initiate a replica set reconfiguration, promote a secondary to primary, or verify data consistency before adding the recovered member back to the cluster. These operations require domain-specific knowledge about MongoDB’s behavior and operational requirements.
The pod might restart successfully from Kubernetes’ perspective while the database inside remains in an inconsistent state. The StatefulSet has no visibility into application health beyond container liveness probes. It cannot distinguish between a healthy MongoDB secondary and one that’s fallen hours behind in replication. It cannot determine whether reintroducing a recovered member will trigger a full resync or cause a split-brain scenario.
The gap between what StatefulSets provide and what production operations demand shows up across every stateful application:
Database failover requires detecting leader failure, coordinating election across remaining members, updating connection strings, and verifying the new topology before accepting writes. A StatefulSet sees a pod restart; an Operator sees a leadership transition that must be orchestrated. The Operator understands that promoting a secondary requires ensuring it has the latest writes, reconfiguring the replica set membership, and potentially redirecting client connections. This choreography happens in seconds when automated, but represents pages of runbook documentation when done manually.
Backup orchestration goes beyond running a cron job. Production backups require coordinating with the application to ensure consistency, managing retention policies, verifying backup integrity, and handling failures with retry logic and alerting. The backup process needs to understand application state—whether the database is healthy, whether a backup is already in progress, whether there’s sufficient storage. A naive backup of a running database may capture an inconsistent snapshot. An Operator-managed backup can invoke database-specific commands to flush writes, create a consistent snapshot, and verify the backup before recording success.
Schema migrations and version upgrades require careful coordination: pre-flight checks, ordered rollouts that respect replication topology, validation at each step, and rollback capabilities when things go wrong. Upgrading a three-node database cluster isn’t three independent pod updates—it’s a choreographed sequence that must preserve availability and data integrity. You upgrade secondaries first while the primary continues serving traffic. Then you trigger a controlled failover to one of the upgraded secondaries before finally upgrading the former primary. Miss this sequence, and you risk an outage or, worse, data loss.
Scaling operations for distributed systems often involve data rebalancing that the application itself must coordinate. Adding a node to a Cassandra cluster triggers streaming of data from existing nodes. Scaling an Elasticsearch cluster requires shard reallocation. The StatefulSet can add the pod, but the Operator must integrate the new member correctly into the distributed system.
StatefulSets give you the building blocks. Operators encode the operational runbook that tells Kubernetes how to use those building blocks correctly for your specific application. The decision point is clear: if your stateful application requires operational logic beyond start, stop, and restart, you need an Operator.
Anatomy of a Kubernetes Operator: CRDs, Controllers, and the Reconciliation Loop
Every Kubernetes Operator consists of two core components: a Custom Resource Definition (CRD) that extends the Kubernetes API with your domain-specific types, and a controller that watches those resources and reconciles actual state with desired state. Understanding how these pieces interact is essential before writing any code.
Custom Resource Definitions as Your Domain API
CRDs let you define new resource types that feel native to Kubernetes. Instead of managing your MongoDB cluster through ConfigMaps, Secrets, and annotations scattered across multiple resources, you define a single MongoDBCluster resource that captures your intent:
```yaml
apiVersion: database.example.com/v1
kind: MongoDBCluster
metadata:
  name: production-cluster
spec:
  members: 3
  version: "7.0"
  storage:
    size: 100Gi
    storageClassName: fast-ssd
  backup:
    enabled: true
    schedule: "0 2 * * *"
    retention: 7d
```

The CRD schema defines validation rules, default values, and the structure of both spec (desired state) and status (observed state). Users interact with your Operator through kubectl apply, and the Kubernetes API server handles authentication, authorization, and persistence.
This approach provides several advantages over manual configuration. First, the API server validates resources against your schema before storing them, catching configuration errors immediately. Second, RBAC rules can control who modifies your custom resources just like built-in types. Third, the entire state of your application lives in etcd alongside other Kubernetes resources, making it part of standard backup and disaster recovery procedures.
The separation between spec and status is fundamental. The spec represents user intent—what they want the cluster to look like. The status represents observed reality—what the cluster actually looks like right now. Users write to spec; controllers write to status. This separation makes it easy to see whether the system has converged to the desired state or whether reconciliation is still in progress.
The Control Loop: Observe, Analyze, Act
Controllers implement the reconciliation loop that drives all of Kubernetes. The pattern is deceptively simple: watch for changes to resources, compare desired state against actual state, and take action to close the gap. This loop runs continuously, driven by a work queue that receives notifications whenever watched resources change.
```go
func (r *MongoDBClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	// Observe: fetch the current state
	var cluster databasev1.MongoDBCluster
	if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
		if apierrors.IsNotFound(err) {
			return ctrl.Result{}, nil // Resource deleted, nothing to do
		}
		return ctrl.Result{}, err
	}

	// Analyze: determine what needs to change
	currentMembers, err := r.getReplicaSetMembers(ctx, &cluster)
	if err != nil {
		return ctrl.Result{}, err
	}

	// Act: reconcile toward desired state
	if int32(len(currentMembers)) < cluster.Spec.Members {
		if err := r.addReplicaSetMember(ctx, &cluster); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
	}

	return ctrl.Result{}, nil
}
```

The controller-runtime library abstracts the complexity of watching resources, managing work queues, and handling the Kubernetes API. Your code focuses on the reconciliation logic—what should exist and how to make it so.
The return values from Reconcile control what happens next. Returning ctrl.Result{} with no error indicates successful reconciliation—the controller won’t re-examine this resource until something changes. Returning an error triggers an immediate retry with exponential backoff. Returning ctrl.Result{RequeueAfter: duration} schedules a future reconciliation, useful for polling external systems or waiting for eventually-consistent operations to complete.
Level-Triggered vs Edge-Triggered Reconciliation
Kubernetes controllers are level-triggered, not edge-triggered. Your reconciliation function receives the current desired state and must determine the correct action regardless of what changed or how many events occurred since the last run. This design makes controllers resilient to missed events, duplicate deliveries, and restarts.
The distinction matters for correctness. An edge-triggered system asks “what event happened?” and responds to that specific change. A level-triggered system asks “what is the current state, and what should it be?” and acts accordingly. If your controller restarts and misses the event that changed replica count from 3 to 5, it doesn’t matter—the next reconciliation will observe the desired count is 5, the actual count is 3, and take action.
The reconciliation function should be idempotent—calling it multiple times with the same input produces the same result. Check whether the action has already been taken before taking it. Create resources only if they don’t exist. Update configurations only if they differ from desired state. This idempotency guarantee means controllers can safely retry operations without fear of duplicate side effects.
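controller-runtime ships a helper, controllerutil.CreateOrUpdate, that captures this create-or-converge pattern in a single call. The fragment below is a minimal sketch of an idempotent StatefulSet step inside Reconcile; it assumes the reconciler embeds client.Client as Kubebuilder scaffolds it, and desiredPodTemplate is a hypothetical helper, not part of the example above.

```go
// Idempotent ensure-StatefulSet step: the mutate function runs on both create
// and update, so repeated reconciliations converge to the same result.
sts := &appsv1.StatefulSet{
	ObjectMeta: metav1.ObjectMeta{Name: cluster.Name, Namespace: cluster.Namespace},
}
op, err := controllerutil.CreateOrUpdate(ctx, r.Client, sts, func() error {
	sts.Spec.Replicas = &cluster.Spec.Members
	sts.Spec.Template = desiredPodTemplate(&cluster) // hypothetical helper
	return controllerutil.SetControllerReference(&cluster, sts, r.Scheme)
})
if err != nil {
	return ctrl.Result{}, err
}
log.Info("StatefulSet reconciled", "operation", op)
```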
This level-triggered model simplifies reasoning about correctness. You don’t need to track sequences of events or maintain complex state machines for event ordering. The controller examines current reality, compares it to desired state, and acts accordingly. When in doubt about system state, you can always trust the reconciliation loop to converge toward correctness.
Implementing Your First Operator: A MongoDB Replica Set Manager
Let’s build a working Operator that manages MongoDB replica sets. We’ll use Kubebuilder to scaffold the project and implement reconciliation logic that handles cluster initialization, member management, and failure recovery. This example demonstrates patterns that apply to any stateful application.
Scaffolding with Kubebuilder
Initialize a new Kubebuilder project and create the API resources:
```bash
kubebuilder init --domain example.com --repo github.com/example/mongodb-operator
kubebuilder create api --group database --version v1 --kind MongoDBCluster
```

This generates the project structure, including the CRD types, controller skeleton, and build configuration. The generated code provides a foundation—you’ll customize the types and implement the reconciliation logic.
Kubebuilder follows established conventions that make your Operator consistent with others in the ecosystem. The generated Makefile includes targets for building, testing, and deploying. The project layout separates API definitions from controller logic. RBAC manifests are generated from markers in your code, reducing the chance of permission misconfigurations.
Take time to understand the generated files before modifying them. The main.go file sets up the manager that runs your controllers. The api/v1 directory contains your CRD types. The controllers directory contains your reconciliation logic. The config directory contains Kubernetes manifests for deployment.
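One piece of the scaffold worth reading closely is the SetupWithManager method, which wires your reconciler into the manager. A minimal version, close to what Kubebuilder generates, watches the custom resource as the primary type and the StatefulSets it owns as secondary resources, so a change to either triggers reconciliation:

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	ctrl "sigs.k8s.io/controller-runtime"

	databasev1 "github.com/example/mongodb-operator/api/v1"
)

// SetupWithManager registers the controller with the manager: reconcile on
// MongoDBCluster changes and on changes to StatefulSets owned by one.
func (r *MongoDBClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&databasev1.MongoDBCluster{}).
		Owns(&appsv1.StatefulSet{}).
		Complete(r)
}
```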
Defining the MongoDBCluster CRD
Define the spec and status fields that capture your domain model:
```go
type MongoDBClusterSpec struct {
	// Members is the number of replica set members
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=7
	Members int32 `json:"members"`

	// Version is the MongoDB version to deploy
	// +kubebuilder:validation:Pattern=`^\d+\.\d+$`
	Version string `json:"version"`

	// Storage configuration for data volumes
	Storage StorageSpec `json:"storage"`
}

type MongoDBClusterStatus struct {
	// Phase represents the current lifecycle phase
	Phase ClusterPhase `json:"phase,omitempty"`

	// ReadyMembers is the count of healthy replica set members
	ReadyMembers int32 `json:"readyMembers,omitempty"`

	// Conditions represent detailed status information
	Conditions []metav1.Condition `json:"conditions,omitempty"`

	// CurrentPrimary is the pod name of the current primary
	CurrentPrimary string `json:"currentPrimary,omitempty"`
}

// +kubebuilder:validation:Enum=Pending;Initializing;Running;Failed
type ClusterPhase string
```

The validation markers (+kubebuilder:validation:*) generate OpenAPI schema validation that the API server enforces. Invalid resources are rejected before your controller ever sees them. This shifts error detection left—users get immediate feedback rather than discovering configuration errors during reconciliation.
Design your spec fields around user intent, not implementation details. Users want “3 members” and “version 7.0”—they don’t want to specify pod templates and volume claim templates directly. The Operator translates high-level intent into the low-level Kubernetes resources required.
The status fields should provide enough information to diagnose issues without reading logs. Include the current phase, counts of ready members, the identity of the current primary, and detailed conditions that explain why the cluster is in its current state.
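Kubebuilder markers can also surface those status fields directly in kubectl get output. A sketch of the markers on the root type (the column choices here are illustrative):

```go
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="Phase",type=string,JSONPath=`.status.phase`
// +kubebuilder:printcolumn:name="Ready",type=integer,JSONPath=`.status.readyMembers`
// +kubebuilder:printcolumn:name="Primary",type=string,JSONPath=`.status.currentPrimary`
// +kubebuilder:printcolumn:name="Age",type=date,JSONPath=`.metadata.creationTimestamp`
type MongoDBCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   MongoDBClusterSpec   `json:"spec,omitempty"`
	Status MongoDBClusterStatus `json:"status,omitempty"`
}
```

With these in place, kubectl get mongodbclusters shows phase, ready count, and current primary at a glance, before anyone needs to read a single log line.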
Writing the Reconciliation Logic
The reconciler manages the full lifecycle: creating StatefulSets, initializing the replica set, and handling member changes:
```go
func (r *MongoDBClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	var cluster databasev1.MongoDBCluster
	if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Ensure StatefulSet exists
	sts := r.buildStatefulSet(&cluster)
	if err := controllerutil.SetControllerReference(&cluster, sts, r.Scheme); err != nil {
		return ctrl.Result{}, err
	}

	found := &appsv1.StatefulSet{}
	err := r.Get(ctx, types.NamespacedName{Name: sts.Name, Namespace: sts.Namespace}, found)
	if apierrors.IsNotFound(err) {
		log.Info("Creating StatefulSet", "name", sts.Name)
		if err := r.Create(ctx, sts); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
	} else if err != nil {
		return ctrl.Result{}, err
	}

	// Check if replica set needs initialization
	if cluster.Status.Phase == "" || cluster.Status.Phase == databasev1.PhasePending {
		if found.Status.ReadyReplicas >= 1 {
			if err := r.initializeReplicaSet(ctx, &cluster); err != nil {
				log.Error(err, "Failed to initialize replica set")
				return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
			}
			cluster.Status.Phase = databasev1.PhaseRunning
			if err := r.Status().Update(ctx, &cluster); err != nil {
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
	}

	// Reconcile member count
	if *found.Spec.Replicas != cluster.Spec.Members {
		found.Spec.Replicas = &cluster.Spec.Members
		if err := r.Update(ctx, found); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
	}

	// Update status
	cluster.Status.ReadyMembers = found.Status.ReadyReplicas
	if err := r.Status().Update(ctx, &cluster); err != nil {
		return ctrl.Result{}, err
	}

	return ctrl.Result{RequeueAfter: 60 * time.Second}, nil
}
```

The SetControllerReference call establishes ownership—when the MongoDBCluster is deleted, Kubernetes garbage collection automatically removes the StatefulSet. This ownership chain ensures that deleting the custom resource cleans up all dependent resources without requiring explicit cleanup logic.
Returning RequeueAfter schedules a future reconciliation, which is how the controller handles eventually-consistent operations. After creating a StatefulSet, we can’t immediately initialize the replica set because pods take time to become ready. By requeuing, we check back later when the pods are likely available.
Notice the pattern of making one change and then returning with a requeue. This keeps each reconciliation focused and makes the controller easier to debug. If you try to do everything in one pass, failures become harder to diagnose because you don’t know which step failed.
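The buildStatefulSet helper referenced above is where spec fields become a concrete StatefulSet. The version below is a trimmed sketch rather than a production template; the image naming, the mongod flags, the headless Service name, and the StorageSpec field names (Size as a string, StorageClassName) are all assumptions you would adapt.

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	databasev1 "github.com/example/mongodb-operator/api/v1"
)

func (r *MongoDBClusterReconciler) buildStatefulSet(cluster *databasev1.MongoDBCluster) *appsv1.StatefulSet {
	labels := map[string]string{"app": "mongodb", "cluster": cluster.Name}

	return &appsv1.StatefulSet{
		ObjectMeta: metav1.ObjectMeta{
			Name:      cluster.Name,
			Namespace: cluster.Namespace,
			Labels:    labels,
		},
		Spec: appsv1.StatefulSetSpec{
			ServiceName: cluster.Name, // assumes a matching headless Service exists
			Replicas:    &cluster.Spec.Members,
			Selector:    &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:    "mongod",
						Image:   "mongo:" + cluster.Spec.Version,
						Command: []string{"mongod", "--replSet", cluster.Name, "--bind_ip_all"},
						Ports:   []corev1.ContainerPort{{Name: "mongodb", ContainerPort: 27017}},
						VolumeMounts: []corev1.VolumeMount{{
							Name:      "data",
							MountPath: "/data/db",
						}},
					}},
				},
			},
			VolumeClaimTemplates: []corev1.PersistentVolumeClaim{{
				ObjectMeta: metav1.ObjectMeta{Name: "data"},
				Spec: corev1.PersistentVolumeClaimSpec{
					AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
					StorageClassName: &cluster.Spec.Storage.StorageClassName,
					// On older k8s.io/api versions this field is corev1.ResourceRequirements.
					Resources: corev1.VolumeResourceRequirements{
						Requests: corev1.ResourceList{
							corev1.ResourceStorage: resource.MustParse(cluster.Spec.Storage.Size),
						},
					},
				},
			}},
		},
	}
}
```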
Production Patterns: Status Reporting, Conditions, and Observability
A production Operator must be observable. When something goes wrong at 3 AM, the on-call engineer needs to diagnose the issue from kubectl describe output and Prometheus dashboards, not by reading controller logs. The status subresource is your primary interface with the humans operating the system.
Implementing Meaningful Status Conditions
Conditions provide a standardized way to communicate detailed status information. Each condition has a type, status (True/False/Unknown), reason, and message. The Kubernetes API conventions define a format that tooling understands:
```go
func (r *MongoDBClusterReconciler) updateCondition(
	cluster *databasev1.MongoDBCluster,
	condType string,
	status metav1.ConditionStatus,
	reason, message string,
) {
	condition := metav1.Condition{
		Type:               condType,
		Status:             status,
		ObservedGeneration: cluster.Generation,
		LastTransitionTime: metav1.Now(),
		Reason:             reason,
		Message:            message,
	}
	meta.SetStatusCondition(&cluster.Status.Conditions, condition)
}

// Usage in reconciliation:
if err := r.initializeReplicaSet(ctx, &cluster); err != nil {
	r.updateCondition(&cluster, "ReplicaSetInitialized", metav1.ConditionFalse,
		"InitializationFailed", err.Error())
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}
r.updateCondition(&cluster, "ReplicaSetInitialized", metav1.ConditionTrue,
	"Initialized", "Replica set initialized successfully")
```

The ObservedGeneration field indicates which version of the spec the condition reflects. If observedGeneration is less than metadata.generation, the controller hasn’t yet processed the latest spec changes. This helps users understand whether their changes have been applied.
Pro Tip: Design conditions around what operators need to diagnose issues. “Ready” is useful, but specific conditions like “ReplicaSetInitialized”, “BackupSucceeded”, and “AllMembersHealthy” tell a clearer story. When a cluster is unhealthy, operators can immediately see which specific check is failing.
Consider including conditions for each major capability: initialization status, backup status, connectivity to each member, replication lag within acceptable bounds, and version consistency across members. Each condition should be actionable—if it’s False, the message should indicate what needs to happen.
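Downstream logic can also key off these conditions instead of re-deriving state. A rough sketch using the apimachinery condition helpers (the condition types are the illustrative ones named above):

```go
// Skip backup scheduling until the replica set has been initialized.
if !meta.IsStatusConditionTrue(cluster.Status.Conditions, "ReplicaSetInitialized") {
	log.V(1).Info("Replica set not initialized yet, deferring backup reconciliation")
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}

// Surface the failing condition's reason and message when the cluster is degraded.
if cond := meta.FindStatusCondition(cluster.Status.Conditions, "AllMembersHealthy"); cond != nil && cond.Status == metav1.ConditionFalse {
	log.Info("Cluster degraded", "reason", cond.Reason, "message", cond.Message)
}
```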
Exposing Prometheus Metrics
controller-runtime integrates with the Prometheus client library. Register custom metrics to expose Operator-specific information:
```go
var (
	clusterMembersGauge = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "mongodb_cluster_members",
			Help: "Number of members in the MongoDB cluster",
		},
		[]string{"cluster", "namespace"},
	)

	reconcileErrorsCounter = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "mongodb_operator_reconcile_errors_total",
			Help: "Total number of reconciliation errors",
		},
		[]string{"cluster", "namespace", "error_type"},
	)

	reconcileDurationHistogram = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "mongodb_operator_reconcile_duration_seconds",
			Help:    "Duration of reconciliation loops",
			Buckets: prometheus.ExponentialBuckets(0.01, 2, 10),
		},
		[]string{"cluster", "namespace"},
	)
)

func init() {
	metrics.Registry.MustRegister(clusterMembersGauge, reconcileErrorsCounter, reconcileDurationHistogram)
}
```

Update metrics during reconciliation to provide real-time visibility into cluster state. Alert on error rate increases and member count anomalies. The reconcile duration histogram helps identify performance regressions in your controller.
Consider what questions operators will ask when investigating issues: How many reconciliations are happening? How long do they take? What errors are occurring? Which clusters are having problems? Design metrics to answer these questions directly.
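As a sketch of wiring these into the reconciler: a timer observes the duration histogram on every pass, the error counter increments on failure paths, and the gauge reflects the observed member count. The "statefulset_update" error_type label value is illustrative, not a fixed taxonomy.

```go
func (r *MongoDBClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Record reconcile duration regardless of outcome.
	timer := prometheus.NewTimer(reconcileDurationHistogram.WithLabelValues(req.Name, req.Namespace))
	defer timer.ObserveDuration()

	// ... reconciliation logic ...

	// On error paths, classify and count the failure before returning.
	if err != nil {
		reconcileErrorsCounter.WithLabelValues(req.Name, req.Namespace, "statefulset_update").Inc()
		return ctrl.Result{}, err
	}

	// Reflect the observed member count so dashboards track cluster size.
	clusterMembersGauge.WithLabelValues(req.Name, req.Namespace).Set(float64(cluster.Status.ReadyMembers))

	return ctrl.Result{}, nil
}
```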
Structured Logging for Debugging
controller-runtime uses structured logging through logr. Add context to log messages that helps trace issues back to specific resources:
```go
func (r *MongoDBClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx).WithValues(
		"cluster", req.Name,
		"namespace", req.Namespace,
	)

	log.V(1).Info("Starting reconciliation")

	// ... reconciliation logic ...

	if err != nil {
		log.Error(err, "Reconciliation failed",
			"phase", cluster.Status.Phase,
			"readyMembers", cluster.Status.ReadyMembers)
		return ctrl.Result{}, err
	}

	log.V(1).Info("Reconciliation complete",
		"phase", cluster.Status.Phase,
		"readyMembers", cluster.Status.ReadyMembers)

	return ctrl.Result{}, nil
}
```

Use log levels appropriately: Info for significant state changes, V(1) for routine operations useful during debugging, and Error only for actual errors. Include enough context to correlate log entries with specific resources and operations.
Leader Election for HA Operators
Production Operators run with multiple replicas for availability. Leader election ensures only one instance actively reconciles at a time:
```go
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
	Scheme:                 scheme,
	LeaderElection:         true,
	LeaderElectionID:       "mongodb-operator-leader",
	HealthProbeBindAddress: ":8081",
})
```

The non-leader replicas stay ready to take over if the leader fails, providing automatic failover for the Operator itself. Leader election uses a Kubernetes Lease object, so failover happens within seconds when the leader becomes unavailable.
Advanced Reconciliation: Handling Upgrades, Rollbacks, and State Transitions
Production workloads require more than basic CRUD operations. Upgrades must preserve availability, failures need graceful handling, and resource cleanup must be thorough. These scenarios require careful state management and ordered operations.
Safe Rolling Upgrades for Stateful Workloads
Upgrading a database cluster requires orchestrating changes in the correct order. For MongoDB, you upgrade secondaries first, then step down and upgrade the primary. This order ensures the cluster maintains a healthy primary throughout the upgrade:
```go
func (r *MongoDBClusterReconciler) reconcileUpgrade(
	ctx context.Context,
	cluster *databasev1.MongoDBCluster,
) (ctrl.Result, error) {
	members, err := r.getReplicaSetStatus(ctx, cluster)
	if err != nil {
		return ctrl.Result{}, err
	}

	// Find members needing upgrade
	for _, member := range members {
		if member.Version == cluster.Spec.Version {
			continue
		}

		// Never upgrade primary first
		if member.IsPrimary {
			continue
		}

		// Upgrade one secondary at a time
		if err := r.upgradeMember(ctx, cluster, member); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{RequeueAfter: 60 * time.Second}, nil
	}

	// All secondaries upgraded, handle primary
	primary := findPrimary(members)
	if primary != nil && primary.Version != cluster.Spec.Version {
		// Step down triggers election among upgraded secondaries
		if err := r.stepDownPrimary(ctx, cluster); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
	}

	return ctrl.Result{}, nil
}
```

The reconciler handles one member at a time, requeuing after each step to verify the cluster stabilized before continuing. This incremental approach limits blast radius—if an upgraded member fails to start, you haven’t affected the rest of the cluster.
Consider implementing upgrade gates: checks that must pass before proceeding with upgrades. For databases, verify replication is healthy, there’s no significant lag, and all members are reachable. These pre-flight checks prevent upgrades during periods of instability.
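A gate can be expressed as a single pre-flight function the upgrade path calls before touching any member. The sketch below assumes MemberStatus carries Name, Healthy, and ReplicationLag fields, which are illustrative; a real check would query the database driver for replica set health.

```go
// maxAcceptableLag is an assumed operator-level policy, not a MongoDB default.
const maxAcceptableLag = 30 * time.Second

func (r *MongoDBClusterReconciler) validateUpgradePrerequisites(ctx context.Context, cluster *databasev1.MongoDBCluster) error {
	members, err := r.getReplicaSetStatus(ctx, cluster)
	if err != nil {
		return fmt.Errorf("cannot read replica set status: %w", err)
	}

	// Refuse to upgrade while the topology is incomplete.
	if int32(len(members)) != cluster.Spec.Members {
		return fmt.Errorf("expected %d members, found %d", cluster.Spec.Members, len(members))
	}

	for _, m := range members {
		if !m.Healthy {
			return fmt.Errorf("member %s is unhealthy", m.Name)
		}
		// Secondaries must be close enough to the primary to survive a failover.
		if !m.IsPrimary && m.ReplicationLag > maxAcceptableLag {
			return fmt.Errorf("member %s lagging by %s", m.Name, m.ReplicationLag)
		}
	}
	return nil
}
```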
State Machine Patterns for Complex Lifecycles
Some operations span multiple reconciliation cycles and require tracking progress. Implement a state machine in your status to manage these transitions:
```go
type UpgradePhase string

const (
	UpgradePhaseNone                 UpgradePhase = ""
	UpgradePhasePreparing            UpgradePhase = "Preparing"
	UpgradePhaseUpgradingSecondaries UpgradePhase = "UpgradingSecondaries"
	UpgradePhaseSteppingDown         UpgradePhase = "SteppingDown"
	UpgradePhaseUpgradingPrimary     UpgradePhase = "UpgradingPrimary"
	UpgradePhaseCompleted            UpgradePhase = "Completed"
)

func (r *MongoDBClusterReconciler) reconcileUpgradeStateMachine(
	ctx context.Context,
	cluster *databasev1.MongoDBCluster,
) (ctrl.Result, error) {
	switch cluster.Status.UpgradePhase {
	case UpgradePhaseNone:
		if cluster.Spec.Version != cluster.Status.CurrentVersion {
			cluster.Status.UpgradePhase = UpgradePhasePreparing
			return r.updateStatusAndRequeue(ctx, cluster)
		}
	case UpgradePhasePreparing:
		if err := r.validateUpgradePrerequisites(ctx, cluster); err != nil {
			return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
		}
		cluster.Status.UpgradePhase = UpgradePhaseUpgradingSecondaries
		return r.updateStatusAndRequeue(ctx, cluster)
	// ... additional phases
	}
	return ctrl.Result{}, nil
}
```

The state machine approach makes upgrade progress visible in the status and enables graceful handling of controller restarts mid-upgrade. When the controller restarts, it reads the current phase and continues from where it left off.
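The updateStatusAndRequeue helper used above is not shown in the excerpt; a plausible minimal version persists the phase transition through the status subresource and asks for a quick follow-up pass (the 5-second interval is an assumption):

```go
func (r *MongoDBClusterReconciler) updateStatusAndRequeue(ctx context.Context, cluster *databasev1.MongoDBCluster) (ctrl.Result, error) {
	// Persist the phase transition so a restarted controller resumes correctly.
	if err := r.Status().Update(ctx, cluster); err != nil {
		return ctrl.Result{}, err
	}
	// Requeue promptly so the next phase runs without waiting for an external event.
	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}
```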
Finalizers for Clean Resource Teardown
When your Operator creates external resources—cloud storage buckets, DNS records, or entries in external systems—you need finalizers to ensure cleanup:
```go
const finalizerName = "database.example.com/cleanup"

func (r *MongoDBClusterReconciler) reconcileFinalizer(
	ctx context.Context,
	cluster *databasev1.MongoDBCluster,
) (ctrl.Result, error) {
	if cluster.DeletionTimestamp.IsZero() {
		// Resource not being deleted, ensure finalizer exists
		if !controllerutil.ContainsFinalizer(cluster, finalizerName) {
			controllerutil.AddFinalizer(cluster, finalizerName)
			if err := r.Update(ctx, cluster); err != nil {
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, nil
	}

	// Resource being deleted, clean up external resources
	if controllerutil.ContainsFinalizer(cluster, finalizerName) {
		if err := r.deleteExternalResources(ctx, cluster); err != nil {
			// Log error but don't block deletion forever
			log.Error(err, "Failed to delete external resources")
			// Consider implementing retry logic with eventual manual cleanup
		}
		controllerutil.RemoveFinalizer(cluster, finalizerName)
		if err := r.Update(ctx, cluster); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}
```

The finalizer blocks resource deletion until cleanup completes. This ensures external resources like backup storage or monitoring configurations are removed when the cluster is deleted.
Warning: Always implement finalizers from day one if your Operator creates external resources. Retrofitting cleanup logic after orphaned resources accumulate is painful. Consider what happens if your Operator is uninstalled—any resources it created outside Kubernetes will remain orphaned.
Design finalizer logic to be resilient. If external cleanup fails, decide whether to retry indefinitely (risking stuck deletions) or proceed after logging (risking orphaned resources). Some Operators implement a timeout after which they remove the finalizer with a warning, allowing manual cleanup of orphaned resources.
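One way to implement that timeout escape hatch is to compare against the deletion timestamp, which Kubernetes sets when deletion begins. The fragment below is a sketch of how the deletion branch above could be refined; the timeout values are assumed policy, not a library default.

```go
// cleanupTimeout is an assumed policy: how long to retry external cleanup
// before giving up and letting deletion proceed.
const cleanupTimeout = 30 * time.Minute

if err := r.deleteExternalResources(ctx, cluster); err != nil {
	if time.Since(cluster.DeletionTimestamp.Time) < cleanupTimeout {
		// Keep the finalizer and retry cleanup on the next reconciliation.
		return ctrl.Result{RequeueAfter: time.Minute}, nil
	}
	// Timeout exceeded: release the resource but flag the orphaned state loudly.
	log.Error(err, "External cleanup timed out; resources may require manual removal")
}
controllerutil.RemoveFinalizer(cluster, finalizerName)
if err := r.Update(ctx, cluster); err != nil {
	return ctrl.Result{}, err
}
return ctrl.Result{}, nil
```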
Testing Strategies: Unit Tests, Integration Tests, and Envtest
Operators require testing at multiple levels: unit tests for business logic, integration tests for controller behavior, and end-to-end tests for full system validation. A comprehensive testing strategy catches bugs before they affect production clusters.
Unit Testing Reconciliation Logic
Extract business logic into testable functions that don’t depend on the Kubernetes client:
```go
func TestShouldUpgradeMember(t *testing.T) {
	tests := []struct {
		name          string
		member        MemberStatus
		targetVersion string
		want          bool
	}{
		{
			name:          "secondary needs upgrade",
			member:        MemberStatus{Version: "6.0", IsPrimary: false},
			targetVersion: "7.0",
			want:          true,
		},
		{
			name:          "primary should not upgrade directly",
			member:        MemberStatus{Version: "6.0", IsPrimary: true},
			targetVersion: "7.0",
			want:          false,
		},
		{
			name:          "already at target version",
			member:        MemberStatus{Version: "7.0", IsPrimary: false},
			targetVersion: "7.0",
			want:          false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := shouldUpgradeMember(tt.member, tt.targetVersion)
			if got != tt.want {
				t.Errorf("shouldUpgradeMember() = %v, want %v", got, tt.want)
			}
		})
	}
}
```

Unit tests run in milliseconds and provide fast feedback during development. Structure your code to maximize the logic that can be unit tested without Kubernetes dependencies. Pure functions that transform data are easiest to test.
Integration Tests with Envtest
Envtest provides a real API server and etcd for testing controller behavior without a full cluster:
```go
var _ = Describe("MongoDBCluster Controller", func() {
	Context("When creating a new cluster", func() {
		It("Should create the StatefulSet", func() {
			cluster := &databasev1.MongoDBCluster{
				ObjectMeta: metav1.ObjectMeta{
					Name:      "test-cluster",
					Namespace: "default",
				},
				Spec: databasev1.MongoDBClusterSpec{
					Members: 3,
					Version: "7.0",
				},
			}
			Expect(k8sClient.Create(ctx, cluster)).To(Succeed())

			stsKey := types.NamespacedName{Name: "test-cluster", Namespace: "default"}
			Eventually(func() error {
				return k8sClient.Get(ctx, stsKey, &appsv1.StatefulSet{})
			}, timeout, interval).Should(Succeed())
		})
	})

	Context("When updating member count", func() {
		It("Should scale the StatefulSet", func() {
			// Create cluster with 3 members
			cluster := createTestCluster(3)
			stsKey := types.NamespacedName{Name: cluster.Name, Namespace: "default"}

			// Update to 5 members
			cluster.Spec.Members = 5
			Expect(k8sClient.Update(ctx, cluster)).To(Succeed())

			// Verify StatefulSet is updated
			Eventually(func() int32 {
				sts := &appsv1.StatefulSet{}
				k8sClient.Get(ctx, stsKey, sts)
				return *sts.Spec.Replicas
			}, timeout, interval).Should(Equal(int32(5)))
		})
	})
})
```

Envtest runs in seconds rather than minutes, making it practical to run in CI on every commit. It tests the full reconciliation loop including API server interactions, watch mechanisms, and controller-runtime machinery. The tests verify that your controller responds correctly to resource changes.
Use Ginkgo and Gomega for expressive test assertions. The Eventually function handles the asynchronous nature of controller reconciliation, polling until the expected state is reached or the timeout expires.
End-to-End Testing with Kind
For full integration validation, deploy your Operator to a kind cluster and verify behavior with real workloads:
```yaml
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: helm/kind-action@v1
      - name: Deploy operator
        run: make deploy IMG=test:latest
      - name: Run e2e tests
        run: make test-e2e
```

End-to-end tests validate networking, storage, and the full controller lifecycle in a realistic environment. They’re slower than envtest but catch issues that only appear with real infrastructure: volume provisioning, network policies, and resource limits.
Consider testing failure scenarios: what happens when a pod is killed? When a node becomes unavailable? When the API server is temporarily unreachable? These chaos tests build confidence in your Operator’s resilience.
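A failure-scenario test against a kind cluster can reuse the same Ginkgo style: delete a member pod outright and assert the Operator reports full recovery. The helper, pod name, and timeouts below are assumptions for illustration, not part of the earlier suite.

```go
It("Should recover after a member pod is deleted", func() {
	cluster := createTestCluster(3) // hypothetical helper, as in the earlier suite

	// Simulate a member failure by deleting one pod of the StatefulSet.
	pod := &corev1.Pod{ObjectMeta: metav1.ObjectMeta{Name: "test-cluster-0", Namespace: "default"}}
	Expect(k8sClient.Delete(ctx, pod)).To(Succeed())

	// The Operator should converge back to three ready members.
	Eventually(func() int32 {
		key := types.NamespacedName{Name: cluster.Name, Namespace: "default"}
		current := &databasev1.MongoDBCluster{}
		if err := k8sClient.Get(ctx, key, current); err != nil {
			return 0
		}
		return current.Status.ReadyMembers
	}, 5*time.Minute, 10*time.Second).Should(Equal(int32(3)))
})
```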
Deployment and Distribution: OLM, Helm Charts, and GitOps Integration
Building an Operator is half the challenge. Distributing and managing it across environments requires thoughtful packaging and deployment strategies that fit into existing workflows.
Operator Lifecycle Manager (OLM)
OLM provides a framework for managing Operator installations, upgrades, and dependencies. Create a ClusterServiceVersion (CSV) that describes your Operator’s requirements and capabilities:
```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: mongodb-operator.v1.0.0
spec:
  displayName: MongoDB Operator
  description: Manages MongoDB replica sets with automated failover and backups
  installModes:
    - type: OwnNamespace
      supported: true
    - type: AllNamespaces
      supported: true
  customresourcedefinitions:
    owned:
      - name: mongodbclusters.database.example.com
        version: v1
        kind: MongoDBCluster
```

The CSV declares which CRDs your Operator provides, what permissions it needs, and how to install it. OLM handles the rest—creating the deployment, setting up RBAC, and managing upgrades. The OperatorHub makes your Operator discoverable to users running OLM-enabled clusters.
Helm Charts for Flexible Deployment
Helm charts offer more flexibility than OLM for teams that don’t run it. Structure your chart to expose the configuration options operators need:
```yaml
replicaCount: 2

image:
  repository: ghcr.io/example/mongodb-operator
  tag: v1.0.0
  pullPolicy: IfNotPresent

resources:
  limits:
    cpu: 200m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 128Mi

serviceMonitor:
  enabled: false
  interval: 30s

rbac:
  create: true
  namespaced: false
```

Key configuration options include image repository and tag overrides for air-gapped environments, resource limits and replica counts for sizing, RBAC configuration for restricted namespaces, and Prometheus ServiceMonitor creation for monitoring integration.
GitOps with ArgoCD
Operators themselves benefit from GitOps workflows. Store your Operator deployment manifests in Git and let ArgoCD sync them:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mongodb-operator
spec:
  project: infrastructure
  source:
    repoURL: https://github.com/example/mongodb-operator
    path: deploy/helm
    targetRevision: v1.2.0
  destination:
    server: https://kubernetes.default.svc
    namespace: mongodb-operator-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

Version your Operator with semantic versioning. Document breaking changes in CRD schemas and provide migration paths. Your Operator is infrastructure—treat its releases with the same rigor as the applications it manages.
Consider CRD versioning carefully. Kubernetes supports conversion webhooks for migrating between CRD versions, but these add complexity. Where possible, maintain backward compatibility by adding optional fields rather than changing existing ones.
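In practice, backward-compatible evolution means new spec fields arrive as optional pointers with sensible defaults, so objects stored before the field existed still validate. A sketch of adding a hypothetical field to the v1 spec:

```go
type MongoDBClusterSpec struct {
	// ... existing fields ...

	// PodDisruptionBudget controls whether the Operator manages a PDB for the
	// cluster. Hypothetical field added in a later release; optional and
	// defaulted so existing stored resources remain valid without conversion.
	// +optional
	// +kubebuilder:default=true
	PodDisruptionBudget *bool `json:"podDisruptionBudget,omitempty"`
}
```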
Key Takeaways
- Start with Kubebuilder scaffolding and implement the minimal reconciliation loop before adding complexity—resist the urge to handle every edge case upfront
- Design your CRD status conditions as the primary debugging interface for operators; if you can’t diagnose issues from kubectl describe, your Operator isn’t production-ready
- Use envtest for controller integration tests in CI—it’s faster than spinning up real clusters and catches most reconciliation bugs before deployment
- Implement finalizers from day one for any Operator that creates external resources; cleaning up orphaned resources manually is a reliability nightmare
- Keep reconciliation functions idempotent and level-triggered—your controller should always converge to the correct state regardless of what events it may have missed
- Expose meaningful Prometheus metrics and structured logs; observability is not optional for production operators managing critical workloads