Building Production-Ready Kubernetes Operators: From CRD Design to Reconciliation Loops
Your team just spent three weeks building a custom deployment pipeline on top of Kubernetes. It works—deployments roll out, databases get provisioned, configurations sync across environments. But the implementation tells a different story: a growing collection of shell scripts triggered by webhooks, a Python service polling the API server every thirty seconds, and a ConfigMap that started as twelve lines of YAML and now spans four hundred lines of JSON-encoded state that nobody wants to touch.
The pattern is familiar. You needed to represent something Kubernetes doesn’t understand natively—maybe a database cluster, a tenant configuration, or a complex deployment strategy with custom health checks. ConfigMaps seemed like the obvious choice. They’re built-in, they’re simple, and they store arbitrary data. So you encoded your domain model as a JSON string, wrote some controllers to watch for changes, and bolted on validation logic in an admission webhook you now have to maintain separately.
Six months later, kubectl get configmaps returns a wall of text that tells operators nothing useful. Your validation webhook has drifted from your actual schema. Version upgrades require careful migration scripts because there’s no built-in versioning strategy. And when something breaks at 3 AM, your on-call engineer has to understand not just Kubernetes, but your entire custom orchestration layer that lives outside of it.
There’s a better way. Custom Resource Definitions let you extend the Kubernetes API itself, turning your domain concepts into first-class objects with native validation, versioning, and the same watch semantics that power every built-in controller. Your database cluster becomes a resource that kubectl describe understands. Your deployment strategy gets a schema that the API server enforces. Your operators can finally stop parsing JSON blobs and start working with typed objects.
The question isn’t whether CRDs are powerful—it’s whether they’re worth the investment for your use case.
When ConfigMaps Aren’t Enough: The Case for Custom Resources
Every Kubernetes journey includes a moment of reckoning with ConfigMaps. You start simple—a few configuration values, maybe some connection strings. Then requirements grow. Before long, you’re encoding complex application state in YAML strings, building validation logic in init containers, and hoping nobody fat-fingers a critical field.

This is the ConfigMap antipattern: treating a key-value store as a schema-less database for complex operational state.
The Limitations of String Blobs
ConfigMaps offer no structure beyond string keys and values. When you need to represent a database cluster with replicas, storage classes, backup schedules, and connection pooling settings, you’re left with two options: a massive YAML blob stored as a single string, or dozens of individual keys with naming conventions you pray everyone follows.
Neither scales. Neither validates. Neither integrates cleanly with the tools your platform teams already use.
What CRDs Actually Provide
Custom Resource Definitions solve this by extending the Kubernetes API itself. When you create a CRD, you gain:
Schema validation at admission time. Define your fields with OpenAPI v3 schemas. Kubernetes rejects malformed resources before they touch your controller. No more debugging why your operator crashed parsing unexpected input.
API versioning built-in. Ship v1alpha1, iterate to v1beta1, graduate to v1. Kubernetes handles conversion webhooks between versions, letting you evolve your API without breaking existing deployments.
Native kubectl integration. kubectl get databases works immediately. Add printer columns and your operators get the same first-class experience as Deployments and Services.
Watch semantics for free. Every CRD automatically supports the Kubernetes watch API. Your controller receives events for creates, updates, and deletes without polling or custom pub-sub infrastructure.
The Decision Framework
Build a CRD when you need to encode operational knowledge that spans multiple Kubernetes resources. A PostgreSQL cluster isn’t just a StatefulSet—it’s primary election logic, replica lag monitoring, backup orchestration, and connection pooling. That operational complexity deserves a dedicated API.
Stick with existing abstractions when Helm charts or Kustomize overlays adequately capture your requirements. Not every application needs a custom operator.
💡 Pro Tip: If you find yourself writing shell scripts that parse ConfigMap values and create other Kubernetes resources, you are already building an operator—just an implicit, fragile one. Make it explicit.
The patterns are clear in production: cert-manager’s Certificate resources, Strimzi’s Kafka clusters, Crossplane’s cloud provider bindings. Each encodes domain expertise into a declarative API that Kubernetes can reason about.
With the case for CRDs established, the next question is structure. A well-designed CRD separates desired state from observed reality through careful spec and status design.
Anatomy of a Well-Designed CRD
A CRD is an API contract. Every design decision—from naming conventions to validation rules—shapes how users interact with your operator for years to come. Get it right, and your resource feels like a native Kubernetes citizen. Get it wrong, and you’ll spend more time explaining your API than building features.
Naming Conventions That Scale
Your API group, version, and kind form the identity of your resource. Choose them with future maintainability in mind:
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresclusters.database.example.com
spec:
  group: database.example.com
  names:
    kind: PostgresCluster
    listKind: PostgresClusterList
    plural: postgresclusters
    singular: postgrescluster
    shortNames:
      - pg
      - pgc
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
```

Use a domain you control for the API group—database.example.com rather than generic names that risk collision. Like Go module paths, this anchors the name in a domain you own and prevents conflicts when multiple operators coexist in a cluster. Start with v1alpha1 for new APIs; this signals to users that breaking changes are expected. Progress through v1beta1 when the API stabilizes, and graduate to v1 only when you’re confident in long-term compatibility.
The shortNames field transforms verbose kubectl get postgresclusters commands into ergonomic kubectl get pg invocations. Choose abbreviations that are intuitive but unlikely to conflict with existing resources—check against built-in Kubernetes short names before committing.
Spec vs Status: The Declarative Contract
The separation between spec and status is fundamental to Kubernetes’ declarative model. Your users declare intent in spec; your operator reports reality in status. Never let users write to status, and never let your operator modify spec. This boundary isn’t merely convention—it’s the foundation of the reconciliation pattern that makes Kubernetes resilient.
```yaml
apiVersion: database.example.com/v1alpha1
kind: PostgresCluster
metadata:
  name: orders-db
  namespace: production
spec:
  version: "15.4"
  replicas: 3
  storage:
    size: 100Gi
    storageClass: fast-ssd
  backup:
    schedule: "0 2 * * *"
    retention: 7
status:
  ready: true
  currentVersion: "15.4"
  replicas: 3
  readyReplicas: 3
  lastBackup: "2024-01-15T02:00:00Z"
  conditions:
    - type: Available
      status: "True"
      lastTransitionTime: "2024-01-14T10:30:00Z"
      reason: AllReplicasReady
      message: "All 3 replicas are running and healthy"
```

The conditions array follows Kubernetes conventions: each condition has a type, status (True, False, or Unknown), and timestamps tracking state changes. Include reason and message fields to provide machine-parseable codes and human-readable explanations respectively.
💡 Pro Tip: Design your spec fields as the what, never the how. Users should specify replicas: 3, not the individual pod configurations. Your operator handles implementation details.
Schema Validation That Catches Mistakes Early
OpenAPI v3 schema validation moves error detection from reconciliation time to admission time. Define types, constraints, and defaults directly in your CRD:
```yaml
openAPIV3Schema:
  type: object
  required:
    - spec
  properties:
    spec:
      type: object
      required:
        - version
        - replicas
      properties:
        version:
          type: string
          pattern: '^\d+\.\d+$'
          description: PostgreSQL major.minor version
        replicas:
          type: integer
          minimum: 1
          maximum: 7
          default: 3
        storage:
          type: object
          properties:
            size:
              type: string
              pattern: '^\d+(Gi|Ti)$'
              default: 10Gi
```

The pattern field accepts regular expressions for string validation—useful for version strings, resource quantities, and identifiers. Use minimum and maximum for numeric bounds that match your operational constraints. The default field populates omitted values, reducing boilerplate in user manifests while ensuring sensible behavior.
For complex validation logic that exceeds OpenAPI’s capabilities—cross-field dependencies, external lookups, or business rules—consider implementing a validating admission webhook. The CRD schema handles structural validation; webhooks handle semantic validation.
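To make that split concrete, here is a minimal sketch of semantic validation that an OpenAPI schema cannot express. The trimmed-down spec types and the cross-field rule (backups require a standby replica) are illustrative assumptions; in a real operator this logic would run from the webhook’s ValidateCreate/ValidateUpdate hooks:

```go
import (
	"k8s.io/apimachinery/pkg/util/validation/field"
)

// BackupSpec and PostgresClusterSpec are simplified stand-ins for the
// generated CRD types used elsewhere in this article.
type BackupSpec struct {
	Schedule string
}

type PostgresClusterSpec struct {
	Replicas int32
	Backup   *BackupSpec
}

// validateSpec enforces a hypothetical cross-field rule. OpenAPI can bound
// replicas and backup independently, but it cannot relate the two fields.
func validateSpec(spec PostgresClusterSpec) error {
	var errs field.ErrorList
	path := field.NewPath("spec")

	if spec.Backup != nil && spec.Replicas < 2 {
		errs = append(errs, field.Invalid(
			path.Child("replicas"), spec.Replicas,
			"backups require at least 2 replicas"))
	}

	// ToAggregate returns nil when the list is empty.
	return errs.ToAggregate()
}
```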
Printer Columns for Operational Visibility
Custom printer columns surface critical status information in kubectl get output without requiring describe or JSON parsing:
```yaml
additionalPrinterColumns:
  - name: Ready
    type: boolean
    jsonPath: .status.ready
  - name: Version
    type: string
    jsonPath: .status.currentVersion
  - name: Replicas
    type: string
    jsonPath: .status.readyReplicas
    priority: 0
  - name: Age
    type: date
    jsonPath: .metadata.creationTimestamp
```

This transforms the default output into actionable operational data:

```
NAME        READY   VERSION   REPLICAS   AGE
orders-db   true    15.4      3          5d
```

The priority field controls column visibility: priority 0 columns appear in standard output, while higher values require -o wide. Use this to balance information density against terminal width constraints.
With your CRD structure defined and validation in place, the real work begins: implementing the reconciliation loop that transforms declared state into running infrastructure.
The Reconciliation Loop: Where Your Business Logic Lives
The reconciliation loop is the heart of every Kubernetes operator. While your CRD defines what users can express, the reconciler determines how those desires become reality. Understanding the design principles behind Kubernetes reconciliation separates operators that work in demos from those that survive production.

Level-Triggered vs Edge-Triggered: A Deliberate Choice
Kubernetes controllers are level-triggered, not edge-triggered. This distinction matters profoundly for reliability.
An edge-triggered system reacts to changes: “the replica count changed from 3 to 5.” A level-triggered system reacts to state: “the desired replica count is 5, but only 3 exist.” The difference becomes critical during failures. If an edge-triggered controller crashes mid-operation, it misses the event and never recovers. A level-triggered controller simply wakes up, observes the current state, compares it to the desired state, and acts accordingly.
This design mandates idempotent reconciliation. Your Reconcile function will be called repeatedly for the same resource—after restarts, after network partitions, after watch reconnections. Running it twice must produce the same result as running it once. This means every operation your reconciler performs should be safe to repeat: checking whether a resource exists before creating it, using server-side apply for updates, and ensuring cleanup operations handle already-deleted resources gracefully.
The level-triggered model also influences how you structure your reconciliation logic. Rather than tracking what changed and reacting to specific deltas, you declaratively compare the entire desired state against the entire actual state. This approach is more verbose but dramatically more robust—your controller self-heals from any inconsistency, not just the ones triggered by events it observed.
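As a concrete sketch of that compare-and-converge shape, controller-runtime’s controllerutil.CreateOrUpdate helper wraps the exists-check, desired-state construction, and conditional write in one idempotent call. The Database type and module path below are illustrative, mirroring the reconciler examples in the next subsection:

```go
import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	myappv1 "example.com/database-operator/api/v1" // hypothetical module path
)

func (r *DatabaseReconciler) reconcileStatefulSet(ctx context.Context, db *myappv1.Database) error {
	// Only name and namespace are set up front; CreateOrUpdate fetches the
	// existing object (if any) before the mutate function runs.
	sts := &appsv1.StatefulSet{
		ObjectMeta: metav1.ObjectMeta{Name: db.Name, Namespace: db.Namespace},
	}

	// The mutate function expresses the full desired state, so the call is
	// idempotent: a repeat run with no drift performs no write at all.
	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, sts, func() error {
		sts.Spec.Replicas = &db.Spec.Replicas
		// ... pod template, storage, and labels derived from db.Spec ...

		// Owner reference ties the child's lifecycle to the parent.
		return controllerutil.SetControllerReference(db, sts, r.Scheme)
	})
	return err
}
```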
The controller-runtime Machinery
The controller-runtime library provides the scaffolding for building robust reconcilers. At its core, you implement a single method:
```go
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// Fetch the Database instance
	var database myappv1.Database
	if err := r.Get(ctx, req.NamespacedName, &database); err != nil {
		if apierrors.IsNotFound(err) {
			// Resource deleted - nothing to reconcile
			logger.Info("Database resource not found; assuming it was deleted")
			return ctrl.Result{}, nil
		}
		return ctrl.Result{}, err
	}

	// Your business logic here: compare desired vs actual state
	// and take action to reconcile the difference

	return ctrl.Result{}, nil
}
```

The ctrl.Request contains only a namespace and name—deliberately minimal. The controller fetches fresh state on every reconciliation rather than trusting cached event data. This reinforces the level-triggered model: always work from current truth.
Behind the scenes, controller-runtime manages a work queue, watch connections, and event filtering. When you register your controller with predicates, you control which events trigger reconciliation. For high-frequency resources, filtering out status-only updates prevents unnecessary work. The framework also handles leader election for high-availability deployments, ensuring only one replica actively reconciles at a time.
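For instance, registering the controller with controller-runtime’s GenerationChangedPredicate drops events where metadata.generation did not change, which filters out exactly those status-only updates. A sketch, assuming the Database type and StatefulSet child from the surrounding examples:

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/predicate"

	myappv1 "example.com/database-operator/api/v1" // hypothetical module path
)

func (r *DatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		// Spec edits bump metadata.generation; status-only writes do not,
		// so this predicate suppresses reconciles triggered by our own
		// status updates.
		For(&myappv1.Database{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).
		// Still reconcile when an owned child drifts.
		Owns(&appsv1.StatefulSet{}).
		Complete(r)
}
```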
Finalizers: Cleaning Up External Resources
When your operator manages external resources (cloud databases, DNS records, certificates), you need finalizers to prevent orphaned infrastructure. Without them, Kubernetes deletes your custom resource immediately, giving your controller no opportunity to clean up.
```go
const databaseFinalizer = "database.myapp.io/finalizer"

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var database myappv1.Database
	if err := r.Get(ctx, req.NamespacedName, &database); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Check if resource is being deleted
	if !database.ObjectMeta.DeletionTimestamp.IsZero() {
		if controllerutil.ContainsFinalizer(&database, databaseFinalizer) {
			// Perform cleanup of external resources
			if err := r.deleteExternalDatabase(ctx, &database); err != nil {
				return ctrl.Result{}, err
			}

			// Remove finalizer to allow deletion to proceed
			controllerutil.RemoveFinalizer(&database, databaseFinalizer)
			if err := r.Update(ctx, &database); err != nil {
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, nil
	}

	// Add finalizer if not present
	if !controllerutil.ContainsFinalizer(&database, databaseFinalizer) {
		controllerutil.AddFinalizer(&database, databaseFinalizer)
		if err := r.Update(ctx, &database); err != nil {
			return ctrl.Result{}, err
		}
	}

	// Normal reconciliation logic
	return r.reconcileDatabase(ctx, &database)
}
```

The finalizer pattern creates a two-phase deletion process. When a user deletes the resource, Kubernetes sets the DeletionTimestamp but does not remove the object. Your controller detects this state, performs cleanup, removes the finalizer, and only then does the API server complete the deletion. This guarantees your cleanup logic executes even if the controller was offline when the delete request arrived.
💡 Pro Tip: Always add finalizers before creating external resources. If your controller crashes between resource creation and finalizer addition, you’ll have orphaned infrastructure with no cleanup mechanism.
Requeue Strategies for Resilient Operations
The ctrl.Result return value controls when your reconciler runs again. Choose your strategy based on the failure mode:
```go
// Immediate requeue: transient error, retry now
return ctrl.Result{Requeue: true}, nil

// Delayed requeue: wait for external system to stabilize
return ctrl.Result{RequeueAfter: 30 * time.Second}, nil

// Error requeue: let controller-runtime handle backoff
return ctrl.Result{}, fmt.Errorf("failed to provision: %w", err)

// Success, no requeue: desired state achieved
return ctrl.Result{}, nil
```

Returning an error triggers controller-runtime’s exponential backoff, which prevents hammering a failing external service. The default backoff starts at a few milliseconds and grows to several minutes, with jitter to prevent thundering herds. For operations that require polling (waiting for a cloud database to become available), use RequeueAfter with a reasonable interval—typically 15 to 60 seconds depending on the expected provisioning time.
For long-running provisioning operations, implement a state machine in your resource’s status. Each reconciliation advances the state one step, with appropriate requeue delays between transitions. This pattern survives controller restarts and provides clear observability into progress. A database resource might transition through Pending, Provisioning, ConfiguringNetwork, Ready—with each state persisted before the next step begins.
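A hedged sketch of that state machine, assuming a Phase string field on the status and hypothetical helpers (startProvisioning, instanceAvailable, configureNetwork) for each provisioning step:

```go
import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// Phase values persisted in status; each reconcile advances at most one step.
const (
	PhasePending            = "Pending"
	PhaseProvisioning       = "Provisioning"
	PhaseConfiguringNetwork = "ConfiguringNetwork"
	PhaseReady              = "Ready"
)

func (r *DatabaseReconciler) advancePhase(ctx context.Context, db *myappv1.Database) (ctrl.Result, error) {
	switch db.Status.Phase {
	case "", PhasePending:
		// startProvisioning is a hypothetical, idempotent helper that
		// kicks off the cloud provisioning call.
		if err := r.startProvisioning(ctx, db); err != nil {
			return ctrl.Result{}, err
		}
		db.Status.Phase = PhaseProvisioning

	case PhaseProvisioning:
		ready, err := r.instanceAvailable(ctx, db) // polls the cloud API
		if err != nil {
			return ctrl.Result{}, err
		}
		if !ready {
			return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
		}
		db.Status.Phase = PhaseConfiguringNetwork

	case PhaseConfiguringNetwork:
		if err := r.configureNetwork(ctx, db); err != nil {
			return ctrl.Result{}, err
		}
		db.Status.Phase = PhaseReady

	case PhaseReady:
		return ctrl.Result{}, nil // desired state achieved; no requeue
	}

	// Persist the transition before the next step runs, so a controller
	// restart resumes from the last completed state.
	return ctrl.Result{Requeue: true}, r.Status().Update(ctx, db)
}
```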
With reconciliation logic in place, your operator can create and manage resources—but users need visibility into what’s happening. Status conditions provide that window into operational state.
Status Reporting and Condition Management
A Kubernetes resource without proper status reporting is a black box. Users resort to kubectl describe, log diving, and tribal knowledge to understand what’s happening. Well-designed status reporting transforms your operator from a mystery into a self-documenting system that integrates with existing Kubernetes tooling and monitoring infrastructure.
The Condition Type: A Standard Vocabulary
Kubernetes establishes a convention for expressing resource health through Conditions—a slice of structs that answer “what is the current state of this resource?” in a machine-readable format. This pattern, borrowed from core resources like Pods and Deployments, provides consistency across the ecosystem.
```go
type DatabaseStatus struct {
	Conditions         []metav1.Condition `json:"conditions,omitempty"`
	ObservedGeneration int64              `json:"observedGeneration,omitempty"`
	ReadyReplicas      int32              `json:"readyReplicas,omitempty"`
	Endpoint           string             `json:"endpoint,omitempty"`
}

const (
	ConditionTypeReady       = "Ready"
	ConditionTypeProvisioned = "Provisioned"
	ConditionTypeDegraded    = "Degraded"
)
```

Each condition carries a Type, Status (True/False/Unknown), Reason (machine-readable), Message (human-readable), and LastTransitionTime. This structure enables kubectl wait --for=condition=Ready out of the box and integrates with GitOps tools like Argo CD for health assessments. The Reason field should be a CamelCase identifier suitable for programmatic consumption, while Message provides the human-friendly context that appears in kubectl describe output.
Choose your condition types carefully. The Ready condition should represent the resource’s ability to serve its primary function. Supplementary conditions like Provisioned, Degraded, or Progressing communicate nuanced states that help operators diagnose issues without resorting to logs.
ObservedGeneration: The Staleness Guard
When you update a resource’s spec, Kubernetes increments .metadata.generation. Your reconciler must track this in .status.observedGeneration to signal “I’ve processed your latest changes.” This simple integer prevents a class of subtle bugs where stale status misleads users and automation.
```go
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var db myapiv1.Database
	if err := r.Get(ctx, req.NamespacedName, &db); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Perform reconciliation logic...

	// Update observed generation and conditions
	db.Status.ObservedGeneration = db.Generation
	meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
		Type:               ConditionTypeReady,
		Status:             metav1.ConditionTrue,
		Reason:             "ReconcileSuccess",
		Message:            "Database is ready and accepting connections",
		ObservedGeneration: db.Generation,
	})

	return ctrl.Result{}, r.Status().Update(ctx, &db)
}
```

Consumers check status.observedGeneration == metadata.generation to know whether the status reflects the current spec. Without this, your status lies—it describes a previous version of the resource. CI/CD pipelines and GitOps controllers rely on this comparison to determine deployment completion. A missing or stale observedGeneration causes rollouts to hang indefinitely or, worse, report success prematurely.
Aggregating Status from Child Resources
Operators typically manage multiple child resources: StatefulSets, Services, ConfigMaps, Secrets. Your status must synthesize their states into a coherent picture. This aggregation shields users from implementation details while surfacing actionable information.
```go
func (r *DatabaseReconciler) aggregateChildStatus(ctx context.Context, db *myapiv1.Database) error {
	var sts appsv1.StatefulSet
	if err := r.Get(ctx, types.NamespacedName{
		Name:      db.Name + "-db",
		Namespace: db.Namespace,
	}, &sts); err != nil {
		return err
	}

	db.Status.ReadyReplicas = sts.Status.ReadyReplicas

	if sts.Status.ReadyReplicas < *sts.Spec.Replicas {
		meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
			Type:   ConditionTypeDegraded,
			Status: metav1.ConditionTrue,
			Reason: "ReplicasUnavailable",
			Message: fmt.Sprintf("%d/%d replicas ready",
				sts.Status.ReadyReplicas, *sts.Spec.Replicas),
		})
	}

	return nil
}
```

When aggregating, apply the principle of escalation: if any child resource is unhealthy, the parent should reflect that degradation. Consider implementing a priority system for conditions—a Degraded state from a critical component should override Ready conditions from healthy peripherals. This prevents false confidence when partial failures occur.
💡 Pro Tip: Use meta.SetStatusCondition from k8s.io/apimachinery/pkg/api/meta rather than manually managing the conditions slice. It handles deduplication and timestamp updates correctly, ensuring LastTransitionTime only changes when the condition’s status actually transitions.
Events: The Audit Trail
Conditions capture current state; events capture history. Emit events for significant state transitions, configuration changes, and errors. Events provide the breadcrumb trail operators need when troubleshooting at 3 AM.
```go
func (r *DatabaseReconciler) handleProvisioningComplete(ctx context.Context, db *myapiv1.Database) {
	r.Recorder.Event(db, corev1.EventTypeNormal, "Provisioned",
		fmt.Sprintf("Database %s provisioned with endpoint %s", db.Name, db.Status.Endpoint))
}

func (r *DatabaseReconciler) handleProvisioningFailed(ctx context.Context, db *myapiv1.Database, err error) {
	r.Recorder.Event(db, corev1.EventTypeWarning, "ProvisioningFailed",
		fmt.Sprintf("Failed to provision database: %v", err))
}
```

Events appear in kubectl describe, feed into cluster-level event aggregators, and integrate with monitoring systems that watch for warning events. Be judicious with event emission—too many events create noise, while too few leave gaps in the operational narrative. Focus on state transitions, errors, and actions that modify external systems.
The combination of conditions, observed generation, and events creates a complete observability surface for your custom resource. Users can query current state through conditions, verify freshness through observed generation, and reconstruct history through events. But status reporting is only valuable if your reconciler behaves correctly—and proving correctness requires testing. Let’s explore how to test operators without spinning up a full cluster.
Testing Operators Without a Cluster
Production operators demand the same testing rigor as any critical infrastructure code. The challenge: Kubernetes controllers are deeply integrated with the API server. Fortunately, the ecosystem provides tools that let you test reconciliation logic without spinning up a full cluster.
Unit Testing with Fake Clients
Start with the fastest feedback loop. Controller-runtime’s fake client lets you test reconciliation logic in isolation:
```go
func TestDatabaseReconciler_CreatesStatefulSet(t *testing.T) {
	scheme := runtime.NewScheme()
	_ = clientgoscheme.AddToScheme(scheme)
	_ = dbv1alpha1.AddToScheme(scheme)

	database := &dbv1alpha1.Database{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "orders-db",
			Namespace: "production",
		},
		Spec: dbv1alpha1.DatabaseSpec{
			Engine:   "postgres",
			Version:  "15.4",
			Replicas: 3,
		},
	}

	fakeClient := fake.NewClientBuilder().
		WithScheme(scheme).
		WithObjects(database).
		Build()

	reconciler := &DatabaseReconciler{
		Client: fakeClient,
		Scheme: scheme,
	}

	_, err := reconciler.Reconcile(context.Background(), ctrl.Request{
		NamespacedName: types.NamespacedName{
			Name:      "orders-db",
			Namespace: "production",
		},
	})
	require.NoError(t, err)

	var sts appsv1.StatefulSet
	err = fakeClient.Get(context.Background(), types.NamespacedName{
		Name:      "orders-db",
		Namespace: "production",
	}, &sts)
	require.NoError(t, err)
	assert.Equal(t, int32(3), *sts.Spec.Replicas)
}
```

Fake clients excel at testing happy paths and verifying that reconciliation creates expected resources. They run in milliseconds and require no external dependencies. However, be aware of their limitations: fake clients don’t enforce resource validation, don’t maintain realistic field defaults, and won’t catch issues with watch semantics or caching behavior. Use them for testing pure business logic, not API interactions.
Integration Testing with envtest
Unit tests with fake clients miss critical behaviors: webhook validation, real API semantics, and race conditions. Enter envtest, which spins up real kube-apiserver and etcd binaries:
```go
var (
	testEnv   *envtest.Environment
	k8sClient client.Client
)

func TestMain(m *testing.M) {
	testEnv = &envtest.Environment{
		CRDDirectoryPaths: []string{
			filepath.Join("..", "config", "crd", "bases"),
		},
		ErrorIfCRDPathMissing: true,
	}

	cfg, err := testEnv.Start()
	if err != nil {
		panic(err)
	}

	k8sClient, err = client.New(cfg, client.Options{
		Scheme: scheme.Scheme,
	})
	if err != nil {
		panic(err)
	}

	code := m.Run()
	_ = testEnv.Stop()
	os.Exit(code)
}
```

With envtest running, your integration tests exercise the full reconciliation cycle against a real API server. Test multi-step workflows, watch behavior, and status updates with confidence that they’ll work identically in production.
For complex workflows involving multiple reconciliation cycles, structure tests to wait for expected state rather than assuming synchronous behavior:
```go
func TestDatabaseReconciler_FullLifecycle(t *testing.T) {
	ctx := context.Background()
	database := createTestDatabase("lifecycle-test", "default")

	require.NoError(t, k8sClient.Create(ctx, database))

	// Wait for StatefulSet creation
	Eventually(func() error {
		var sts appsv1.StatefulSet
		return k8sClient.Get(ctx, client.ObjectKeyFromObject(database), &sts)
	}, 10*time.Second, 100*time.Millisecond).Should(Succeed())

	// Simulate pod readiness and verify status propagation
	updateStatefulSetStatus(ctx, database, 3)

	Eventually(func() int32 {
		var db dbv1alpha1.Database
		_ = k8sClient.Get(ctx, client.ObjectKeyFromObject(database), &db)
		return db.Status.ReadyReplicas
	}, 5*time.Second).Should(Equal(int32(3)))
}
```

This pattern—create, wait, verify—prevents flaky tests caused by reconciliation timing.
Property-Based Testing for CRD Validation
OpenAPI schemas in your CRD catch invalid inputs at admission time. Verify your validation rules actually reject malformed resources using property-based testing:
```go
func TestDatabaseSpec_Validation(t *testing.T) {
	rapid.Check(t, func(t *rapid.T) {
		replicas := rapid.Int32Range(-100, 100).Draw(t, "replicas")
		version := rapid.StringMatching(`[a-z0-9.-]{0,20}`).Draw(t, "version")

		spec := dbv1alpha1.DatabaseSpec{
			Replicas: replicas,
			Version:  version,
		}

		err := spec.Validate()

		if replicas < 1 || replicas > 10 {
			assert.Error(t, err, "should reject replicas=%d", replicas)
		}
		if !semverPattern.MatchString(version) {
			assert.Error(t, err, "should reject version=%s", version)
		}
	})
}
```

Property-based tests generate hundreds of edge cases you’d never think to write manually. They’re particularly valuable for catching boundary conditions in numeric ranges and string patterns. Consider testing combinations of fields that might have interdependencies—for example, certain engine types might only support specific version formats.
💡 Pro Tip: Run envtest-based tests in CI with setup-envtest to automatically download the correct API server binaries for your Kubernetes version. Pin the version to match your target clusters.
Building a Layered Test Strategy
A comprehensive test suite combines all three approaches: fake client tests for fast iteration on business logic, envtest for integration confidence, and property-based tests for validation coverage. Structure your test pyramid accordingly—hundreds of unit tests, dozens of integration tests, and property-based tests covering each validation rule.
Run fake client tests on every save during development. Run envtest tests before commits and in CI. Run property-based tests with higher iteration counts in nightly builds. This layered strategy catches bugs before they reach any cluster—staging or production.
With testing infrastructure in place, you’ll eventually need to evolve your CRD schema. Managing those changes without breaking existing resources requires careful version planning.
Versioning and Upgrades: Living with Your API
Your CRD is a contract. The moment you ship v1alpha1, you’ve made promises to every manifest in every Git repository that references it. Breaking those promises means broken deployments, angry on-call engineers, and lengthy migration projects. The good news: Kubernetes provides robust mechanisms for evolving APIs gracefully—but only if you understand and implement them correctly from the start.
Storage Versions and Conversion Webhooks
Kubernetes stores all custom resources in a single version—the storage version—regardless of which API version clients use. When you introduce v1beta1 alongside v1alpha1, you need a conversion webhook to translate between them:
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.mycompany.io
spec:
  group: mycompany.io
  names:
    kind: Database
    plural: databases
  scope: Namespaced
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1"]
      clientConfig:
        service:
          name: database-operator
          namespace: database-system
          path: /convert
          port: 443
  versions:
    - name: v1beta1
      served: true
      storage: true  # New storage version
      schema:
        openAPIV3Schema:
          # ... v1beta1 schema
    - name: v1alpha1
      served: true
      storage: false  # Still served, but not stored
      schema:
        openAPIV3Schema:
          # ... v1alpha1 schema
```

The conversion webhook receives objects in their stored version and converts them to whichever version the client requested. This bidirectional conversion must be lossless—data written via v1alpha1 must survive a round-trip through v1beta1 and back. Failing to maintain lossless conversion leads to subtle data corruption that may not surface until a critical recovery scenario.
When implementing your conversion webhook, handle unknown fields gracefully by preserving them in annotations. This forward-compatibility pattern allows newer clients to store data that older versions don’t understand, preventing data loss during rolling upgrades when different operator versions coexist temporarily.
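One common way to implement that stash-and-restore pattern is kubebuilder’s hub-and-spoke conversion interfaces. In this sketch v1beta1 is the hub, and the annotation key, Database types, and field names are illustrative assumptions:

```go
import (
	"encoding/json"
	"fmt"

	"sigs.k8s.io/controller-runtime/pkg/conversion"

	v1beta1 "example.com/database-operator/api/v1beta1" // hypothetical module path
)

// Illustrative annotation key for stashing fields v1alpha1 cannot represent.
const backupStashAnnotation = "database.mycompany.io/v1beta1-backup"

// ConvertTo upgrades this v1alpha1 Database (spoke) to the v1beta1 hub,
// restoring any backup block stashed during a previous downgrade so the
// round-trip stays lossless.
func (src *Database) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*v1beta1.Database)
	dst.ObjectMeta = src.ObjectMeta
	dst.Spec.Engine = src.Spec.Engine
	dst.Spec.Version = src.Spec.Version

	if raw, ok := src.Annotations[backupStashAnnotation]; ok {
		if err := json.Unmarshal([]byte(raw), &dst.Spec.Backup); err != nil {
			return fmt.Errorf("restoring stashed backup config: %w", err)
		}
		delete(dst.Annotations, backupStashAnnotation)
	}
	return nil
}

// ConvertFrom downgrades the v1beta1 hub into this v1alpha1 spoke, stashing
// the v1beta1-only backup block in an annotation instead of dropping it.
func (dst *Database) ConvertFrom(srcRaw conversion.Hub) error {
	src := srcRaw.(*v1beta1.Database)
	dst.ObjectMeta = src.ObjectMeta
	dst.Spec.Engine = src.Spec.Engine
	dst.Spec.Version = src.Spec.Version

	if src.Spec.Backup != nil {
		raw, err := json.Marshal(src.Spec.Backup)
		if err != nil {
			return fmt.Errorf("stashing backup config: %w", err)
		}
		if dst.Annotations == nil {
			dst.Annotations = map[string]string{}
		}
		dst.Annotations[backupStashAnnotation] = string(raw)
	}
	return nil
}
```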
Adding Fields Safely
New optional fields with sensible defaults are always safe. Required fields are breaking changes—full stop. When you need to add required functionality, introduce it as optional first, then enforce it through validation webhooks after a deprecation period:
```yaml
apiVersion: mycompany.io/v1beta1
kind: Database
metadata:
  name: orders-db
spec:
  engine: postgres
  version: "15"
  # New in v1beta1: optional with defaulting webhook
  backup:
    enabled: true
    schedule: "0 2 * * *"
    retention: 7d
```

💡 Pro Tip: Use a mutating webhook to inject defaults for new fields. This keeps your CRD schema clean while ensuring all resources have complete specifications.
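A minimal sketch of such a defaulting webhook via controller-runtime’s Defaulter interface, with default values mirroring the manifest above (the BackupSpec fields are illustrative):

```go
import "sigs.k8s.io/controller-runtime/pkg/webhook"

// Compile-time check that Database implements the defaulting interface.
var _ webhook.Defaulter = &Database{}

// Default injects the documented backup defaults so older manifests that
// omit the block entirely still reach the controller fully specified.
func (r *Database) Default() {
	if r.Spec.Backup == nil {
		r.Spec.Backup = &BackupSpec{Enabled: true}
	}
	if r.Spec.Backup.Schedule == "" {
		r.Spec.Backup.Schedule = "0 2 * * *"
	}
	if r.Spec.Backup.Retention == "" {
		r.Spec.Backup.Retention = "7d"
	}
}
```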
For field removals, never delete fields outright. Instead, mark them as deprecated in your documentation, stop reading them in your controller logic, and remove them only in the next major version. This gives users a full version cycle to update their manifests.
Deprecation Timelines
Follow Kubernetes’ own deprecation policy: deprecated API versions remain available for at least three minor releases. This provides users adequate time to migrate without emergency changes. Announce deprecations through:
- CRD annotations marking deprecated versions
- Operator logs warning when deprecated versions are used
- Admission webhooks that emit warnings (not rejections) for deprecated resources
- Release notes and migration guides published well before removal
Consider providing automated migration tools—a CLI command or controller mode that rewrites resources from deprecated versions to current ones. This dramatically reduces user friction and accelerates adoption of your newer APIs.
Helm and ArgoCD Considerations
CRD lifecycle creates ordering challenges that catch many teams off guard. Helm installs CRDs before other resources but won’t upgrade them by default—a deliberate safety measure that prevents accidental breaking changes. For production deployments, manage CRDs separately from your operator Helm chart:
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - crds/      # Applied first, managed independently
  - operator/
```

ArgoCD users should enable ServerSideApply for CRDs to handle large schemas that exceed annotation size limits, and configure sync waves to ensure CRDs exist before the operator attempts to watch them. Set Replace=true for CRD sync options when schema changes would otherwise fail validation during updates.
With versioning patterns established, the final step is hardening your operator for production traffic and failure scenarios.
Production Hardening Checklist
Your operator passes tests and handles reconciliation correctly. Before deploying to production, verify these non-functional requirements that separate demo-quality operators from production-grade infrastructure.
RBAC: Least Privilege by Default
Grant your operator only the permissions it needs. Start with an empty ClusterRole and add verbs incrementally as your controller requires them. If your operator manages Deployments, it needs get, list, watch, create, update, and patch on Deployments—not a blanket * on all resources. Use namespaced Roles when your operator works within a single namespace, reserving ClusterRoles for truly cluster-wide concerns.
Audit your RBAC configuration with kubectl auth can-i --list --as=system:serviceaccount:your-ns:your-operator to verify the effective permissions match your expectations.
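If you scaffold with kubebuilder, those grants live as RBAC markers above the reconciler, and controller-gen regenerates the ClusterRole from them, keeping permissions in sync with what the code actually does. The API groups and resources below are illustrative:

```go
//+kubebuilder:rbac:groups=myapp.io,resources=databases,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=myapp.io,resources=databases/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=myapp.io,resources=databases/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch

// The markers above are compiled into config/rbac/role.yaml by
// `make manifests`; no blanket wildcard grants required.
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// ...
	return ctrl.Result{}, nil
}
```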
Leader Election for High Availability
Running multiple operator replicas without leader election causes duplicate reconciliations, resource conflicts, and unpredictable behavior. Enable leader election using the controller-runtime’s built-in mechanism. The leader acquires a Lease object and periodically renews it; if the leader crashes, another replica takes over within seconds.
Configure reasonable lease durations: 15-second lease duration with 10-second renewal and 2-second retry period works well for most deployments.
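With controller-runtime, those knobs are set on the manager options. A sketch using the durations above, where the election ID is an arbitrary but unique Lease name:

```go
import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func newManager() (ctrl.Manager, error) {
	leaseDuration := 15 * time.Second
	renewDeadline := 10 * time.Second
	retryPeriod := 2 * time.Second

	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:   true,
		LeaderElectionID: "database-operator.myapp.io", // Lease object name
		LeaseDuration:    &leaseDuration,
		RenewDeadline:    &renewDeadline,
		RetryPeriod:      &retryPeriod,
	})
}
```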
Resource Limits and Watch Caching
Operators watching thousands of resources consume significant memory. Set appropriate resource requests and limits based on your cluster size. Enable watch caching in the manager to reduce API server load—shared informers cache watched objects locally, avoiding redundant API calls across controllers.
For large clusters, consider namespace selectors or label filters to reduce the working set your operator must track.
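In recent controller-runtime releases (v0.15+), that scoping is configured through the manager’s cache options. The namespaces and label selector below are illustrative:

```go
import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func newScopedManager() (ctrl.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			// Cache only the namespaces this operator owns.
			DefaultNamespaces: map[string]cache.Config{
				"databases-prod":    {},
				"databases-staging": {},
			},
			// Cache only Secrets the operator labels, not every Secret
			// in scope; a significant memory win on large clusters.
			ByObject: map[client.Object]cache.ByObject{
				&corev1.Secret{}: {
					Label: labels.SelectorFromSet(labels.Set{
						"app.kubernetes.io/managed-by": "database-operator",
					}),
				},
			},
		},
	})
}
```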
Observability Endpoints
Expose /healthz and /readyz endpoints for Kubernetes probes. The controller-runtime provides these by default; ensure your Deployment references them. Export Prometheus metrics on /metrics covering reconciliation duration, error counts, and queue depth. Add distributed tracing spans around external calls to debug latency in production.
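Registering those probes with the manager takes two calls; a minimal sketch, where the bind address must match the port your Deployment’s liveness and readiness probes target:

```go
import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
)

func setupProbes(mgr ctrl.Manager) error {
	// Served as /healthz and /readyz on the manager's
	// HealthProbeBindAddress (e.g. ":8081" in ctrl.Options).
	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
		return err
	}
	return mgr.AddReadyzCheck("readyz", healthz.Ping)
}
```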
💡 Pro Tip: Set up alerts on reconciliation error rates and queue saturation before your first production deployment. Discovering observability gaps during an incident wastes precious debugging time.
With these foundations in place, your operator is ready for production workloads. The patterns covered in this guide—from CRD design through reconciliation logic to operational hardening—form a complete framework for encoding operational knowledge into Kubernetes-native APIs.
Key Takeaways
- Start with your API design: write example YAML manifests before any Go code to validate the user experience
- Implement status conditions and observedGeneration from day one—debugging operators without proper status is painful
- Use envtest in CI to catch reconciler bugs before they hit staging; it’s faster than you think
- Plan for version migrations early: adding a conversion webhook later is much harder than shipping with one