ECS vs EKS vs Self-Managed Kubernetes: Choosing Your AWS Container Strategy
Your team just got the greenlight to containerize a monolith, and now you’re staring at three viable AWS options—ECS, EKS, and self-managed Kubernetes on EC2—each with its own operational cost, lock-in profile, and learning curve. The AWS documentation won’t tell you which one to pick, and your last three architecture meetings ended in a draw. Here’s how to break the tie.
The frustrating truth is that AWS designed these options to serve genuinely different needs, not to exist on a beginner-to-expert spectrum. Picking EKS over ECS doesn’t make your architecture more sophisticated—it makes it more expensive to operate if your team doesn’t need what Kubernetes actually provides. And spinning up self-managed Kubernetes on EC2 is a legitimate choice, but only in a narrow set of circumstances where control outweighs the cost of owning the control plane yourself.
Most teams default to Kubernetes because it’s what the industry talks about. That instinct produces a lot of over-engineered infrastructure for workloads that would run cleanly on ECS with a fraction of the operational surface area. The container orchestration decision is really three separate questions compressed into one: How much AWS lock-in are you comfortable with? How much portability do you actually need? And how much operational overhead is your team willing to own long-term?
Those three dimensions—lock-in, portability, and operational overhead—cut through the noise faster than any feature comparison matrix. Before choosing a platform, let’s establish what the landscape actually looks like and why the choice is less obvious than AWS’s own documentation implies.
The Container Orchestration Landscape on AWS: Why the Choice Isn’t Obvious
Running containers on AWS means choosing between three fundamentally different operational models—and the right answer depends on your team’s maturity, your tolerance for cloud lock-in, and how much infrastructure complexity you’re willing to own. This is not a question of beginner versus advanced. It’s a question of which trade-offs you’re actually willing to live with at 2 AM when something breaks in production.

AWS gives you three distinct paths:
Amazon ECS is AWS’s native container orchestrator. It has no upstream open-source equivalent, integrates directly with IAM, ALB, CloudWatch, and Fargate, and abstracts away the control plane entirely. You trade portability for operational simplicity.
Amazon EKS runs upstream, unmodified Kubernetes with a managed control plane. AWS handles etcd, the API server, and availability—you handle everything else: node groups, add-ons, networking, upgrades, and the operational surface that comes with Kubernetes. You get portability and the full Kubernetes ecosystem, but you also inherit Kubernetes’s complexity.
Self-managed Kubernetes on EC2 gives you complete control over every layer of the stack—control plane configuration, etcd tuning, custom admission webhooks, and network plugins. You also take on complete responsibility for availability, upgrades, and security patching. Almost no team at scale finds this trade-off worth it unless they have hard regulatory or customization requirements that EKS cannot satisfy.
The “Just Use Kubernetes” Trap
The gravitational pull toward Kubernetes is real. It’s the industry default, the ecosystem is enormous, and it signals technical sophistication. But defaulting to Kubernetes without interrogating the actual requirements routinely produces infrastructure that is three times harder to operate than the workload demands. A team running five services with predictable traffic patterns on ECS with Fargate ships features. The same team spending two sprints debugging CoreDNS or IRSA propagation delays is not.
The decision has three actual axes:
- Lock-in: How tightly can your infrastructure couple to AWS-specific primitives (task definitions, IAM task roles, ALB integrations) before exit costs become unacceptable?
- Portability: Do your workloads need Helm charts, Kubernetes-native tooling, or CRDs from the broader ecosystem?
- Operational overhead: What is the fully-loaded cost—in engineering hours—of running and maintaining this platform?
💡 Pro Tip: Portability is often cited as the primary reason to choose Kubernetes, but most teams that start on ECS never actually migrate to another cloud. Weigh hypothetical portability against concrete operational cost before making it the deciding factor.
The following sections break down each option in depth—starting with ECS, where AWS lock-in is a deliberate architectural feature, not a limitation to work around.
Amazon ECS: When AWS Lock-In Is a Feature, Not a Bug
For teams operating entirely within the AWS ecosystem, ECS’s deep native integration is not a compromise—it is the architecture. Every component in the AWS stack that your service needs to touch works with ECS out of the box: IAM task roles for fine-grained credential scoping, Application Load Balancer for service discovery and traffic routing, CloudWatch Container Insights for metrics and logs, and ECR for image management. There is no Helm chart to write, no Ingress controller to configure, and no third-party operator to maintain. That is the trade-off, and for the right team, it is the correct one.
Native AWS Integration Without the Glue
When you deploy a service on ECS, IAM roles attach directly to the task definition. Your container receives short-lived credentials via the ECS metadata endpoint with no Kubernetes service account annotations, no IRSA setup, and no Vault sidecar required. ALB target group registration is handled by the ECS service scheduler automatically—add a container port to your task definition and ECS registers and deregisters targets as tasks scale up and down.
This eliminates an entire class of operational complexity that EKS teams manage as routine work. The cost is portability: your ECS task definitions are AWS-specific constructs with no equivalent outside this ecosystem.
A Production-Ready Fargate Task Definition
The following task definition represents a realistic microservices configuration: a Node.js API with CloudWatch logging, Secrets Manager integration, and resource limits appropriate for a moderate-traffic service.
```yaml
family: payments-api
networkMode: awsvpc
requiresCompatibilities:
  - FARGATE
cpu: "512"
memory: "1024"
taskRoleArn: arn:aws:iam::123456789012:role/payments-api-task-role
executionRoleArn: arn:aws:iam::123456789012:role/ecs-task-execution-role
containerDefinitions:
  - name: payments-api
    image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/payments-api:1.4.2
    portMappings:
      - containerPort: 3000
        protocol: tcp
    environment:
      - name: NODE_ENV
        value: production
      - name: AWS_REGION
        value: us-east-1
    secrets:
      - name: DB_PASSWORD
        valueFrom: arn:aws:secretsmanager:us-east-1:123456789012:secret:payments/db-password-xK93mA
      - name: STRIPE_SECRET_KEY
        valueFrom: arn:aws:secretsmanager:us-east-1:123456789012:secret:payments/stripe-key-pQ71nB
    logConfiguration:
      logDriver: awslogs
      options:
        awslogs-group: /ecs/payments-api
        awslogs-region: us-east-1
        awslogs-stream-prefix: payments
    healthCheck:
      command: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
      interval: 30
      timeout: 5
      retries: 3
      startPeriod: 60
    readonlyRootFilesystem: true
    user: "1000"
```

Notice what is absent: there is no sidecar for secret injection, no init container for credential bootstrapping, and no cluster-level RBAC policy to configure. The secrets block in the container definition handles Secrets Manager retrieval at task startup, with the execution role providing the necessary secretsmanager:GetSecretValue permission.
The Fargate Cost Equation
For a typical microservices workload—16 services averaging 512 CPU units and 1 GB memory, running at steady state across two availability zones—the operational cost comparison between Fargate and EKS managed nodes is instructive.
EKS requires a minimum of two m5.large worker nodes ($0.096/hour each) plus the EKS control plane fee ($0.10/hour), totaling approximately $213/month before any workload costs. Fargate compute for the same 16 services at the above specifications runs on the order of $200–290/month depending on purchase option (on-demand versus Compute Savings Plans or Fargate Spot)—with no node management overhead, no OS patching cycle, and no capacity planning for bin packing.
The crossover point arrives when you operate at scale: above roughly 50 services or when workloads require GPU access, spot instance strategies, or custom kernel parameters, EKS managed node groups become more cost-effective. Below that threshold, Fargate on ECS is frequently both cheaper and operationally simpler.
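The arithmetic behind that comparison is easy to sanity-check yourself. The sketch below uses illustrative on-demand us-east-1 rates (assumed values; check current AWS pricing, and note that Compute Savings Plans or Fargate Spot lower the Fargate figure substantially):

```python
# Back-of-envelope monthly cost: ECS on Fargate vs. an EKS node baseline.
# All rates are illustrative placeholders for us-east-1 on-demand pricing;
# verify against current AWS pricing before relying on these numbers.

HOURS_PER_MONTH = 730

# Fargate bills per vCPU-hour and per GB-hour of memory (assumed rates, USD).
FARGATE_VCPU_HOUR = 0.04048
FARGATE_GB_HOUR = 0.004445

# EKS: per-cluster control plane fee plus worker node instance cost.
EKS_CONTROL_PLANE_HOUR = 0.10   # USD per cluster
M5_LARGE_HOUR = 0.096           # USD per m5.large node

def fargate_monthly(services: int, vcpu_per_service: float, gb_per_service: float) -> float:
    """Steady-state Fargate compute cost for N always-on services."""
    vcpu_cost = services * vcpu_per_service * FARGATE_VCPU_HOUR * HOURS_PER_MONTH
    mem_cost = services * gb_per_service * FARGATE_GB_HOUR * HOURS_PER_MONTH
    return vcpu_cost + mem_cost

def eks_baseline_monthly(nodes: int) -> float:
    """EKS control plane fee plus m5.large workers, before any workload sizing."""
    return (EKS_CONTROL_PLANE_HOUR + nodes * M5_LARGE_HOUR) * HOURS_PER_MONTH
```

Plugging in 16 services at 0.5 vCPU and 1 GB each gives a Fargate figure in the high $200s at on-demand list rates, against roughly $213 for the two-node EKS baseline. The real comparison shifts with Savings Plans, Spot, and how densely you can bin-pack the EKS nodes, which is exactly why the crossover point discussed next matters.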
💡 Pro Tip: Enable ECS Exec (`enableExecuteCommand: true` on your service) from day one. Once a container is running in Fargate without shell access, debugging a subtle startup failure requires a full redeploy. ECS Exec gives you interactive shell access into running Fargate tasks without opening inbound network rules.
When ECS Is the Right Answer
ECS is the correct choice when three conditions hold simultaneously: your team has no dedicated platform engineering function, your roadmap has no multi-cloud or on-premises requirements, and your workloads fit comfortably within Fargate’s resource ceiling (16 vCPU, 120 GB memory per task). If any of these conditions change—particularly the portability requirement—the migration path to EKS is well-documented, since both services consume the same container images and can share the same ECR repositories and IAM patterns.
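Fargate's ceiling is not just the 16 vCPU / 120 GB maximum; it only accepts specific CPU/memory pairings, so it is worth validating a task size before committing. The checker below encodes the pairings as documented at the time of writing (verify against the current ECS documentation before relying on it):

```python
# Valid Fargate CPU/memory combinations (CPU units -> allowed memory in GB).
# These ranges reflect documented Fargate task sizes at the time of writing;
# treat them as a snapshot and confirm against current AWS documentation.

FARGATE_MEMORY_GB = {
    256:   {0.5, 1, 2},             # 0.25 vCPU
    512:   set(range(1, 5)),        # 0.5 vCPU: 1-4 GB
    1024:  set(range(2, 9)),        # 1 vCPU: 2-8 GB
    2048:  set(range(4, 17)),       # 2 vCPU: 4-16 GB
    4096:  set(range(8, 31)),       # 4 vCPU: 8-30 GB
    8192:  set(range(16, 61, 4)),   # 8 vCPU: 16-60 GB in 4 GB steps
    16384: set(range(32, 121, 8)),  # 16 vCPU: 32-120 GB in 8 GB steps (the ceiling)
}

def fits_fargate(cpu_units: int, memory_gb: float) -> bool:
    """True if the requested task size is a valid Fargate combination."""
    return memory_gb in FARGATE_MEMORY_GB.get(cpu_units, set())
```

A service that needs, say, 8 GB of memory on half a vCPU simply has no valid Fargate shape; you either resize the task or move to EC2-backed capacity.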
EKS solves a different problem: it gives you Kubernetes’s API surface and ecosystem when the operational investment is justified by scale, team capability, or platform requirements that extend beyond what ECS exposes. That trade-off is the subject of the next section.
Amazon EKS: Managed Kubernetes Without the Control Plane Burden
Amazon EKS removes the most painful part of running Kubernetes—operating the control plane. AWS manages etcd, the API server, controller managers, and scheduler across multiple availability zones, handles control plane upgrades, and backs the whole thing with a 99.95% SLA. What you retain is everything else: worker nodes, networking configuration, add-on lifecycle, and the operational surface that comes with running upstream Kubernetes.
That last point is the defining characteristic of EKS. Your manifests are portable. A Deployment, HorizontalPodAutoscaler, or NetworkPolicy written for EKS runs on GKE, AKS, or an on-premises cluster with minimal modification. For organizations pursuing a multi-cloud strategy or hedging against vendor lock-in, this portability is the primary reason to absorb EKS’s complexity.
Node Compute Models
EKS offers three compute strategies, each with a different operational trade-off:
Managed node groups provision EC2 instances inside your VPC and manage their lifecycle. AWS handles AMI updates and graceful node draining during upgrades—you define instance types and scaling boundaries, and that's largely it. This is the right default for most teams.
Self-managed nodes give you full control over the EC2 configuration: custom AMIs, specialized instance families, or configurations that managed node groups don’t expose. You own the upgrade process entirely.
Fargate profiles eliminate node management completely. Pods matching a namespace or label selector run on AWS-managed infrastructure with no EC2 instances in your account. The trade-off is a constrained feature set—no DaemonSets, no privileged containers, no host networking—and cold start latency that makes it unsuitable for latency-sensitive workloads.
Add-Ons: What EKS Provides vs. What You Own
EKS ships with a minimal viable cluster. To reach production-readiness, you install add-ons—and understanding the ownership boundary here prevents operational surprises.
The VPC CNI plugin assigns native VPC IP addresses to pods, enabling direct routing without overlay networks. AWS manages the plugin version when installed as a managed add-on, but you own the IP address planning. Under-allocating your VPC CIDR is a common mistake that’s expensive to fix later.
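The IP planning math is concrete. With the default VPC CNI behavior (no prefix delegation), a node's pod capacity is bounded by its ENI and per-ENI IP limits; AWS's published formula is ENIs × (IPs per ENI − 1) + 2, where each ENI reserves one address for itself and the 2 accounts for the host-network aws-node and kube-proxy pods. A sketch with limits for a few common instance types:

```python
# Max pods per node under the default VPC CNI (no prefix delegation).
# ENI and per-ENI IPv4 limits below come from AWS instance specifications;
# verify them for your instance family before capacity planning.

ENI_LIMITS = {
    # instance type: (max ENIs, IPv4 addresses per ENI)
    "t3.medium":  (3, 6),
    "m5.large":   (3, 10),
    "m5.xlarge":  (4, 15),
    "c5.4xlarge": (8, 30),
}

def max_pods(instance_type: str) -> int:
    """AWS formula: ENIs * (IPs per ENI - 1) + 2."""
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    return enis * (ips_per_eni - 1) + 2
```

An m5.large tops out at 29 pods, and every one of those pods consumes a routable VPC address. A ten-node group therefore needs close to 300 free IPs in its subnets, which is why an undersized CIDR is expensive to fix later.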
The AWS Load Balancer Controller provisions Application and Network Load Balancers from Kubernetes Ingress and Service resources. The EBS CSI driver handles persistent volume provisioning. Both require IAM permissions to call AWS APIs—which brings us to the most important EKS security primitive.
IRSA and ECR Access
IAM Roles for Service Accounts (IRSA) maps a Kubernetes service account to an AWS IAM role using OIDC federation. This replaces broad node-level IAM permissions for application AWS calls and gives each workload the minimum access it needs.
One nuance worth knowing: ECR image pulls themselves are performed by the kubelet, which authenticates with the node's instance role—typically via the AmazonEC2ContainerRegistryReadOnly managed policy, which grants ecr:GetAuthorizationToken, ecr:BatchGetImage, and ecr:GetDownloadUrlForLayer—not via IRSA. IRSA governs what your application code can call once the pod is running. The service account annotation pattern looks like this:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eks-ecr-pull-role
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      serviceAccountName: app-service-account
      containers:
        - name: api
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/api-service:v1.4.2
          ports:
            - containerPort: 8080
```

The kubelet's credential provider fetches a short-lived ECR token automatically. No image pull secrets, no long-lived credentials stored in Kubernetes secrets.
💡 Pro Tip: Set `imagePullPolicy: Always` in staging environments to catch stale cached images early. In production, `IfNotPresent` combined with immutable image tags (commit SHAs, not `latest`) gives you both performance and deployment determinism.
When EKS Complexity Pays Off
EKS justifies its operational overhead in specific circumstances: your engineering organization already has Kubernetes expertise and would spend more time re-learning ECS primitives than operating EKS; you have scheduling requirements—GPU node affinity, topology spread constraints, custom schedulers—that ECS task placement strategies can’t express; or portability across cloud providers is a concrete business requirement, not a theoretical hedge.
For teams without existing Kubernetes investment, EKS’s learning curve is real and front-loaded. The next section examines the far end of the control spectrum: running self-managed Kubernetes on EC2, where you own the control plane entirely.
Self-Managed Kubernetes on EC2: Full Control, Full Responsibility
In 2025, the vast majority of teams running Kubernetes on AWS use EKS. But a meaningful minority still manages their own control planes on EC2—and they have specific, defensible reasons for doing so.

Why Teams Still Go Self-Managed
The motivations fall into four categories:
Compliance and data sovereignty. Some regulated industries require that no component of the control plane runs on infrastructure you don’t own. Air-gapped environments in defense, government, or financial services can’t delegate etcd to a managed service.
Custom networking requirements. Teams running DPDK workloads, SR-IOV, or latency-sensitive HPC jobs sometimes need CNI configurations that EKS doesn’t support. When your networking stack is the product, you can’t abstract it away.
Version control. EKS lags Kubernetes upstream by several months. If your platform team builds tooling against specific alpha APIs or needs a version that isn’t yet available in EKS, self-managed is your only path.
Cost at extreme scale. The EKS control plane fee is $0.10/hour per cluster—roughly $876/year. That sounds trivial until you’re running 200 clusters for multi-tenant isolation. At that point, the arithmetic changes.
The Three Realistic Paths
kubeadm is the lowest-level option: you provision EC2 instances, run kubeadm init on a control plane node, and join workers manually. It gives you complete visibility into every configuration decision, but every upgrade, certificate rotation, and etcd backup is a manual operation.
kops automates cluster provisioning and lifecycle management on AWS. It generates Terraform or CloudFormation, manages Auto Scaling Groups for control plane nodes, and handles rolling upgrades. It’s the closest thing to “managed” you get without EKS, but the tooling is opinionated and the operational burden doesn’t disappear—it shifts.
Cluster API is the modern approach: Kubernetes-native cluster lifecycle management where clusters themselves are Kubernetes resources. It integrates cleanly with GitOps workflows and supports multi-cloud topologies, but the learning curve is steep and the AWS provider requires careful tuning.
What You’re Actually Signing Up For
Self-managed Kubernetes means owning the entire operational surface: etcd backups and point-in-time restore procedures, control plane node replacement when an instance fails, TLS certificate rotation before expiry, CNI upgrades coordinated with node drain cycles, and Kubernetes version upgrades that must be tested against your workloads before rollout.
A realistic estimate for a three-node HA control plane is four to eight engineering hours per month in steady state—more during upgrade cycles or incident response.
💡 Pro Tip: Before committing to self-managed, price out the actual engineering cost. At a fully-loaded rate of $150/hour, eight hours per month per cluster costs $14,400/year. EKS at $876/year plus the operational delta is almost always cheaper until you cross 15–20 clusters.
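That arithmetic is worth encoding so the inputs are explicit. The sketch below uses the rates from the tip plus one assumption of ours: some operational toil remains even on EKS (the specific hour counts are illustrative, not measured):

```python
# Annualized per-cluster cost of control plane ownership.
# Hours and rates are assumptions; substitute your own numbers.

ENG_RATE = 150.0            # fully loaded engineering $/hour
EKS_FEE_YEAR = 0.10 * 8760  # EKS control plane fee: ~$876/year per cluster

def ops_cost_year(hours_per_month: float) -> float:
    """Engineering cost of steady-state operational toil."""
    return hours_per_month * 12 * ENG_RATE

# Self-managed: the text's steady-state estimate of 8 hours/month/cluster.
self_managed = ops_cost_year(8.0)

# EKS: assume a quarter of that toil remains (nodes, add-ons, upgrades).
eks = EKS_FEE_YEAR + ops_cost_year(2.0)
```

With these inputs, a self-managed cluster costs roughly $14,400/year against about $4,500/year for EKS. The comparison only tips toward self-managed when automation makes most of the per-cluster hours fixed rather than marginal, which is precisely the large-fleet scenario described above.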
Self-managed Kubernetes is the right call in a narrow set of conditions: air-gapped deployments, exotic hardware requirements, or scale economics that genuinely favor it. For everyone else, the question is whether EKS or ECS better fits their operational model—and that comes down to team maturity and how much Kubernetes surface area you actually need.
Decision Framework: Mapping Team Maturity to Orchestration Choice
The right orchestration choice is not a feature matrix exercise—it is an organizational fit problem. Four axes determine where your team lands: Kubernetes expertise, portability requirements, operational maturity, and workload complexity. Score your team honestly on each before committing.
The Four Decision Axes
Kubernetes expertise is the most decisive axis. ECS has a learning curve measured in days; EKS in weeks; self-managed Kubernetes in months before a team operates it confidently in production. If your platform team has zero Kubernetes background, EKS will consume engineering capacity that ships no product value for the first quarter.
Portability requirements cut the other way. If your architecture runs workloads across GCP or Azure, or if an acquisition forces a hybrid-cloud consolidation in your near-term roadmap, Kubernetes manifests travel. ECS task definitions do not.
Operational maturity determines whether managed control planes are a ceiling or a floor. Teams that have never run a Kubernetes upgrade cycle should not own their own etcd. Teams that must patch the control plane within a 24-hour compliance SLA have no choice but to own it.
Workload complexity covers scheduling requirements: GPU node pools, spot interruption handling, bin-packing across heterogeneous instance types, custom admission webhooks. ECS handles the common case well. When the uncommon case becomes the common case, ECS abstractions become friction.
Decision Matrix at a Glance
| Scenario | Optimal Choice |
|---|---|
| AWS-only, team < 5 engineers | ECS |
| Greenfield, tight 90-day launch | ECS |
| Existing K8s investment, multi-cloud | EKS |
| Advanced scheduling (GPU, spot, affinity) | EKS |
| FIPS compliance, air-gapped environment | Self-managed |
| Hyperscale with aggressive cost targets | Self-managed |
| On-prem Kubernetes extending to AWS | Self-managed |
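The matrix collapses into a first-pass triage function. This is a deliberately crude sketch (the field names and thresholds are our illustrative choices, not canon), useful mainly for forcing the inputs to be stated explicitly before the architecture meeting starts:

```python
# First-pass orchestration triage mirroring the decision matrix above.
# Thresholds (e.g. the 50-service cutoff) are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class TeamProfile:
    k8s_expertise: bool          # can the team operate Kubernetes today?
    multi_cloud_required: bool   # concrete portability requirement, not a hedge
    air_gapped_or_fips: bool     # hard compliance constraint on the control plane
    service_count: int
    advanced_scheduling: bool    # GPUs, custom schedulers, complex affinity

def recommend(p: TeamProfile) -> str:
    """Return a starting-point recommendation, in matrix priority order."""
    if p.air_gapped_or_fips:
        return "self-managed"
    if p.multi_cloud_required or p.advanced_scheduling:
        return "EKS"
    if p.k8s_expertise and p.service_count > 50:
        return "EKS"
    return "ECS"
```

Treat a disagreement between this function's output and your team's instinct as the interesting signal: it usually means one of the inputs is being scored dishonestly.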
ECS to EKS: What Migration Actually Costs
The trigger for migrating from ECS to EKS is almost always one of three things: a multi-cloud mandate, the need for custom schedulers, or Kubernetes-native tooling that your SRE team has already standardized on (Argo CD, KEDA, Karpenter). When that trigger arrives, the migration is mechanical but non-trivial.
The core translation work maps ECS task definitions to Kubernetes Deployments and Services. A representative migration of a single service looks like this:
```bash
#!/usr/bin/env bash
#
# Retrieve the current ECS task definition and stand up an equivalent EKS workload

CLUSTER="production-eks"
REGION="us-east-1"
ACCOUNT_ID="123456789012"
SERVICE="payments-api"

# Export current ECS task definition for reference
aws ecs describe-task-definition \
  --task-definition "${SERVICE}" \
  --region "${REGION}" \
  --output json > "/tmp/${SERVICE}-task-def.json"

# Update kubeconfig for target EKS cluster
aws eks update-kubeconfig \
  --name "${CLUSTER}" \
  --region "${REGION}"

# Apply translated Kubernetes manifests
kubectl apply -f "./k8s/${SERVICE}/deployment.yaml"
kubectl apply -f "./k8s/${SERVICE}/service.yaml"
kubectl apply -f "./k8s/${SERVICE}/hpa.yaml"

# Verify rollout before cutting over DNS/ALB weights
kubectl rollout status deployment/"${SERVICE}" -n production
```

💡 Pro Tip: Run ECS and EKS in parallel behind a weighted ALB target group during cutover. Shift 10% of traffic to the EKS target first, validate error rates and p99 latency, then complete the shift. This eliminates the big-bang risk and gives you a one-command rollback.
Budget two to four weeks of engineering time per service cluster for this migration, excluding IAM re-mapping and secret store migration from AWS Secrets Manager to Kubernetes-native External Secrets Operator.
Self-managed Kubernetes is rarely a migration destination from ECS or EKS—it is typically a deliberate greenfield choice made under specific regulatory or cost constraints. Treat it as a separate architectural decision, not an upgrade path.
With your orchestration choice made, the next practical question is how to wire production-grade workloads into it: service mesh configuration, observability pipelines, and database connectivity patterns that work across all three options.
Production Patterns: Service Mesh, Observability, and Database Connectivity
The orchestration platform you choose shapes how you implement production concerns, but it doesn’t eliminate them. Database connectivity, distributed tracing, and image hygiene are non-negotiable regardless of whether your containers run on ECS or EKS. What differs is the tooling surface and the operational overhead each platform imposes.
Database Connectivity: IAM Auth vs. Secrets Manager
Both ECS and EKS support IAM-based authentication to RDS and Aurora, but the implementation paths diverge. On ECS, the task role grants the container an IAM identity automatically—no sidecar, no token rotation logic required. On EKS, you need IRSA (IAM Roles for Service Accounts) to bind a Kubernetes service account to an IAM role before the pod can authenticate.
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/rds-iam-auth-role
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  namespace: production
spec:
  template:
    spec:
      serviceAccountName: api-service
      containers:
        - name: api
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/api-service:v2.1.0
          env:
            - name: DB_HOST
              value: "aurora-cluster.cluster-cxyz1234abcd.us-east-1.rds.amazonaws.com"
            - name: DB_AUTH_MODE
              value: "iam"
```

For workloads where IAM auth isn't viable—legacy applications expecting a static password—AWS Secrets Manager with automatic rotation is the correct pattern on both platforms. On EKS, the Secrets Store CSI Driver mounts secrets as volumes, keeping credentials out of environment variables and out of etcd.
💡 Pro Tip: Avoid injecting database credentials as Kubernetes Secrets without envelope encryption. Enable KMS encryption for your EKS cluster’s etcd secrets store before your first production deployment, not after.
Observability: CloudWatch Container Insights vs. the Prometheus Stack
CloudWatch Container Insights deploys as a DaemonSet on EKS and as a built-in agent on ECS Fargate. For teams already invested in the AWS ecosystem, it delivers CPU, memory, network, and storage metrics with zero additional infrastructure. The trade-off is limited cardinality and weaker alerting ergonomics compared to Prometheus.
On EKS, the Prometheus/Grafana stack gives you full metric cardinality, PromQL-based alerting, and ecosystem integrations with tools like Loki and Tempo. The operational cost is real: you own the Prometheus retention strategy, the Alertmanager routing, and the Grafana datasource configuration. Amazon Managed Service for Prometheus (AMP) removes the self-hosting burden while preserving the PromQL interface—a reasonable middle ground for teams that want flexibility without running stateful monitoring workloads.
Service Mesh: When the Complexity Is Justified
AWS App Mesh integrates natively with ECS service discovery and requires no sidecar injection webhooks. It handles mTLS, traffic shaping, and observability at the mesh layer with minimal configuration overhead. For ECS-native shops, App Mesh is the right default.
On EKS, Istio provides a significantly richer feature set—fine-grained traffic policies, WASM extensibility, and deeper observability—but the operational surface is substantial. Linkerd is the pragmatic alternative: lower resource overhead, simpler certificate management, and a smaller attack surface.
Adopt a service mesh when you have more than a handful of services exchanging sensitive data or when you need circuit-breaking and retry logic that your application code doesn’t implement. A two-service deployment doesn’t need Istio.
ECR as a Shared Foundation
ECR lifecycle policies and image scanning apply identically whether your consumer is ECS or EKS. Define lifecycle rules at repository creation to prevent unbounded storage costs, and enable Inspector-based enhanced scanning to catch CVEs before images reach production.
With production patterns established, the final section translates this decision framework into a concrete sequence of steps—from initial infrastructure provisioning to running your first production workload.
Actionable Next Steps: From Decision to Running Workload
You have the framework. Now turn it into a running workload before the analysis paralysis sets in.
Run This Checklist Before Writing a Single Line of Infrastructure Code
Answer these five questions honestly:
- Do you have Kubernetes expertise on the team today? If no one can explain a `PodDisruptionBudget` or debug a `CrashLoopBackOff` without Googling, EKS or ECS Fargate is your starting point—not self-managed.
- Are you building for multi-cloud portability in the next 18 months? If yes, invest in EKS. If no, ECS removes an entire category of operational overhead.
- How many services are you running? Under ten services with straightforward traffic patterns is ECS territory. Over fifty with complex routing, canary deployments, and cross-service auth—EKS earns its complexity.
- What does your on-call rotation look like? Self-managed Kubernetes requires engineers who are willing to own 3am control plane alerts.
- What’s your compliance posture? Specific data residency or audit requirements sometimes mandate self-managed, but verify this before assuming it.
Bootstrap Your Choice This Week
Each option has a fast path to a working cluster:
- EKS: `eksctl create cluster --name my-cluster --region us-east-1 --nodegroup-name standard-workers --node-type t3.medium --nodes 3` gets you a production-grade cluster in under 20 minutes.
- ECS: The AWS CDK `aws-ecs-patterns` library provides battle-tested constructs—`ApplicationLoadBalancedFargateService` deploys a load-balanced service in roughly 40 lines of TypeScript.
- Self-managed: Cluster API on EC2 with the CAPA (Cluster API Provider AWS) provider gives you reproducible, GitOps-friendly cluster lifecycle management from day one.
The One Trap to Avoid
Teams routinely spend three weeks debating orchestration while their containerization project stalls entirely. Pick the option that fits your current team maturity, ship something to production, and treat the decision as revisable. Organizations graduate from ECS to EKS as their Kubernetes fluency grows—this migration path is well-worn and tooled. The worst container strategy is the one that never runs.
💡 Pro Tip: Set a calendar reminder for six months out to re-evaluate your choice against team growth, service count, and operational load. The right answer at 5 engineers and 8 services is rarely the right answer at 25 engineers and 60 services.
The preceding sections gave you the architectural depth to understand why these trade-offs exist—the production patterns for service mesh, observability, and database connectivity apply regardless of which orchestrator you choose.
Key Takeaways
- Default to ECS with Fargate if your stack is AWS-only and your team lacks dedicated Kubernetes expertise—you’ll ship faster and operate less infrastructure.
- Choose EKS when portability across clouds or on-prem is a real requirement, or when your org already has Kubernetes investment that would cost more to abandon than to manage.
- Run the actual TCO calculation before self-managing Kubernetes on EC2: include engineering hours for upgrades, etcd ops, and certificate rotation, not just EC2 instance costs.
- Use IRSA (IAM Roles for Service Accounts) on EKS to authenticate pods to AWS services—never mount long-lived credentials in container images or environment variables.
- Treat your container orchestration choice as revisable: start with the simplest option that meets your current requirements, and migrate when a specific capability gap forces the conversation.