Hardening Container Images: A Defense-in-Depth Approach for Production Workloads
Your container passed the vulnerability scan, yet attackers still compromised your production cluster. The scan found zero CVEs, but the image ran as root, included a shell, and exposed unnecessary capabilities—three attack vectors that scanners routinely miss. This scenario plays out in production environments more often than security teams want to admit.
Container security requires defense in depth: multiple overlapping controls that assume any single layer will fail. A vulnerability scanner catches known CVEs. A minimal base image eliminates attack tools. Non-root execution prevents privilege escalation. Runtime policies enforce security even when CI/CD gates are bypassed. Each layer compensates for gaps in the others.
This guide walks through building hardened container images from the ground up, integrating security gates into your CI/CD pipeline, and enforcing policies at runtime. You’ll leave with a concrete checklist and working examples you can adapt to your own infrastructure.
Why Vulnerability Scanning Alone Fails
Security teams love vulnerability scanners because they produce quantifiable results. “Zero high-severity CVEs” looks great in a compliance report. But this metric creates dangerous false confidence.
CVE databases lag behind real-world exploits by weeks or months. The median time from vulnerability discovery to CVE publication is 35 days. Attackers actively exploit this window. Your scanner reports a clean bill of health while known exploits circulate in the wild.
More fundamentally, scanners focus on known vulnerabilities in known packages. They miss entire categories of security issues:
Configuration weaknesses: A container running as root with all Linux capabilities enabled presents a massive attack surface—but triggers zero CVE alerts. The CAP_SYS_ADMIN capability alone grants near-complete control over the host system.
Supply chain attacks: Scanners check package versions against vulnerability databases. They don’t detect malicious code injected into otherwise legitimate packages. The xz utils backdoor in 2024 demonstrated how sophisticated actors can compromise build infrastructure.
Unnecessary attack surface: Your Python application ships with curl, wget, a shell, and a package manager. An attacker who gains code execution can use these tools to download additional malware, establish persistence, or pivot to other systems. Scanners won’t flag these as vulnerabilities.
Runtime behavior: Static analysis can’t predict how an application will behave. A container with minimal CVEs can still make dangerous system calls, access sensitive file paths, or establish unexpected network connections.
The pattern is consistent: organizations achieve “zero critical CVEs” in their scanning dashboards while leaving fundamental security gaps unaddressed. Scanning is necessary but nowhere near sufficient.
A robust container security posture requires controls at every layer: build-time hardening, CI/CD policy enforcement, and runtime protection. Each layer catches issues the others miss.
Building Minimal Base Images That Attackers Can’t Exploit
The most effective security control is removing attack surface entirely. An attacker can’t exploit a shell that doesn’t exist. They can’t download tools through a package manager that isn’t installed.
Choosing Your Base Image
Three primary options exist for minimal base images:
| Base Type | Size | Contents | Best For |
|---|---|---|---|
| scratch | 0 MB | Nothing | Statically compiled Go binaries |
| distroless | 2-20 MB | Runtime only (libc, SSL certs) | Python, Java, Node.js applications |
| Alpine | 5 MB | musl libc, busybox, apk | Apps requiring minimal tooling |
Scratch contains literally nothing—no shell, no libc, no SSL certificates. Use it for statically compiled Go binaries that embed all dependencies.
Distroless images from Google include only the runtime your application needs. The Python distroless image contains the Python interpreter and standard library but no shell, package manager, or debugging tools. This is the sweet spot for most production workloads.
Alpine provides a minimal Linux userland with a package manager. It’s useful when you need to install additional packages but want to keep the image small. However, that package manager becomes an attack vector—consider removing apk after installation.
Multi-Stage Builds
Multi-stage builds separate your build environment from your runtime environment. Build tools, compilers, and development dependencies never reach production.
```dockerfile
# Build stage: includes all build dependencies
FROM python:3.12-slim AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY src/ ./src/

# Runtime stage: minimal image with only what's needed
FROM gcr.io/distroless/python3-debian12

WORKDIR /app

# Copy virtual environment from builder
COPY --from=builder /opt/venv /opt/venv

# Copy application code
COPY --from=builder /app/src ./src

# Set environment
ENV PATH="/opt/venv/bin:$PATH"
ENV PYTHONUNBUFFERED=1

# Run as non-root user (distroless default is nonroot)
USER nonroot

# Application entrypoint
ENTRYPOINT ["python", "-m", "src.main"]
```

This Dockerfile produces an image with no shell, no package manager, no gcc, and no apt. An attacker who achieves code execution has no tools to work with.
Removing Attack Tools
If you must use a base image with a shell (sometimes necessary for debugging during development), explicitly remove dangerous tools before shipping to production:
```dockerfile
FROM python:3.12-alpine AS builder
# ... build steps ...

FROM python:3.12-alpine AS production

# Copy application from builder
COPY --from=builder /app /app

# Remove attack surface
RUN apk del apk-tools && \
    rm -rf /sbin/apk /usr/share/apk /etc/apk /var/cache/apk && \
    rm -f /bin/sh /bin/ash /bin/busybox && \
    rm -f /usr/bin/wget /usr/bin/curl

USER 10001:10001

ENTRYPOINT ["python", "/app/main.py"]
```

⚠️ Warning: Removing the shell makes debugging extremely difficult. Use this approach only for production images, and maintain a separate debug variant for troubleshooting.
Non-Root Containers and Linux Capability Restrictions
Running containers as root is the default—and the default is dangerous. A root process inside a container can exploit kernel vulnerabilities to escape to the host. Even without an escape, root can modify files owned by other processes, access secrets in environment variables, and interfere with other containers sharing the same node.
Explicit User Configuration
Define a non-root user in your Dockerfile and switch to it before the entrypoint:
```dockerfile
FROM node:20-slim

# Create application directory
WORKDIR /app

# Create non-root user with explicit UID/GID
RUN groupadd --gid 10001 appgroup && \
    useradd --uid 10001 --gid appgroup --shell /usr/sbin/nologin appuser

# Install dependencies as root (needs write access to node_modules)
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force

# Copy application code
COPY --chown=appuser:appgroup src/ ./src/

# Switch to non-root user
USER 10001:10001

# Expose non-privileged port
EXPOSE 8080

CMD ["node", "src/server.js"]
```

Use numeric UIDs rather than usernames. Some base images don't include /etc/passwd, causing username lookups to fail, and Kubernetes' `runAsNonRoot` check can only verify a numeric UID, not a username.
Dropping Linux Capabilities
Linux capabilities divide root privileges into distinct units. By default, Docker grants containers a subset of capabilities including CAP_CHOWN, CAP_SETUID, and CAP_NET_BIND_SERVICE. Your application probably needs none of these.
In Kubernetes, specify security context at both pod and container levels:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: myregistry/secure-app:v1.2.3
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          resources:
            limits:
              memory: "256Mi"
              cpu: "500m"
            requests:
              memory: "128Mi"
              cpu: "100m"
          ports:
            - containerPort: 8080
              protocol: TCP
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /app/.cache
      volumes:
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir: {}
```

Key security settings in this manifest:

- `runAsNonRoot: true` prevents the container from running as UID 0, even if the Dockerfile specifies root
- `allowPrivilegeEscalation: false` blocks setuid binaries and ptrace-based attacks
- `readOnlyRootFilesystem: true` prevents attackers from writing malicious scripts
- `capabilities.drop: [ALL]` removes every Linux capability
- `seccompProfile.type: RuntimeDefault` applies the container runtime's default seccomp profile, blocking dangerous system calls
📝 Note: The `readOnlyRootFilesystem` setting requires mounting writable volumes for directories your application writes to—typically /tmp and application-specific cache directories.
Custom Seccomp Profiles
The default seccomp profile blocks approximately 44 dangerous system calls. For highly sensitive workloads, create a custom profile that allows only the specific syscalls your application uses:
{ "defaultAction": "SCMP_ACT_ERRNO", "architectures": ["SCMP_ARCH_X86_64"], "syscalls": [ { "names": [ "read", "write", "close", "fstat", "mmap", "mprotect", "munmap", "brk", "rt_sigaction", "rt_sigprocmask", "ioctl", "access", "pipe", "select", "sched_yield", "dup2", "getpid", "socket", "connect", "accept", "sendto", "recvfrom", "bind", "listen", "getsockname", "getpeername", "setsockopt", "getsockopt", "clone", "execve", "exit", "wait4", "fcntl", "getdents64", "getcwd", "chdir", "openat", "newfstatat", "futex", "epoll_create1", "epoll_ctl", "epoll_wait", "exit_group" ], "action": "SCMP_ACT_ALLOW" } ]}Generate application-specific profiles using tools like strace or Falco to trace syscalls during normal operation.
Automating Security Gates in Your CI/CD Pipeline
Manual security reviews don’t scale. By the time a human reviews a configuration, the code has already shipped. Automated gates catch issues at the point of commit—before they reach production.
Vulnerability Scanning with Trivy
Trivy scans container images, filesystems, and Git repositories for vulnerabilities and misconfigurations. Integrate it into GitHub Actions to fail builds that introduce high-severity CVEs:
```yaml
name: Container Security Scan

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build image for scanning
        uses: docker/build-push-action@v5
        with:
          context: .
          load: true
          tags: ${{ env.IMAGE_NAME }}:scan
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.IMAGE_NAME }}:scan
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'

      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'

      - name: Run Trivy config scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'config'
          scan-ref: '.'
          exit-code: '1'
          severity: 'CRITICAL,HIGH'
```

This workflow scans both the built image for CVEs and the repository for misconfigurations (insecure Dockerfiles, exposed secrets in Kubernetes manifests).
Policy Enforcement with OPA Conftest
Vulnerability scanners check for known CVEs. Policy engines enforce organizational standards: no root containers, required resource limits, approved base images.
```yaml
policy-check:
  runs-on: ubuntu-latest
  steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    - name: Setup Conftest
      uses: instrumenta/conftest-action@master
      with:
        version: '0.46.0'

    - name: Test Dockerfile policies
      run: conftest test Dockerfile --policy policies/

    - name: Test Kubernetes manifest policies
      run: conftest test k8s/ --policy policies/
```

Define policies in Rego, OPA's policy language:
```rego
package main

# Deny containers running as root
deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not container.securityContext.runAsNonRoot
  msg := sprintf("Container '%s' must set runAsNonRoot: true", [container.name])
}

# Deny containers without resource limits
deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not container.resources.limits
  msg := sprintf("Container '%s' must specify resource limits", [container.name])
}

# Deny containers with privileged security context
deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  container.securityContext.privileged == true
  msg := sprintf("Container '%s' must not run as privileged", [container.name])
}

# Require approved base images
deny[msg] {
  input.kind == "Dockerfile"
  not startswith(input.stages[0].from, "gcr.io/distroless/")
  not startswith(input.stages[0].from, "cgr.dev/chainguard/")
  msg := "Base image must be from distroless or chainguard"
}
```

Image Signing with Cosign
Supply chain attacks target the path between your build system and production. An attacker who compromises your registry can replace legitimate images with malicious ones. Cryptographic signing ensures images haven’t been tampered with.
```yaml
sign-and-push:
  needs: [build-and-scan, policy-check]
  runs-on: ubuntu-latest
  permissions:
    contents: read
    packages: write
    id-token: write  # Required for keyless signing

  steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    - name: Install Cosign
      uses: sigstore/cosign-installer@v3

    - name: Log in to registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Build and push image
      id: build
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
        tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}

    - name: Sign image with Cosign
      env:
        COSIGN_EXPERIMENTAL: "true"
      run: |
        cosign sign --yes \
          ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build.outputs.digest }}
```

Configure your Kubernetes cluster to verify signatures before pulling images using a policy engine like Kyverno or Sigstore's policy controller.
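For example, with Kyverno installed, a ClusterPolicy along these lines can require that images from your registry carry a valid keyless signature before pods are admitted. This is a sketch: the registry pattern, issuer, and subject below are placeholders you would replace with your own registry and the CI identity that signs your images.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: require-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "ghcr.io/your-org/*"   # placeholder: limit verification to your registry
          attestors:
            - entries:
                - keyless:
                    # Placeholder identity: the GitHub Actions workflow that signed the image
                    issuer: "https://token.actions.githubusercontent.com"
                    subject: "https://github.com/your-org/*"
```

With this policy in Enforce mode, any unsigned or tampered image matching the reference pattern is rejected at admission, so a compromised registry alone is no longer enough to run malicious code.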
Runtime Security with Pod Security Standards
CI/CD gates prevent insecure configurations from reaching production—in theory. In practice, engineers bypass pipelines during incidents, deployments fail silently, and manual kubectl apply commands circumvent all automation. Runtime enforcement catches what CI/CD misses.
Pod Security Admission
Kubernetes 1.25+ includes Pod Security Admission (PSA), a built-in admission controller that enforces Pod Security Standards. Three profiles provide increasing levels of restriction:
| Profile | Purpose | Key Restrictions |
|---|---|---|
| Privileged | Unrestricted | None—allows everything |
| Baseline | Minimal restrictions | Blocks hostNetwork, hostPID, privileged containers |
| Restricted | Hardened | Requires non-root, drops capabilities, blocks privilege escalation |
Apply PSA at the namespace level using labels:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Enforce restricted profile - reject non-compliant pods
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest

    # Warn on restricted-profile violations (useful during migration)
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest

    # Audit all violations to cluster logs
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
```

The restricted profile requires:
- Running as non-root
- Dropping all capabilities (only `NET_BIND_SERVICE` can be added back)
- Restricting volumes to an approved set of types (configMap, secret, emptyDir, projected, and a few others)
- No privilege escalation
- Seccomp profile set to RuntimeDefault or Localhost
💡 Pro Tip: Start with `warn` mode to identify non-compliant workloads, then switch to `enforce` after fixing violations. The audit logs show exactly which pods would be rejected.
Handling Exceptions
Some workloads legitimately require elevated privileges—CNI plugins, log collectors, security agents. Create dedicated namespaces with appropriate PSA profiles rather than weakening security cluster-wide:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: kube-system-extensions
  labels:
    # Baseline allows necessary privileges while blocking the worst offenders
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
```

Document why each exception exists. Review exceptions quarterly—system components often reduce their privilege requirements in newer versions.
Network Policies and Secrets Management for Container Isolation
A compromised container is contained only if network segmentation limits its blast radius. Default Kubernetes networking allows any pod to communicate with any other pod—an attacker who compromises one service can probe the entire cluster.
Default-Deny Network Policies
Start with a default-deny policy that blocks all ingress and egress, then explicitly whitelist required connections:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}  # Applies to all pods in namespace
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow traffic from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
          podSelector:
            matchLabels:
              app: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Allow connections to PostgreSQL
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432
    # Allow DNS resolution
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
```

⚠️ Warning: Forgetting to allow DNS (port 53 to kube-dns) is the most common network policy mistake. Your pods will fail to resolve any hostnames, including internal service names.
Secrets Management
Kubernetes Secrets stored in etcd are base64-encoded, not encrypted, unless you enable encryption at rest. Anyone with API access to the namespace can read them. Environment variables appear in process listings and crash dumps. Production secrets require better protection.
Mount secrets as files instead of environment variables:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-secrets
spec:
  containers:
    - name: app
      image: myregistry/app:v1.0.0
      volumeMounts:
        - name: db-credentials
          mountPath: /secrets/db
          readOnly: true
        - name: api-keys
          mountPath: /secrets/api
          readOnly: true
  volumes:
    - name: db-credentials
      secret:
        secretName: database-credentials
        defaultMode: 0400  # Read-only for owner
    - name: api-keys
      secret:
        secretName: external-api-keys
        defaultMode: 0400
```

Use external secrets operators to sync secrets from dedicated secret managers:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
    - secretKey: username
      remoteRef:
        key: production/database
        property: username
    - secretKey: password
      remoteRef:
        key: production/database
        property: password
```

This approach keeps secret values out of your Git repository and centralizes rotation—the operator fetches them from AWS Secrets Manager (or Vault, GCP Secret Manager, etc.) at runtime and syncs them into the target Kubernetes Secret your pods mount.
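The `secretStoreRef` above points at a ClusterSecretStore named `aws-secrets-manager`, which you define separately. A minimal sketch, assuming the External Secrets Operator authenticates to AWS through a service account (for example via IRSA), might look like this—the region and service-account names are placeholders:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1            # placeholder region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets       # placeholder service account
            namespace: external-secrets  # namespace where the operator runs
```

Because the store is cluster-scoped, individual teams can declare ExternalSecrets in their own namespaces without ever handling cloud credentials directly.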
Putting It All Together: A Security Checklist for Production
Security is an ongoing process, not a one-time implementation. Use this checklist before every production deployment and during quarterly security reviews.
Pre-Deployment Checklist
Build Phase:
- Base image is distroless, chainguard, or scratch
- Multi-stage build separates build and runtime dependencies
- No shells, package managers, or debugging tools in final image
- Dockerfile includes explicit USER directive with numeric UID
- Image tagged with immutable digest, not `latest`
Scan Phase:
- Trivy or Grype scan passes with zero high/critical CVEs
- OPA Conftest validates Dockerfile and Kubernetes manifests
- SBOM generated and stored for audit trail
Sign Phase:
- Image signed with Cosign using keyless signing
- Signature verification configured in cluster admission policy
Deploy Phase:
- Pod security context specifies runAsNonRoot, drops all capabilities
- Read-only root filesystem enabled
- Resource limits defined for CPU and memory
- Network policies restrict ingress and egress
- Secrets mounted as files from external secrets operator
Monitoring and Alerting
Security controls only work if you know when they’re violated:
- Audit logs: Kubernetes audit logs capture all API requests. Alert on pod creations that trigger PSA warnings.
- Runtime detection: Tools like Falco detect anomalous behavior—shells spawned in containers, unexpected network connections, sensitive file access (see the example rule after this list).
- Image drift: Alert when running images don’t match signed manifests in your deployment repository.
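As a sketch of that kind of runtime detection, the Falco rule below fires whenever a shell binary starts inside a container—something a properly hardened image should make impossible. Falco's bundled ruleset already ships comparable detections; the rule name and tags here are illustrative:

```yaml
- rule: Shell Spawned in Hardened Container
  desc: A shell process started inside a container, which hardened images should never allow
  condition: >
    spawned_process and container
    and proc.name in (sh, bash, ash, dash, zsh)
  output: >
    Shell spawned in container (user=%user.name container=%container.name
    image=%container.image.repository command=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```

Route these alerts to the same on-call channel as your availability alerts; a shell inside a distroless image is a strong indicator of compromise, not noise.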
Incident Response Considerations
When a container is compromised:
- Isolate immediately: Apply a network policy that blocks all egress from the affected pod (see the sketch after this list)
- Preserve evidence: Don’t delete the pod—snapshot its filesystem and memory for forensics
- Check lateral movement: Review network flow logs for connections to other pods
- Rotate secrets: Assume any secret mounted in the container is compromised
- Rebuild from source: Don’t trust the running image—rebuild and redeploy from your verified source
Key Takeaways
- Start every Dockerfile FROM a distroless or scratch base and use multi-stage builds to exclude build tools
- Add USER directives with explicit non-root UIDs and drop all Linux capabilities except those explicitly required
- Implement automated security gates using Trivy, OPA Conftest, and Cosign in your CI/CD pipeline
- Enable Kubernetes Pod Security Admission with the ‘restricted’ profile as your default enforcement level
- Apply default-deny NetworkPolicies to every namespace and whitelist only required pod-to-pod communication