
Hardening Container Images: A Defense-in-Depth Approach for Production Workloads


Your container passed the vulnerability scan, yet attackers still compromised your production cluster. The scan found zero CVEs, but the image ran as root, included a shell, and exposed unnecessary capabilities—three attack vectors that scanners routinely miss. This scenario plays out in production environments more often than security teams want to admit.

Container security requires defense in depth: multiple overlapping controls that assume any single layer will fail. A vulnerability scanner catches known CVEs. A minimal base image eliminates attack tools. Non-root execution prevents privilege escalation. Runtime policies enforce security even when CI/CD gates are bypassed. Each layer compensates for gaps in the others.

This guide walks through building hardened container images from the ground up, integrating security gates into your CI/CD pipeline, and enforcing policies at runtime. You’ll leave with a concrete checklist and working examples you can adapt to your own infrastructure.


Why Vulnerability Scanning Alone Fails

Security teams love vulnerability scanners because they produce quantifiable results. “Zero high-severity CVEs” looks great in a compliance report. But this metric creates dangerous false confidence.

CVE databases lag behind real-world exploits by weeks or months. The median time from vulnerability discovery to CVE publication is 35 days. Attackers actively exploit this window. Your scanner reports a clean bill of health while known exploits circulate in the wild.

More fundamentally, scanners focus on known vulnerabilities in known packages. They miss entire categories of security issues:

Configuration weaknesses: A container running as root with all Linux capabilities enabled presents a massive attack surface—but triggers zero CVE alerts. The CAP_SYS_ADMIN capability alone grants near-complete control over the host system.

Supply chain attacks: Scanners check package versions against vulnerability databases. They don’t detect malicious code injected into otherwise legitimate packages. The xz utils backdoor in 2024 demonstrated how sophisticated actors can compromise build infrastructure.

Unnecessary attack surface: Your Python application ships with curl, wget, a shell, and a package manager. An attacker who gains code execution can use these tools to download additional malware, establish persistence, or pivot to other systems. Scanners won’t flag these as vulnerabilities.

Runtime behavior: Static analysis can’t predict how an application will behave. A container with minimal CVEs can still make dangerous system calls, access sensitive file paths, or establish unexpected network connections.

The pattern is consistent: organizations achieve “zero critical CVEs” in their scanning dashboards while leaving fundamental security gaps unaddressed. Scanning is necessary but nowhere near sufficient.

A robust container security posture requires controls at every layer: build-time hardening, CI/CD policy enforcement, and runtime protection. Each layer catches issues the others miss.


Building Minimal Base Images That Attackers Can’t Exploit

The most effective security control is removing attack surface entirely. An attacker can’t exploit a shell that doesn’t exist. They can’t download tools through a package manager that isn’t installed.

Choosing Your Base Image

Three primary options exist for minimal base images:

Base Type | Size | Contents | Best For
scratch | 0 MB | Nothing | Statically compiled Go binaries
distroless | 2-20 MB | Runtime only (libc, SSL certs) | Python, Java, Node.js applications
Alpine | ~5 MB | musl libc, busybox, apk | Apps requiring minimal tooling

Scratch contains literally nothing—no shell, no libc, no SSL certificates. Use it for statically compiled Go binaries that embed all dependencies.

Distroless images from Google include only the runtime your application needs. The Python distroless image contains the Python interpreter and standard library but no shell, package manager, or debugging tools. This is the sweet spot for most production workloads.

Alpine provides a minimal Linux userland with a package manager. It’s useful when you need to install additional packages but want to keep the image small. However, that package manager becomes an attack vector—consider removing apk after installation.

Multi-Stage Builds

Multi-stage builds separate your build environment from your runtime environment. Build tools, compilers, and development dependencies never reach production.

Dockerfile
# Build stage: includes all build dependencies
# Note: gcr.io/distroless/python3-debian12 ships Debian 12's Python 3.11,
# so the builder must use a matching interpreter version
FROM python:3.11-slim AS builder
WORKDIR /app
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc \
        libpq-dev \
    && rm -rf /var/lib/apt/lists/*
# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY src/ ./src/

# Runtime stage: minimal image with only what's needed
FROM gcr.io/distroless/python3-debian12
WORKDIR /app
# Copy installed packages from the builder
COPY --from=builder /opt/venv /opt/venv
# Copy application code
COPY --from=builder /app/src ./src
# The venv's python symlink doesn't survive the copy into distroless,
# so expose its site-packages to the interpreter directly
ENV PYTHONPATH=/opt/venv/lib/python3.11/site-packages
ENV PYTHONUNBUFFERED=1
# Run as non-root (distroless images include a 'nonroot' user, UID 65532)
USER nonroot
# Application entrypoint
ENTRYPOINT ["python3", "-m", "src.main"]

This Dockerfile produces an image with no shell, no package manager, no gcc, and no apt. An attacker who achieves code execution has no tools to work with.

Removing Attack Tools

If you must use a base image with a shell (sometimes necessary for debugging during development), explicitly remove dangerous tools before shipping to production:

Dockerfile.alpine
FROM python:3.12-alpine AS builder
# ... build steps ...

FROM python:3.12-alpine AS production
# Copy application from builder
COPY --from=builder /app /app
# Remove attack surface. Delete busybox and its shell links last: once the
# busybox binary is gone, its applet symlinks (including rm) stop working.
RUN apk del apk-tools && \
    rm -rf /sbin/apk /usr/share/apk /etc/apk /var/cache/apk && \
    rm -f /usr/bin/wget /usr/bin/curl && \
    rm -f /bin/sh /bin/ash /bin/busybox
USER 10001:10001
ENTRYPOINT ["python", "/app/main.py"]

⚠️ Warning: Removing the shell makes debugging extremely difficult. Use this approach only for production images, and maintain a separate debug variant for troubleshooting.


Non-Root Containers and Linux Capability Restrictions

Running containers as root is the default—and the default is dangerous. A root process inside a container can exploit kernel vulnerabilities to escape to the host. Even without an escape, root can modify files owned by other processes, access secrets in environment variables, and interfere with other containers sharing the same node.

Explicit User Configuration

Define a non-root user in your Dockerfile and switch to it before the entrypoint:

Dockerfile.nonroot
FROM node:20-slim
# Create application directory
WORKDIR /app
# Create non-root user with explicit UID/GID
RUN groupadd --gid 10001 appgroup && \
    useradd --uid 10001 --gid appgroup --shell /usr/sbin/nologin appuser
# Install dependencies as root (needs write access to node_modules)
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force
# Copy application code
COPY --chown=appuser:appgroup src/ ./src/
# Switch to non-root user
USER 10001:10001
# Expose non-privileged port
EXPOSE 8080
CMD ["node", "src/server.js"]

Use numeric UIDs rather than usernames. Some base images don’t include /etc/passwd, causing username lookups to fail, and Kubernetes cannot verify a runAsNonRoot requirement when the image specifies a username instead of a UID.
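
A quick sanity check is to inspect the user baked into the image; myregistry/secure-app is a placeholder for your own image name:

docker inspect --format '{{.Config.User}}' myregistry/secure-app:v1.2.3
# Expected output: 10001:10001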

Dropping Linux Capabilities

Linux capabilities divide root privileges into distinct units. By default, Docker grants containers a subset of capabilities including CAP_CHOWN, CAP_SETUID, and CAP_NET_BIND_SERVICE. Your application probably needs none of these.
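
With plain Docker you can apply the same principle at run time. As a sketch (image name illustrative): drop everything, then add back only what the application demonstrably needs:

# Drop every capability
docker run --rm --cap-drop=ALL --user 10001:10001 myregistry/secure-app:v1.2.3
# If the app must bind a port below 1024, grant exactly that one capability
docker run --rm --cap-drop=ALL --cap-add=NET_BIND_SERVICE myregistry/secure-app:v1.2.3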

In Kubernetes, specify security context at both pod and container levels:

deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: myregistry/secure-app:v1.2.3
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        resources:
          limits:
            memory: "256Mi"
            cpu: "500m"
          requests:
            memory: "128Mi"
            cpu: "100m"
        ports:
        - containerPort: 8080
          protocol: TCP
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/.cache
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir: {}

Key security settings in this manifest:

  • runAsNonRoot: true prevents the container from running as UID 0, even if the Dockerfile specifies root
  • allowPrivilegeEscalation: false blocks setuid binaries and ptrace-based attacks
  • readOnlyRootFilesystem: true prevents attackers from writing malicious scripts
  • capabilities.drop: [ALL] removes every Linux capability
  • seccompProfile.type: RuntimeDefault applies the container runtime’s default seccomp profile, blocking dangerous system calls

📝 Note: The readOnlyRootFilesystem setting requires mounting writable volumes for directories your application writes to—typically /tmp and application-specific cache directories.

Custom Seccomp Profiles

The default seccomp profile blocks approximately 44 dangerous system calls. For highly sensitive workloads, create a custom profile that allows only the specific syscalls your application uses:

seccomp-profile.json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": [
        "read", "write", "close", "fstat", "mmap",
        "mprotect", "munmap", "brk", "rt_sigaction",
        "rt_sigprocmask", "ioctl", "access", "pipe",
        "select", "sched_yield", "dup2", "getpid",
        "socket", "connect", "accept", "sendto",
        "recvfrom", "bind", "listen", "getsockname",
        "getpeername", "setsockopt", "getsockopt",
        "clone", "execve", "exit", "wait4", "fcntl",
        "getdents64", "getcwd", "chdir", "openat",
        "newfstatat", "futex", "epoll_create1",
        "epoll_ctl", "epoll_wait", "exit_group"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

Generate application-specific profiles using tools like strace or Falco to trace syscalls during normal operation.
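
As a rough sketch, assuming the Python entrypoint from earlier, you can collect the syscall names strace observes during a representative run and use them to seed the allowlist:

# Trace the app (and child processes via -f) through a representative workload
strace -f -o trace.log python3 -m src.main
# With -f each line begins with a PID, so the syscall name is in field 2
awk '{print $2}' trace.log | cut -d'(' -f1 | grep -E '^[a-z0-9_]+$' | sort -u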


Automating Security Gates in Your CI/CD Pipeline

Manual security reviews don’t scale. By the time a human reviews a configuration, the code has already shipped. Automated gates catch issues at the point of commit—before they reach production.

Vulnerability Scanning with Trivy

Trivy scans container images, filesystems, and Git repositories for vulnerabilities and misconfigurations. Integrate it into GitHub Actions to fail builds that introduce high-severity CVEs:

.github/workflows/security-scan.yaml
name: Container Security Scan

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build image for scanning
        uses: docker/build-push-action@v5
        with:
          context: .
          load: true
          tags: ${{ env.IMAGE_NAME }}:scan
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.IMAGE_NAME }}:scan
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'
      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'
      - name: Run Trivy config scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'config'
          scan-ref: '.'
          exit-code: '1'
          severity: 'CRITICAL,HIGH'

This workflow scans both the built image for CVEs and the repository for misconfigurations (insecure Dockerfiles, exposed secrets in Kubernetes manifests).
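
You can run the same checks locally before pushing; substitute your own image tag for the scan tag built above:

# Fail (exit code 1) when HIGH or CRITICAL vulnerabilities are found
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:scan
# Scan Dockerfiles and manifests in the repository for misconfigurations
trivy config --severity HIGH,CRITICAL --exit-code 1 .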

Policy Enforcement with OPA Conftest

Vulnerability scanners check for known CVEs. Policy engines enforce organizational standards: no root containers, required resource limits, approved base images.

.github/workflows/policy-check.yaml
policy-check:
  runs-on: ubuntu-latest
  steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Setup Conftest
      uses: instrumenta/conftest-action@master
      with:
        version: '0.46.0'
    - name: Test Dockerfile policies
      run: conftest test Dockerfile --policy policies/
    - name: Test Kubernetes manifest policies
      run: conftest test k8s/ --policy policies/

Define policies in Rego, OPA’s policy language:

policies/container.rego
package main

# Deny containers running as root
deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not container.securityContext.runAsNonRoot
  msg := sprintf("Container '%s' must set runAsNonRoot: true", [container.name])
}

# Deny containers without resource limits
deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not container.resources.limits
  msg := sprintf("Container '%s' must specify resource limits", [container.name])
}

# Deny containers with privileged security context
deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  container.securityContext.privileged == true
  msg := sprintf("Container '%s' must not run as privileged", [container.name])
}

# Require approved base images. Conftest's Dockerfile parser exposes each
# instruction as an element with Cmd and Value fields, so match FROM lines
# directly. (Named build stages may need an explicit allowlist.)
deny[msg] {
  input[i].Cmd == "from"
  image := input[i].Value[0]
  not startswith(image, "gcr.io/distroless/")
  not startswith(image, "cgr.dev/chainguard/")
  msg := sprintf("Base image '%s' must be from distroless or chainguard", [image])
}

Image Signing with Cosign

Supply chain attacks target the path between your build system and production. An attacker who compromises your registry can replace legitimate images with malicious ones. Cryptographic signing ensures images haven’t been tampered with.

.github/workflows/sign-image.yaml
sign-and-push:
  needs: [build-and-scan, policy-check]
  runs-on: ubuntu-latest
  permissions:
    contents: read
    packages: write
    id-token: write  # Required for keyless signing
  steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Install Cosign
      uses: sigstore/cosign-installer@v3
    - name: Log in to registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    - name: Build and push image
      id: build
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
        tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
    - name: Sign image with Cosign
      env:
        COSIGN_EXPERIMENTAL: "true"
      run: |
        cosign sign --yes \
          ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build.outputs.digest }}

Configure your Kubernetes cluster to verify signatures before pulling images using a policy engine like Kyverno or Sigstore’s policy controller.
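
Before wiring up admission control, you can verify a signature manually. The identity flags below assume keyless signing from a GitHub Actions workflow in a hypothetical myorg/myrepo repository; adjust both to your setup:

cosign verify \
  --certificate-identity-regexp 'https://github.com/myorg/myrepo/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  ghcr.io/myorg/myrepo:v1.2.3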


Runtime Security with Pod Security Standards

CI/CD gates prevent insecure configurations from reaching production—in theory. In practice, engineers bypass pipelines during incidents, deployments fail silently, and manual kubectl apply commands circumvent all automation. Runtime enforcement catches what CI/CD misses.

Pod Security Admission

Kubernetes 1.25+ includes Pod Security Admission (PSA), a built-in admission controller that enforces Pod Security Standards. Three profiles provide increasing levels of restriction:

Profile | Purpose | Key Restrictions
Privileged | Unrestricted | None (allows everything)
Baseline | Minimal restrictions | Blocks hostNetwork, hostPID, privileged containers
Restricted | Hardened | Requires non-root, drops capabilities, blocks privilege escalation

Apply PSA at the namespace level using labels:

namespace-restricted.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Enforce restricted profile - reject non-compliant pods
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Warn on restricted-profile violations (useful during migration)
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    # Audit all violations to cluster logs
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest

The restricted profile requires:

  • Running as non-root (runAsNonRoot: true)
  • Dropping all capabilities (only NET_BIND_SERVICE can be added back)
  • No privilege escalation (allowPrivilegeEscalation: false)
  • A seccomp profile of RuntimeDefault or Localhost
  • Restricted volume types (hostPath mounts are rejected)

Note that a read-only root filesystem is not part of the standard; it remains a recommended setting you enforce through your manifests and policies.

💡 Pro Tip: Start with warn mode to identify non-compliant workloads, then switch to enforce after fixing violations. The audit logs show exactly which pods would be rejected.
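
You can also ask the API server to evaluate existing workloads against a stricter profile without changing anything, using a server-side dry run of the namespace label:

# Report which running pods would violate 'restricted', without enforcing it
kubectl label --dry-run=server --overwrite ns production \
  pod-security.kubernetes.io/enforce=restricted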

Handling Exceptions

Some workloads legitimately require elevated privileges—CNI plugins, log collectors, security agents. Create dedicated namespaces with appropriate PSA profiles rather than weakening security cluster-wide:

system-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: kube-system-extensions
  labels:
    # Baseline allows necessary privileges while blocking the worst offenders
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted

Document why each exception exists. Review exceptions quarterly—system components often reduce their privilege requirements in newer versions.


Network Policies and Secrets Management for Container Isolation

A compromised container is contained only if network segmentation limits its blast radius. Default Kubernetes networking allows any pod to communicate with any other pod—an attacker who compromises one service can probe the entire cluster.

Default-Deny Network Policies

Start with a default-deny policy that blocks all ingress and egress, then explicitly whitelist required connections:

network-policies.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}  # Applies to all pods in the namespace
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
  - Ingress
  - Egress
  ingress:
  # Allow traffic from the ingress controller; the kubernetes.io/metadata.name
  # label is set automatically on every namespace since Kubernetes 1.22
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
      podSelector:
        matchLabels:
          app: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
  egress:
  # Allow connections to PostgreSQL
  - to:
    - podSelector:
        matchLabels:
          app: postgresql
    ports:
    - protocol: TCP
      port: 5432
  # Allow DNS resolution (TCP as well, for responses too large for UDP)
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
⚠️ Warning: Forgetting to allow DNS (port 53 to kube-dns) is the most common network policy mistake. Your pods will fail to resolve any hostnames, including internal service names.
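
You can see this failure mode for yourself with a throwaway pod. Under the default-deny policy above, the lookup times out until an egress rule covering the pod allows DNS:

kubectl run dns-test -n production --rm -it --restart=Never \
  --image=busybox:1.36 -- nslookup kubernetes.default.svc.cluster.local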

Secrets Management

Kubernetes secrets stored in etcd are base64-encoded, not encrypted, unless you configure encryption at rest. Anyone with API access can read them. Environment variables appear in process listings and crash dumps. Production secrets require better protection.

Mount secrets as files instead of environment variables:

pod-with-secrets.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-secrets
spec:
  containers:
  - name: app
    image: myregistry/app:v1.0.0
    volumeMounts:
    - name: db-credentials
      mountPath: /secrets/db
      readOnly: true
    - name: api-keys
      mountPath: /secrets/api
      readOnly: true
  volumes:
  - name: db-credentials
    secret:
      secretName: database-credentials
      defaultMode: 0400  # Read-only for owner
  - name: api-keys
    secret:
      secretName: external-api-keys
      defaultMode: 0400

Use external secrets operators to sync secrets from dedicated secret managers:

external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
    - secretKey: username
      remoteRef:
        key: production/database
        property: username
    - secretKey: password
      remoteRef:
        key: production/database
        property: password

This approach keeps secrets out of your Git repository, etcd, and Kubernetes API—they’re fetched directly from AWS Secrets Manager (or Vault, GCP Secret Manager, etc.) at runtime.
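
Once the operator and a ClusterSecretStore are configured, you can confirm the sync with kubectl; these commands assume the external-secrets CRDs are installed:

# The ExternalSecret's status reports whether the last fetch succeeded
kubectl get externalsecret database-credentials -n production
# The operator materializes an ordinary Kubernetes Secret as the target
kubectl get secret database-credentials -n production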


Putting It All Together: A Security Checklist for Production

Security is an ongoing process, not a one-time implementation. Use this checklist before every production deployment and during quarterly security reviews.

Pre-Deployment Checklist

Build Phase:

  • Base image is distroless, chainguard, or scratch
  • Multi-stage build separates build and runtime dependencies
  • No shells, package managers, or debugging tools in final image
  • Dockerfile includes explicit USER directive with numeric UID
  • Image referenced by immutable digest, not a mutable tag like latest (see the digest lookup below)
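
To pin by digest, resolve the tag once and reference the digest in your manifests; ghcr.io/myorg/app is a placeholder:

# Resolve a mutable tag to its immutable digest
docker pull ghcr.io/myorg/app:v1.2.3
docker inspect --format '{{index .RepoDigests 0}}' ghcr.io/myorg/app:v1.2.3
# -> ghcr.io/myorg/app@sha256:...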

Scan Phase:

  • Trivy or Grype scan passes with zero high/critical CVEs
  • OPA Conftest validates Dockerfile and Kubernetes manifests
  • SBOM generated and stored for audit trail

Sign Phase:

  • Image signed with Cosign using keyless signing
  • Signature verification configured in cluster admission policy

Deploy Phase:

  • Pod security context specifies runAsNonRoot, drops all capabilities
  • Read-only root filesystem enabled
  • Resource limits defined for CPU and memory
  • Network policies restrict ingress and egress
  • Secrets mounted as files from external secrets operator

Monitoring and Alerting

Security controls only work if you know when they’re violated:

  • Audit logs: Kubernetes audit logs capture all API requests. Alert on pod creations that trigger PSA warnings.
  • Runtime detection: Tools like Falco detect anomalous behavior—shells spawned in containers, unexpected network connections, sensitive file access.
  • Image drift: Alert when running images don’t match signed manifests in your deployment repository.

Incident Response Considerations

When a container is compromised:

  1. Isolate immediately: Apply a network policy that blocks all traffic to and from the affected pod (see the quarantine sketch after this list)
  2. Preserve evidence: Don’t delete the pod—snapshot its filesystem and memory for forensics
  3. Check lateral movement: Review network flow logs for connections to other pods
  4. Rotate secrets: Assume any secret mounted in the container is compromised
  5. Rebuild from source: Don’t trust the running image—rebuild and redeploy from your verified source
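
A minimal sketch of step 1, assuming the compromised pod can be labeled in place (pod name illustrative):

# Tag only the affected pod; labeling does not restart it
kubectl label pod compromised-pod-abc123 -n production quarantine=true
# Apply a policy that cuts all ingress and egress for quarantined pods
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine
  namespace: production
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
  - Ingress
  - Egress
EOF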

Key Takeaways

  • Start every Dockerfile FROM a distroless or scratch base and use multi-stage builds to exclude build tools
  • Add USER directives with explicit non-root UIDs and drop all Linux capabilities except those explicitly required
  • Implement automated security gates using Trivy, OPA Conftest, and Cosign in your CI/CD pipeline
  • Enable Kubernetes Pod Security Admission with the ‘restricted’ profile as your default enforcement level
  • Apply default-deny NetworkPolicies to every namespace and whitelist only required pod-to-pod communication
