Building Production-Ready Ingress Controllers: From Basic Routing to TLS and Rate Limiting
Your Kubernetes services are running perfectly in the cluster, but external traffic still hits them through a mess of LoadBalancer services—each with its own IP, no SSL termination, and cloud costs piling up. Every new service means another $20-30/month for a load balancer that does nothing but forward TCP traffic. Your certificate management is scattered across different services, renewals happen manually, and you’re debugging TLS issues by SSH-ing into individual pods. Meanwhile, your monitoring dashboard shows a dozen different ingress points with no unified view of what traffic is actually hitting your cluster.
You know Ingress controllers are the answer, but here’s where most guides fail you: they show you how to route foo.example.com to service A and bar.example.com to service B, declare victory, and move on. That works fine in development. In production, you need TLS certificates that auto-renew, rate limiting to protect your APIs from abuse, request metrics flowing into Prometheus, and the ability to debug traffic issues without grep-ing through pod logs. The gap between “basic routing works” and “this is production-ready” is where teams get stuck.
The good news: modern Ingress controllers like nginx-ingress and Traefik handle all of this out of the box—if you configure them correctly from the start. The patterns for TLS termination, rate limiting, and observability aren’t complex, but they’re rarely documented together. You don’t need to bolt these features on later as your traffic grows; you can build them into your initial setup and deploy with confidence.
Let’s start with why consolidating your external traffic through a single Ingress controller isn’t just cleaner architecture—it’s a fundamental cost and operational advantage over the LoadBalancer-per-service approach.
Why Ingress Controllers Beat Multiple LoadBalancers
Every Kubernetes service exposed via type: LoadBalancer provisions a dedicated cloud load balancer—an L4 device that costs $15-30/month on AWS, GCP, or Azure. For a microservices architecture with 10 externally-facing services, you’re spending $150-300 monthly on load balancers alone before any traffic flows. An Ingress controller consolidates these services behind a single load balancer endpoint, reducing infrastructure costs by 70-90% while enabling capabilities that L4 load balancers cannot provide.

Centralized TLS Management
LoadBalancer services terminate connections at Layer 4, forcing you to handle TLS termination inside each application pod. This scatters certificate management across deployments, creating renewal nightmares and inconsistent cipher configurations. Ingress controllers terminate TLS at the edge, centralizing certificate storage in Kubernetes Secrets and enabling automated renewal through cert-manager. A single tls block in your Ingress resource provisions certificates for any domain, with automatic rotation and OCSP stapling handled at the infrastructure layer.
Layer 7 Routing Intelligence
LoadBalancer services operate at the TCP/UDP layer and route based solely on IP and port. They cannot inspect HTTP headers, paths, or cookies. An Ingress controller operates at Layer 7, parsing HTTP requests and routing api.example.com/v1/users to your users service while api.example.com/v1/orders hits your orders service—all through the same IP address. This host-based and path-based routing eliminates the need for multiple DNS records and load balancers, consolidating external endpoints into a single, manageable surface.
Unified Observability
With LoadBalancer services, access logs and traffic metrics scatter across cloud provider consoles and application logs. Ingress controllers emit standardized metrics (request rates, latency percentiles, status code distributions) in Prometheus format, with structured access logs containing headers, user agents, and upstream response times. This centralized telemetry layer provides complete visibility into north-south traffic patterns without instrumenting individual services.
💡 Pro Tip: Compare your current monthly cloud load balancer spend against a single Application Load Balancer ($16/month on AWS) plus NGINX Ingress controller overhead (0.1-0.5 CPU cores). The cost savings typically justify Ingress adoption within the first month.
The operational simplification extends beyond cost. Instead of managing firewall rules, health checks, and SSL policies across dozens of load balancers, you configure Ingress resources declaratively using Kubernetes-native manifests. This shift from infrastructure-as-configuration to infrastructure-as-code enables GitOps workflows and reduces the blast radius of misconfigurations.
With the economic and architectural advantages established, the next step is deploying an Ingress controller into your cluster using Helm and connecting it to your cloud provider’s load balancing infrastructure.
Deploying NGINX Ingress Controller with Helm
The NGINX Ingress Controller is the de facto standard for managing ingress traffic in Kubernetes clusters. Unlike application-specific proxies, it’s designed specifically for the Ingress API and maintains feature parity with upstream NGINX while adding Kubernetes-native configuration. This controller pattern centralizes routing logic, SSL termination, and traffic management in a single component rather than distributing it across individual services.
Installing with Helm
The official ingress-nginx chart from the Kubernetes community provides production-ready defaults. Add the repository and install the controller:
```shell
# Add the ingress-nginx repository
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
```
```shell
# Install the controller in its own namespace
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.metrics.enabled=true \
  --set controller.podAnnotations."prometheus\.io/scrape"=true \
  --set controller.podAnnotations."prometheus\.io/port"=10254
```
This creates a LoadBalancer service that provisions an external IP and forwards traffic to the NGINX pods. The metrics flags enable Prometheus scraping from the start—retrofitting observability later is painful. The controller runs as a Deployment with a configurable replica count; the chart defaults to a single replica, so for high availability set controller.replicaCount to at least two and add anti-affinity rules to distribute pods across nodes.
The installation also creates a ConfigMap (ingress-nginx-controller) for global NGINX settings, a ServiceAccount with appropriate RBAC permissions, and a ValidatingWebhookConfiguration to prevent invalid Ingress manifests from being applied. These components work together to provide a robust routing layer with built-in validation.
Cloud vs Bare-Metal Configuration
Cloud providers automatically provision LoadBalancers, but the behavior differs significantly:
AWS/EKS:
```shell
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/aws-load-balancer-type"=nlb \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/aws-load-balancer-cross-zone-load-balancing-enabled"=true
```
Network Load Balancers (NLB) preserve source IPs and handle millions of requests per second. Classic Load Balancers introduce unnecessary hops and don't support UDP, which matters for services beyond HTTP/HTTPS. The cross-zone setting ensures traffic distributes evenly across availability zones, preventing hotspots when pod distribution is uneven.
GCP/GKE:
```shell
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=LoadBalancer \
  --set controller.service.externalTrafficPolicy=Local
```
Setting externalTrafficPolicy=Local prevents SNAT and preserves client IPs, critical for rate limiting and logging. The tradeoff is uneven load distribution if pods are unevenly scheduled across nodes. GKE creates a TCP/UDP LoadBalancer by default, which provides sufficient performance for most workloads. For global load balancing with CDN integration, consider GKE's native GCE Ingress instead.
Bare-Metal/On-Premises:
Without cloud LoadBalancers, use NodePort with an external load balancer (HAProxy, F5) or MetalLB:
```shell
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=NodePort \
  --set controller.service.nodePorts.http=30080 \
  --set controller.service.nodePorts.https=30443
```
NodePort exposes the service on every node at the specified port. Your external load balancer should perform health checks on http://&lt;node-ip&gt;:30080/healthz before routing traffic. For environments with MetalLB, switch to type=LoadBalancer and MetalLB will assign IPs from its configured pool. This provides a cloud-like experience without vendor-specific infrastructure.
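If you take the MetalLB route, the address pool it hands out is itself defined declaratively. A minimal sketch, assuming a Layer 2 setup—the pool name and address range are placeholders for your network:

```yaml
# Hypothetical MetalLB pool; adjust the address range to your environment
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ingress-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.240-192.168.1.250
---
# Announce the pool addresses via Layer 2 (ARP) on the local network
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ingress-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - ingress-pool
```

With this applied, switching the controller service to type=LoadBalancer causes MetalLB to assign an IP from the 192.168.1.240-250 range.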
Verifying the Installation
Check that the controller pods are running and the LoadBalancer has an external IP:
```shell
# Wait for pods to be ready
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=120s
```
```shell
# Get the external IP (may take 2-3 minutes on cloud providers)
kubectl get service ingress-nginx-controller -n ingress-nginx
```
You'll see output like:
```
NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)
ingress-nginx-controller   LoadBalancer   10.100.45.123   203.0.113.42   80:32080/TCP,443:32443/TCP
```
💡 Pro Tip: If EXTERNAL-IP shows &lt;pending&gt; for more than 5 minutes, check your cloud provider's service quotas. AWS accounts often have ELB limits that require support tickets to increase.
Verify the controller is processing configuration by checking its logs:
```shell
kubectl logs -n ingress-nginx -l app.kubernetes.io/component=controller --tail=50
```
You should see startup messages indicating successful NGINX configuration reload and webhook registration. Any errors here typically indicate RBAC issues or invalid default configurations.
Understanding IngressClass
Kubernetes 1.18+ introduced IngressClass resources to support multiple ingress controllers in the same cluster. The NGINX Helm chart automatically creates an IngressClass named nginx:
```shell
kubectl get ingressclass
```
When you create Ingress resources in the next section, you'll reference this class with ingressClassName: nginx. Clusters with multiple ingress controllers (NGINX for public traffic, Traefik for internal) use IngressClass to route requests to the correct controller. You can also mark an IngressClass as the cluster-wide default by annotating it with ingressclass.kubernetes.io/is-default-class: "true", so Ingresses without an explicit ingressClassName are assigned to it automatically.
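For reference, the IngressClass the Helm chart creates looks roughly like this—a sketch that assumes you also want nginx to be the cluster default (drop the annotation otherwise):

```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx
  annotations:
    # Optional: make this class the default for Ingresses with no explicit class
    ingressclass.kubernetes.io/is-default-class: "true"
spec:
  # Controller identifier that ingress-nginx watches for
  controller: k8s.io/ingress-nginx
```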
This multi-controller pattern is common in enterprises where different teams require different routing capabilities. For example, a platform team might maintain NGINX for production HTTP traffic while a data team uses Istio Gateway for gRPC services. IngressClass ensures clean separation without routing conflicts.
With the controller running and verified, you’re ready to define routing rules that direct traffic to your services.
Creating Your First Ingress Rules: Path-Based and Host-Based Routing
With your Ingress controller running, you’re ready to define routing rules that direct external traffic to your backend services. Ingress resources use declarative YAML to specify how HTTP(S) requests map to Kubernetes services based on hostnames and URL paths.
Understanding the Ingress Resource Structure
An Ingress resource defines routing rules independently of the controller implementation. Here’s a basic path-based routing example that directs traffic to different services based on URL paths:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
spec:
  ingressClassName: nginx
  rules:
  - http:
      paths:
      - path: /api/users
        pathType: Prefix
        backend:
          service:
            name: user-service
            port:
              number: 8080
      - path: /api/orders
        pathType: Prefix
        backend:
          service:
            name: order-service
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 3000
```
This configuration routes /api/users/* requests to the user service, /api/orders/* to the order service, and everything else to the frontend. The ingressClassName field explicitly associates this resource with your NGINX Ingress controller, preventing conflicts in clusters with multiple controllers.
The backend block defines the target service and port. Unlike traditional reverse proxy configurations that require IP addresses, Kubernetes Ingress uses service names, letting the cluster’s DNS and service mesh handle endpoint resolution. When you scale the user service from three to ten pods, the Ingress controller automatically distributes traffic across all healthy endpoints without configuration changes.
Path Matching Strategies
Kubernetes supports three pathType values with distinct matching behaviors:
Prefix matches any request starting with the specified path. /api/users matches /api/users, /api/users/123, and /api/users/search?q=active. This flexible approach works best for API routing where you want entire subtrees directed to a service. Prefix matching is case-sensitive and doesn’t normalize paths—/api/users and /API/Users are different paths.
Exact requires the path to match precisely, including trailing slashes. /health matches only /health, not /health/ or /health/detailed. Use this for specific endpoints like health checks or webhooks where you need strict control. Exact matching prevents unintended routing when you have overlapping path structures.
ImplementationSpecific delegates matching to the Ingress controller, allowing controller-specific features like regex patterns. NGINX Ingress supports regex through annotations when using this type, enabling advanced scenarios like routing based on file extensions or complex URL patterns.
💡 Pro Tip: Order your path rules from most specific to least specific. Ingress controllers evaluate rules in order, and the first match wins. Place exact paths before prefix paths to avoid unexpected routing.
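To make that precedence concrete, here's a sketch of an Ingress rule set where an exact health-check path coexists with a catch-all prefix (the service names are illustrative):

```yaml
spec:
  ingressClassName: nginx
  rules:
  - http:
      paths:
      - path: /healthz           # Exact: matches only /healthz, not /healthz/detailed
        pathType: Exact
        backend:
          service:
            name: health-service
            port:
              number: 8080
      - path: /                  # Prefix catch-all for everything else
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 3000
```

A request to /healthz hits health-service; /healthz/detailed falls through to the catch-all, because the Exact rule does not cover subpaths.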
Handling Path Rewrites and Redirects
Many applications expect requests at their root path but are exposed under a subpath in your Ingress rules. The /api/users prefix gets forwarded to the user service, which might not handle /api/users/123 correctly if it expects /123. NGINX Ingress provides annotations for path rewriting:
```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
  - http:
      paths:
      - path: /api/users(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: user-service
            port:
              number: 8080
```
This regex-based rewrite strips /api/users from the request before forwarding it to the backend service. Requests to /api/users/123 arrive at the service as /123, maintaining compatibility with applications designed to serve from the root path.
Host-Based Routing for Multi-Tenant Applications
Host-based routing directs traffic based on the HTTP Host header, enabling multiple applications to share a single load balancer IP:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: multi-tenant-ingress
  namespace: production
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
  - host: dashboard.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dashboard-service
            port:
              number: 8080
  - host: staging.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: staging-api-service
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: staging-frontend-service
            port:
              number: 3000
```
Each host field creates a virtual host with independent routing rules. Requests to api.example.com route to the API service, while dashboard.example.com routes to the dashboard, all sharing the same external IP address. You can combine host-based and path-based routing, as shown in the staging environment configuration.
Host-based routing requires DNS configuration. Point all hostnames to the Ingress controller’s external IP address using A records or CNAME records. The Ingress controller inspects the Host header in incoming requests and applies the appropriate routing rules. Requests with unrecognized hostnames receive a 404 response unless you define a default backend.
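If you would rather serve something friendlier than the controller's default 404 for unrecognized hostnames, you can declare a default backend directly on an Ingress. A minimal sketch—the fallback service name here is illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: catch-all
  namespace: production
spec:
  ingressClassName: nginx
  # Requests matching no host rule in any Ingress of this class land here
  defaultBackend:
    service:
      name: fallback-service
      port:
        number: 8080
```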
Verifying and Troubleshooting Your Ingress Rules
After applying these Ingress resources with kubectl apply -f, verify the routing rules with kubectl describe ingress api-ingress -n production. The output shows the configured rules and the assigned load balancer address. Check the Rules section to confirm your path and host configurations are correct, and verify the Address field contains your load balancer IP.
Test your routing with curl, specifying the Host header explicitly: curl -H "Host: api.example.com" http://<INGRESS_IP>/api/users. This validates routing before DNS propagation completes. If requests don’t reach the expected service, check the Ingress controller logs with kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx for routing decisions and error messages.
These routing fundamentals work identically across different Ingress controller implementations, but production traffic requires encrypted connections. Next, we’ll add TLS termination to secure these endpoints with automatic certificate management.
Implementing TLS Termination with cert-manager
Exposing services without TLS in production is no longer acceptable. Manual certificate management doesn’t scale—certificates expire, renewal processes fail, and tracking dozens of domains becomes operational overhead. cert-manager automates the entire lifecycle: provisioning, renewal, and rotation of TLS certificates using ACME providers like Let’s Encrypt.
Installing cert-manager
cert-manager runs as a set of Kubernetes controllers that watch for Certificate resources and handle the ACME challenge process automatically. Install it using the official manifests:
```shell
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml
```
This creates the cert-manager namespace and deploys three controllers: the main cert-manager controller, the webhook for validation, and the cainjector for CA bundle management. Verify the installation:
```shell
kubectl get pods -n cert-manager
```
All three pods must reach Running status before proceeding. The webhook requires valid TLS certificates to function, which cert-manager generates on first startup.
Configuring a Let’s Encrypt ClusterIssuer
ClusterIssuers define how cert-manager obtains certificates. Let’s Encrypt provides free, automated certificates with 90-day validity. Create a production ClusterIssuer using the HTTP-01 challenge method:
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com  # replace with an address you monitor
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          class: nginx
```
The email address receives expiration warnings if automatic renewal fails. The privateKeySecretRef stores your Let's Encrypt account credentials—cert-manager creates this secret automatically on first use. The HTTP-01 solver creates temporary Ingress rules to prove domain ownership during certificate issuance.
Apply the ClusterIssuer:
```shell
kubectl apply -f letsencrypt-issuer.yaml
```
💡 Pro Tip: Start with the Let's Encrypt staging server (https://acme-staging-v02.api.letsencrypt.org/directory) during initial testing. The staging environment has higher rate limits and prevents you from hitting production quota while debugging configuration issues.
Enabling TLS on Ingress Resources
With cert-manager installed and a ClusterIssuer configured, enabling TLS requires two additions to your Ingress resource: an annotation specifying the issuer and a tls block defining the hostname and secret name:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.example.com
    secretName: api-tls-cert
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
```
When you apply this Ingress, cert-manager detects the annotation and creates a Certificate resource. It then initiates the ACME challenge with Let's Encrypt, creates the temporary validation Ingress rules, and stores the issued certificate in the api-tls-cert secret. The NGINX Ingress Controller reads this secret and configures TLS termination automatically.
Monitor certificate issuance:
```shell
kubectl get certificate
kubectl describe certificate api-tls-cert
```
The Ready condition shows True when the certificate is successfully issued and installed. Initial certificate requests take 30-90 seconds depending on DNS propagation and Let's Encrypt's response time.
Automatic Renewal
cert-manager monitors certificates and renews them automatically when they have less than 30 days remaining. Renewal uses the same ACME challenge process as initial issuance. You can verify renewal configuration by checking the Certificate resource’s renewal time:
```shell
kubectl get certificate api-tls-cert -o jsonpath='{.status.renewalTime}'
```
No manual intervention is required—cert-manager handles the complete lifecycle. If renewal fails, cert-manager retries with exponential backoff and emits Kubernetes events visible through kubectl describe certificate.
With automated TLS in place, the next step is implementing rate limiting to protect your services from traffic spikes and potential abuse.
Rate Limiting and Traffic Control with Annotations
Ingress controllers protect backend services from traffic spikes, abuse, and resource exhaustion through annotation-based policies. NGINX Ingress Controller exposes fine-grained rate limiting, connection management, and request handling controls that integrate directly into your Ingress resource definitions.
Per-IP and Global Rate Limiting
Rate limiting prevents individual clients or entire endpoints from overwhelming your services. NGINX Ingress uses the leaky bucket algorithm with two primary annotations:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"
    nginx.ingress.kubernetes.io/limit-connections: "5"
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "2"
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /v1/users
        pathType: Prefix
        backend:
          service:
            name: user-service
            port:
              number: 8080
```
The limit-rps annotation restricts requests per second per IP address, while limit-connections caps concurrent connections from a single source. The limit-burst-multiplier allows temporary bursts at twice the rate limit before rejecting requests with HTTP 503.
Rate limiting operates at two distinct scopes. Per-IP limits track individual client addresses using NGINX's limit_req_zone directive keyed on the binary remote address. This prevents a single malicious actor from monopolizing resources while allowing legitimate distributed traffic to proceed. Global endpoint limits apply regardless of source IP, protecting backends from coordinated attacks or legitimate flash crowds that exceed your infrastructure capacity.
Configure limits based on your service's actual capacity, not arbitrary values. If your backend handles 1000 requests per second across 100 pods, setting limit-rps: "10" allows each client to consume 1% of total capacity. Burst multipliers accommodate legitimate spikes—a mobile app refreshing after network reconnection might send 3-4 requests simultaneously. Setting limit-burst-multiplier: "3" permits these bursts without false positives.
Under the hood, per-IP tracking lives in a limit_req_zone shared memory zone; each megabyte of zone memory stores approximately 16,000 IP addresses. For high-traffic services with diverse client populations, ensure the controller has sufficient memory that tracking data isn't evicted prematurely, which would effectively reset rate limit counters and allow circumvention.
Request Size and Timeout Controls
Oversized payloads and slow clients consume server resources. Configure limits that align with your application requirements:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: upload-service
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "30"
    nginx.ingress.kubernetes.io/client-body-buffer-size: "1m"
spec:
  ingressClassName: nginx
  rules:
  - host: uploads.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: upload-processor
            port:
              number: 9000
```
The proxy-body-size annotation rejects requests exceeding 10MB. Timeout values prevent slow clients from holding connections indefinitely—proxy-connect-timeout limits backend connection establishment, while proxy-send-timeout and proxy-read-timeout govern request transmission and response reception.
Request size limits serve dual purposes: preventing memory exhaustion from unbounded buffering and rejecting attacks that attempt to overwhelm parsing logic. The client-body-buffer-size annotation controls how much request data NGINX buffers in memory before spilling to disk. Smaller buffers (128k-1m) reduce memory pressure but increase disk I/O for large legitimate uploads. Match this value to your expected payload distribution—if 95% of requests are under 1MB, buffering that amount optimizes for the common case.
Timeout configurations require understanding your application’s performance characteristics. A proxy-connect-timeout of 10 seconds tolerates network latency and pod scheduling delays during scale-up events. The proxy-read-timeout must exceed your slowest legitimate operation—if report generation takes 45 seconds, configure 60-second read timeouts to prevent premature termination. However, excessively long timeouts allow slowloris-style attacks to exhaust connection pools.
Custom Error Pages and Connection Management
When rate limits trigger or backends fail, default error pages leak infrastructure details. Configure custom responses:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hardened-api
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/custom-http-errors: "429,503"
    nginx.ingress.kubernetes.io/default-backend: custom-error-service
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-backend
            port:
              number: 8080
```
The custom-http-errors annotation redirects 429 (rate limited) and 503 (unavailable) responses to your error service. Upstream keepalive settings maintain connection pools to backends, reducing latency for subsequent requests; note that upstream-keepalive-connections and upstream-keepalive-timeout are global options set in the controller's ConfigMap rather than per-Ingress annotations.
Connection pooling through upstream-keepalive-connections eliminates TCP handshake overhead for repeat requests. Set this value to match expected concurrent connections per NGINX worker process. With 4 worker processes and 200 target connections to your backend, configure 50 keepalive connections per worker. The upstream-keepalive-timeout should align with your backend's keepalive settings—mismatches cause one side to close connections the other expects to reuse.
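In current ingress-nginx releases these keepalive knobs live in the controller ConfigMap, not in Ingress annotations. A sketch of the corresponding entries, reusing the example numbers above (tune them to your workload):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  # Idle keepalive connections cached per worker toward upstreams
  upstream-keepalive-connections: "50"
  # Seconds an idle upstream connection stays open; match backend settings
  upstream-keepalive-timeout: "60"
```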
Custom error services return user-friendly messages with appropriate retry guidance. A 429 response should include Retry-After headers indicating when the client can retry. Your error service can implement exponential backoff recommendations or direct clients to cached content while rate limits reset.
Testing Rate Limits
Validate configurations with load testing tools before production traffic hits your limits:
```shell
# Test per-IP rate limiting
for i in {1..50}; do
  curl -w "%{http_code}\n" -s -o /dev/null https://api.example.com/v1/users
done
```
```shell
# Verify request size limits
dd if=/dev/zero bs=1M count=15 | curl -X POST \
  --data-binary @- \
  https://uploads.example.com/files
```
Monitor NGINX metrics for nginx_ingress_controller_requests with status="429" labels to confirm rate limiting activates correctly. Adjust limits based on legitimate traffic patterns observed in your metrics dashboards.
Use tools like hey or vegeta for sophisticated load testing that simulates realistic traffic distributions. Test from multiple source IPs to validate that per-IP limits don’t interfere with each other and that global limits engage when aggregate traffic exceeds thresholds. Verify that burst multipliers accommodate legitimate spikes without triggering false positives during normal operation.
With traffic controls configured, the next critical production requirement is observability—exposing metrics, structured logs, and distributed traces that reveal how your ingress layer performs under real-world conditions.
Observability: Metrics, Logs, and Tracing
Visibility into your ingress layer is critical for detecting performance degradation, identifying security threats, and optimizing resource allocation. NGINX Ingress Controller provides comprehensive observability through metrics, structured logs, and distributed tracing integration. A well-instrumented ingress layer enables proactive incident response, capacity planning, and root cause analysis across your entire application stack.
Enabling Prometheus Metrics
NGINX Ingress Controller exposes Prometheus metrics by default on port 10254. Enable metric collection by deploying a ServiceMonitor for Prometheus Operator:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
```
Key metrics to alert on include nginx_ingress_controller_requests (request rate), nginx_ingress_controller_request_duration_seconds (latency percentiles), and nginx_ingress_controller_ssl_expire_time_seconds (certificate expiration). Monitor nginx_ingress_controller_nginx_process_connections to detect connection exhaustion before it impacts users.
Additional critical metrics include nginx_ingress_controller_config_last_reload_successful (configuration reload failures), nginx_ingress_controller_ingress_upstream_latency_seconds (backend service latency), and nginx_ingress_controller_bytes_sent_sum (egress bandwidth utilization). Set up alerts for sustained increases in 5xx error rates (nginx_ingress_controller_requests{status=~"5.."}), which often indicate backend service degradation or misconfiguration.
For advanced monitoring, track per-ingress metrics using label filters. The ingress and service labels allow you to isolate traffic patterns and error rates for specific applications, enabling targeted alerts and SLA tracking at the service level.
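As one concrete starting point, a Prometheus Operator alerting rule for sustained 5xx rates might look like the sketch below. The rule name, 5% threshold, and 10-minute window are placeholder assumptions to tune against your own traffic:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: nginx-ingress-alerts
  namespace: ingress-nginx
spec:
  groups:
  - name: ingress.rules
    rules:
    - alert: IngressHigh5xxRate
      # Fraction of requests per ingress returning 5xx over the last 5 minutes
      expr: |
        sum(rate(nginx_ingress_controller_requests{status=~"5.."}[5m])) by (ingress)
          / sum(rate(nginx_ingress_controller_requests[5m])) by (ingress) > 0.05
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Ingress {{ $labels.ingress }} is serving more than 5% 5xx responses"
```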
Structured Access Logs
Configure JSON-formatted access logs for easier parsing by log aggregation systems. Update the ConfigMap to enable structured logging:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  log-format-escape-json: "true"
  log-format-upstream: '{"time": "$time_iso8601", "remote_addr": "$remote_addr", "request_method": "$request_method", "request_uri": "$request_uri", "status": $status, "request_time": $request_time, "upstream_addr": "$upstream_addr", "upstream_response_time": "$upstream_response_time", "user_agent": "$http_user_agent", "request_id": "$req_id"}'
```
The $req_id variable provides request correlation across services. Configure your applications to extract this header (X-Request-ID) and propagate it through your microservices for end-to-end request tracking. Include additional fields like $http_referer, $ssl_protocol, and $ssl_cipher for security auditing and debugging TLS-related issues.
Stream logs to centralized aggregation platforms like Elasticsearch, Loki, or CloudWatch. Index on fields such as status, request_id, and upstream_addr to enable fast querying during incident investigations. Set retention policies based on compliance requirements—typically 30-90 days for access logs.
Distributed Tracing Integration
Integrate with Jaeger or Zipkin to trace requests across your service mesh. Enable tracing by configuring the ingress controller:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  enable-opentracing: "true"
  jaeger-collector-host: jaeger-collector.observability.svc.cluster.local
  jaeger-collector-port: "14268"
  jaeger-service-name: nginx-ingress
  jaeger-sampler-type: probabilistic
  jaeger-sampler-param: "0.1"
```

Set the sampling rate (`jaeger-sampler-param`) based on traffic volume—start at 10% for high-traffic environments and adjust based on trace storage costs and debugging needs. For critical production paths, consider rate-limiting samplers that guarantee trace capture for errors while sampling a percentage of successful requests.
Ensure backend services are instrumented with OpenTelemetry or native tracing SDKs that honor the uber-trace-id header (Jaeger) or X-B3-TraceId header (Zipkin). This creates complete trace spans from ingress through all downstream dependencies, revealing cascading failures and performance bottlenecks invisible to metrics alone.
Building Effective Dashboards
Create a Grafana dashboard combining ingress metrics with application-level metrics. Correlate increased ingress latency with downstream service performance to quickly identify bottlenecks during incidents. Include panels for request rate by status code, P95/P99 latency trends, SSL certificate expiration timeline, and connection pool utilization. Set up dashboard variables for filtering by namespace, ingress name, or service to enable rapid drill-down during troubleshooting.
💡 Pro Tip: Implement composite alerts that trigger only when multiple signals align—for example, alert when P99 latency exceeds thresholds AND error rate increases simultaneously. This reduces false positives from transient spikes while catching genuine degradation patterns early.
With comprehensive observability in place, you’re ready to harden your ingress infrastructure for production resilience through high availability configurations and security best practices.
Production Hardening: HA, Security Headers, and Upgrades
Moving your Ingress controller to production requires addressing availability, security, and operational concerns that don’t appear in development environments.

High Availability Configuration
Single-replica Ingress controllers create a single point of failure. Run at least three replicas across different nodes to survive node failures and maintenance windows. Configure pod anti-affinity rules to ensure replicas spread across availability zones. Set pod disruption budgets to maintain minimum availability during cluster operations like upgrades or scaling events.
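A minimal sketch of these settings, assuming the standard `app.kubernetes.io/name: ingress-nginx` labels and a three-replica deployment (names and replica counts are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ingress-nginx-pdb      # hypothetical name
  namespace: ingress-nginx
spec:
  minAvailable: 2              # keep at least 2 replicas during voluntary disruptions
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
---
# Excerpt from the controller Deployment spec: force replicas into
# different availability zones via pod anti-affinity
spec:
  replicas: 3
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: topology.kubernetes.io/zone
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: ingress-nginx
```

Using `requiredDuringSchedulingIgnoredDuringExecution` makes zone spreading a hard constraint; clusters with fewer zones than replicas may prefer the `preferred` variant so scheduling never blocks.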
The Ingress controller’s LoadBalancer service automatically distributes traffic across healthy replicas. Health checks ensure traffic only reaches ready pods. For cloud providers, enable connection draining with appropriate timeout values to prevent dropped connections during pod terminations.
Monitor replica health through readiness and liveness probes. Readiness probes should verify the controller can process Ingress resources and communicate with the Kubernetes API. Liveness probes detect hung processes that need restart. Set probe thresholds based on your observed startup times and expected response latencies.
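As a sketch, probe configuration for the controller container might look like the excerpt below (ingress-nginx serves its health endpoints on port 10254; the delay and threshold values are starting points to tune against your observed startup times):

```yaml
# Excerpt from the controller container spec
livenessProbe:
  httpGet:
    path: /healthz
    port: 10254
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 5      # restart only after sustained failure
readinessProbe:
  httpGet:
    path: /healthz
    port: 10254
  periodSeconds: 10
  failureThreshold: 3      # pull from load balancing quickly
```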
Security Headers and CORS
Modern web applications require strict security headers to protect against common attacks. Configure your Ingress controller to inject headers like Content-Security-Policy, X-Frame-Options, and Strict-Transport-Security globally or per-Ingress resource. These headers prevent clickjacking, enforce HTTPS, and restrict resource loading.
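One way to inject these headers per-Ingress is the `configuration-snippet` annotation, sketched below with illustrative policy values (note that snippet annotations are disabled by default in recent ingress-nginx versions and must be explicitly allowed by the cluster operator):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app                  # hypothetical Ingress
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      add_header X-Frame-Options "DENY" always;
      add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
      add_header Content-Security-Policy "default-src 'self'" always;
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
```

For cluster-wide enforcement, the equivalent directives can instead go in the controller ConfigMap so every Ingress inherits them.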
Cross-Origin Resource Sharing (CORS) policies control which domains can access your APIs. Configure CORS through Ingress annotations, specifying allowed origins, methods, and headers. For public APIs, implement restrictive CORS policies rather than wildcard configurations. Remember that CORS enforcement happens in browsers, not at the Ingress level—backend services still need their own authorization.
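With ingress-nginx, a restrictive CORS policy can be expressed entirely through annotations; a sketch with an assumed single allowed origin:

```yaml
# Annotation excerpt on the Ingress resource
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/cors-allow-origin: "https://app.example.com"  # avoid "*"
    nginx.ingress.kubernetes.io/cors-allow-methods: "GET, POST, OPTIONS"
    nginx.ingress.kubernetes.io/cors-allow-headers: "Authorization, Content-Type"
    nginx.ingress.kubernetes.io/cors-max-age: "600"   # cache preflight for 10 minutes
```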
Blue-Green Controller Upgrades
Ingress controller upgrades carry risk since they handle all incoming traffic. Use blue-green deployments: install the new controller version alongside the existing one with a different IngressClass name. Migrate a test service to the new controller, validate functionality, then progressively move production traffic. This approach provides instant rollback capability if issues surface.
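The migration step can be sketched with a second IngressClass and a test Ingress pointed at it (the `nginx-next` name and hostname are hypothetical; the new controller deployment must be configured to watch this class, e.g. via its ingress-class flags or Helm values):

```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx-next             # class served by the new controller version
spec:
  controller: k8s.io/ingress-nginx
---
# Test service migrated to the new controller; rollback is just
# switching ingressClassName back to the old class
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: canary-app
spec:
  ingressClassName: nginx-next
  rules:
    - host: test.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: canary-app
                port:
                  number: 80
```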
Test upgrades in staging environments that mirror production configuration. Pay attention to annotation compatibility between versions—deprecated annotations may stop working. Review controller changelogs for breaking changes in routing behavior or default settings.
Common Pitfalls
Backend pod restarts often surface as intermittent 502 or 503 errors through the Ingress controller. Check backend readiness probes and ensure graceful shutdown handlers exist. DNS resolution failures cause cryptic connection errors—verify CoreDNS health and service DNS records. Certificate expiration remains the most common TLS failure despite automation—monitor certificate validity through Prometheus metrics.
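A common mitigation for the 502/503 pattern is a short `preStop` sleep on backend pods, sketched below; the delay gives the controller time to drop the terminating endpoint before the container stops accepting connections (the port, health path, and sleep duration are illustrative):

```yaml
# Backend pod spec excerpt
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 10"]   # keep serving while endpoints update
      readinessProbe:
        httpGet:
          path: /healthz    # assumes the app exposes a health endpoint
          port: 8080
        periodSeconds: 5
```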
With these hardening measures in place, your Ingress controller handles production traffic reliably while maintaining security posture and upgrade flexibility.
Key Takeaways
- Start with cert-manager integration on day one—retrofitting TLS later adds unnecessary complexity and downtime risk
- Use Prometheus metrics and structured logging from the beginning to establish baselines before issues arise
- Implement rate limiting early at the ingress layer rather than in each application to reduce duplicate code and enforce consistent policies
- Test your Ingress controller’s high availability configuration with chaos engineering before production incidents force you to