GitLab CI/CD with Rancher: Building a Pull-Based Deployment Pipeline for Multi-Cluster Kubernetes


Your deployment pipeline works flawlessly in staging, but production rollouts keep failing silently across your three regional clusters. The kubectl commands in your CI scripts time out randomly, credentials rotate without warning, and you’ve lost track of which cluster is running which version. You’re spending more time debugging deployments than shipping features.

This is the reality of push-based CI/CD at scale. Every GitLab Runner needs direct network access to every cluster. Every cluster credential lives in CI variables, waiting to expire at the worst possible moment. Every firewall change risks breaking deployments to production. And when something fails—and it will—you’re left parsing logs across multiple systems to figure out which cluster diverged from your intended state.

The fundamental problem isn’t your pipeline logic or your Kubernetes manifests. It’s the architecture itself. Push-based deployments treat your CI system as the source of truth for cluster state, but CI runners are ephemeral by design. They spin up, execute commands against remote clusters, and disappear—taking their execution context with them. When a deployment partially succeeds, you have no authoritative record of what actually landed where.

Pull-based GitOps flips this model. Instead of CI pushing changes to clusters, clusters pull their desired state from Git. The source of truth moves from transient runner logs to version-controlled manifests. Credentials stay inside your clusters, never exposed to external systems. Network connectivity requirements reverse—clusters reach out rather than accepting inbound connections.

Combining GitLab CI/CD with Rancher Fleet gives you the best of both worlds: GitLab handles building, testing, and promoting artifacts through environments, while Fleet ensures every cluster converges to its declared state. But before diving into implementation, let’s examine exactly where push-based pipelines break down.

Why Push-Based CI/CD Breaks at Scale

Traditional CI/CD pipelines deploy to Kubernetes clusters by pushing changes directly from runners. A GitLab runner executes kubectl apply, Helm upgrades, or API calls against cluster endpoints. This model works for a single cluster in a controlled environment. It collapses when you manage ten clusters across multiple regions, cloud providers, and security zones.

Visual: Push-based CI/CD architecture showing runners connecting to multiple clusters

The Credentials Problem

Push-based deployments require CI runners to authenticate against every target cluster. Your GitLab runners need kubeconfig files, service account tokens, or cloud provider credentials stored as CI/CD variables. Each cluster adds another set of secrets to manage.

Credential rotation becomes a coordination nightmare. When you rotate a service account token for your production cluster in AWS, you must update the corresponding GitLab variable, verify the change propagated to all runners, and hope no pipeline runs during the transition window. Multiply this by the number of clusters and rotation frequency your security team mandates.

Network Topology Constraints

Push deployments assume network connectivity from runners to cluster API servers. This assumption breaks in common enterprise architectures:

  • Private clusters behind VPNs or bastion hosts
  • Air-gapped environments in regulated industries
  • Clusters in customer-managed infrastructure
  • Edge deployments with intermittent connectivity

You end up punching holes through firewalls, maintaining VPN connections from runner pools, or deploying dedicated runners inside each network boundary. Each workaround increases operational complexity and attack surface.

Silent Failures and Partial Deployments

A push pipeline succeeds or fails based on the runner’s perspective. When deploying to multiple clusters, a network timeout to one cluster might cause a partial rollout—three clusters updated, two stuck on the old version. The pipeline reports failure, but determining actual cluster state requires manual investigation.

Retry logic helps but introduces its own problems. Idempotency becomes critical when a pipeline might apply the same manifests multiple times. Race conditions emerge when multiple pipelines target the same cluster simultaneously.

Configuration Drift Goes Undetected

Push-based systems have no mechanism to detect drift. Someone runs kubectl edit in production to fix an urgent issue. The cluster state diverges from what Git declares. Your next deployment either overwrites the hotfix or fails on conflicts. Without continuous reconciliation, you lose the “Git as source of truth” guarantee that makes GitOps valuable.

💡 Pro Tip: If you find yourself writing scripts to compare live cluster state against your Git repository, you’ve already identified the need for a pull-based architecture.
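
Those comparison scripts usually boil down to kubectl diff in a loop. A minimal sketch, assuming the manifests live under a manifests/ directory in the repository (the repository URL and paths are illustrative):

Terminal window
# Clone the repo that should describe the cluster, then diff each manifest
# directory against live state. kubectl diff exits non-zero when drift exists.
git clone https://gitlab.example.com/platform/webapp-manifests.git /tmp/desired-state
for dir in /tmp/desired-state/manifests/*/; do
  kubectl diff -f "$dir" || echo "Drift detected in ${dir}"
done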

These limitations point toward an architectural inversion: instead of pushing to clusters, let clusters pull their desired state. This is where the GitLab Agent and Rancher Fleet enter the picture.

Architecture: GitLab Agent + Rancher Fleet

Understanding how GitLab Agent and Rancher Fleet work together transforms how you approach multi-cluster deployments. This architecture separates concerns cleanly: GitLab CI handles building and testing, while Fleet orchestrates deployment across your entire cluster fleet.

Visual: Pull-based architecture with GitLab Agent and Rancher Fleet

The Connection Model

Traditional CI/CD pipelines push deployments by establishing inbound connections to clusters. This requires exposing Kubernetes API servers, managing firewall rules, and distributing kubeconfig files—each cluster adding operational overhead and security surface area.

GitLab Agent inverts this model. The agent runs as a deployment inside each Kubernetes cluster and establishes an outbound-only connection to GitLab. Your clusters initiate contact; GitLab never reaches in. This eliminates the need for public API endpoints and simplifies network security policies. Clusters behind NAT, in private subnets, or across cloud boundaries all connect through the same mechanism.

The agent maintains a persistent gRPC tunnel to GitLab, enabling bidirectional communication over that single outbound connection. GitLab can send deployment instructions, query cluster state, and receive real-time status updates—all without opening a single inbound port.

Fleet as the Orchestration Layer

Rancher Fleet operates as a GitOps controller designed specifically for multi-cluster scenarios. While tools like Flux and Argo CD excel at single-cluster deployments, Fleet provides primitives for managing hundreds of clusters as a unified fleet.

Fleet watches Git repositories for changes and reconciles the desired state across targeted clusters. You define which clusters receive which configurations using labels and selectors. A single commit can trigger deployments across development, staging, and production clusters in multiple regions—or target a specific cluster for testing.

The reconciliation loop runs continuously. Fleet detects drift between Git and cluster state, automatically correcting unauthorized manual changes. This enforcement mechanism ensures your Git repository remains the authoritative source of truth, not a suggestion that operators might override.

Clean Separation of Responsibilities

This architecture creates a clear boundary between CI and CD concerns:

GitLab CI builds container images, runs tests, performs security scans, and commits updated manifests to the GitOps repository. The pipeline never touches kubectl. It never needs cluster credentials.

Rancher Fleet watches the GitOps repository and deploys changes to clusters. It handles rollout strategies, health checks, and drift correction. Fleet operates independently of CI—deployments happen whether triggered by pipeline commits or manual manifest updates.

💡 Pro Tip: This separation means you can test Fleet configurations by committing directly to the GitOps repo, bypassing CI entirely during development. Once validated, integrate the full pipeline.

The GitLab Agent bridges these layers, providing Fleet with secure cluster access while giving GitLab visibility into deployment status and cluster health.

With this mental model established, let’s install the GitLab Agent and establish that secure connection between your clusters and GitLab.

Installing the GitLab Agent for Kubernetes

The GitLab Agent for Kubernetes establishes a secure, persistent connection between your clusters and GitLab. Unlike traditional approaches that require exposing your Kubernetes API server to the internet, the agent initiates an outbound connection to GitLab, eliminating the need for complex firewall rules or VPN configurations.

Registering the Agent in GitLab

Start by creating an agent configuration in your GitLab project. Navigate to Operate > Kubernetes clusters and select Connect a cluster. Choose a descriptive name that identifies both the environment and cluster purpose—for example, production-us-east or staging-fleet-manager.

GitLab requires an agent configuration file in your repository before registration completes. Create this file at the path GitLab expects:

.gitlab/agents/production-us-east/config.yaml
ci_access:
  projects:
    - id: mygroup/kubernetes-deployments
  groups:
    - id: mygroup/microservices
      default_namespace: production
observability:
  logging:
    level: info

The ci_access block defines which projects and groups can use this agent for CI/CD jobs. Without this configuration, your pipelines will fail with authentication errors even though the agent appears connected.
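
Once ci_access is granted, GitLab injects a kubeconfig into jobs in those projects, with one context per authorized agent named after the project that holds the agent configuration. The job below is a sketch of a read-only verification step; the project path is a placeholder for wherever your .gitlab/agents directory lives:

.gitlab-ci.yml (illustrative)
verify-rollout:
  image: bitnami/kubectl:latest
  script:
    # Context name format: <path of the project holding .gitlab/agents/...>:<agent name>
    - kubectl config use-context path/to/agent-config-project:production-us-east
    - kubectl get deployments -n production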

After committing this file, complete the registration in the GitLab UI. GitLab generates a unique agent token—copy this immediately, as it cannot be retrieved later.

Deploying the Agent with Helm

With your Rancher-managed cluster selected in the Rancher UI, open the kubectl shell or configure your local kubeconfig. Install the agent using Helm:

gitlab-agent-values.yaml
config:
  token: "glagent-xK9mPqR7vL2nYhT4..."
  kasAddress: "wss://kas.gitlab.com"
rbac:
  create: true
serviceAccount:
  create: true
  name: gitlab-agent
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

Deploy the agent to your cluster:

Terminal window
helm repo add gitlab https://charts.gitlab.io
helm repo update
helm upgrade --install gitlab-agent gitlab/gitlab-agent \
--namespace gitlab-agent \
--create-namespace \
--values gitlab-agent-values.yaml

For self-managed GitLab instances, replace kas.gitlab.com with your KAS endpoint, typically wss://gitlab.yourcompany.com/-/kubernetes-agent/.

💡 Pro Tip: Store the agent token in a Kubernetes secret and reference it via config.secretName instead of embedding it directly in your values file. This prevents token exposure in version control and simplifies rotation.
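
A sketch of that approach, assuming the chart reads the token from a key named token in the referenced secret (verify against the chart version you deploy):

Terminal window
# Create the namespace and token secret once, outside version control
kubectl create namespace gitlab-agent
kubectl create secret generic gitlab-agent-token \
  --namespace gitlab-agent \
  --from-literal=token='glagent-xK9mPqR7vL2nYhT4...'

Then drop the inline token from your values file and point the chart at the secret instead:

gitlab-agent-values.yaml (token via secret)
config:
  secretName: gitlab-agent-token
  kasAddress: "wss://kas.gitlab.com"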

Validating the Connection

Verify the agent pod is running and connected:

Terminal window
kubectl get pods -n gitlab-agent
kubectl logs -n gitlab-agent -l app=gitlab-agent --tail=50

A successful connection shows log entries like:

level=info msg="Feature status" feature=tunnel_connections status=available
level=info msg="Observability endpoint started"

Back in GitLab, the cluster status under Operate > Kubernetes clusters changes from “Never connected” to “Connected” with a timestamp.

Troubleshooting Common Issues

Agent shows “Never connected”: Verify outbound connectivity to kas.gitlab.com on port 443. Corporate proxies often block WebSocket upgrades—configure the config.httpProxy value if needed.

CI jobs fail with “no agent found”: Confirm your ci_access configuration includes the correct project or group entries. The id field takes the full path (for example mygroup/kubernetes-deployments), exactly as it appears in the project URL—not the numeric project ID.

Token authentication errors: The agent token is displayed only once, at registration. If you’ve lost it, create a replacement token for the agent in GitLab (or delete the agent and re-register), then update the cluster-side configuration.

With the agent connected and validated, you have a secure communication channel between GitLab and your cluster. Next, we’ll configure Rancher Fleet to watch your Git repositories and automatically synchronize deployments across your entire cluster fleet.

Configuring Rancher Fleet for GitOps Deployments

With the GitLab Agent establishing secure connectivity to your clusters, Rancher Fleet becomes the orchestration layer that transforms Git commits into running workloads. Fleet watches your repositories, detects changes, and propagates them across cluster groups—all without CI pipelines needing cluster credentials. This separation of concerns keeps secrets out of your CI environment while enabling consistent, auditable deployments across your entire infrastructure.

Creating GitRepo Resources

Fleet’s fundamental building block is the GitRepo custom resource. Each GitRepo defines a repository to watch, which paths contain deployable manifests, and which clusters should receive them.

fleet-gitrepo.yaml
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: webapp-deployment
  namespace: fleet-default
spec:
  repo: https://gitlab.example.com/platform/webapp-manifests.git
  branch: main
  paths:
    - /manifests
  clientSecretName: gitlab-repo-credentials
  pollingInterval: 30s
  targets:
    - clusterGroup: production
    - clusterGroup: staging

The pollingInterval determines how frequently Fleet checks for new commits. For production workloads, 30 seconds balances responsiveness against API rate limits. The paths field scopes Fleet’s attention to specific directories, allowing you to maintain multiple applications or environments within a single repository. This path-based filtering proves especially valuable in monorepo architectures where dozens of services share a single repository—Fleet only triggers deployments when relevant paths change, reducing unnecessary reconciliation cycles.
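
In a monorepo, that usually means one GitRepo listing just the service paths you actually deploy; everything else in the repository is ignored. The repository layout below is illustrative:

fleet-gitrepo-monorepo.yaml (illustrative)
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: platform-services
  namespace: fleet-default
spec:
  repo: https://gitlab.example.com/platform/monorepo.git
  branch: main
  paths:
    - /services/api/manifests
    - /services/worker/manifests
    - /services/frontend/manifests
  targets:
    - clusterGroup: staging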

Defining Cluster Groups and Targeting Strategies

Cluster groups abstract individual clusters into logical units based on environment, region, or workload type. Define these groups in Fleet’s configuration:

cluster-groups.yaml
apiVersion: fleet.cattle.io/v1alpha1
kind: ClusterGroup
metadata:
  name: production
  namespace: fleet-default
spec:
  selector:
    matchLabels:
      environment: production
---
apiVersion: fleet.cattle.io/v1alpha1
kind: ClusterGroup
metadata:
  name: staging
  namespace: fleet-default
spec:
  selector:
    matchLabels:
      environment: staging
      region: us-east-1

Labels on your cluster registrations determine group membership. A cluster labeled environment: production automatically joins the production group, enabling dynamic targeting as you add or remove clusters. This label-based approach scales elegantly—when you provision a new production cluster in a different region, simply apply the appropriate labels and Fleet automatically includes it in subsequent deployments without modifying any GitRepo resources.

Consider building a multi-dimensional labeling strategy that captures environment, region, tier, and workload type. This granularity enables precise targeting for canary deployments, regional rollouts, or workload-specific configurations without creating complex conditional logic.
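
Fleet represents each registered cluster as a Cluster resource in the fleet-default namespace, so labels can be applied from the Rancher UI or from the CLI. A sketch with an illustrative cluster name:

Terminal window
# Show registered clusters and their current labels
kubectl get clusters.fleet.cattle.io -n fleet-default --show-labels
# Add the labels that place this cluster into the matching ClusterGroups
kubectl label clusters.fleet.cattle.io prod-us-east-1 -n fleet-default \
  environment=production region=us-east-1 tier=standard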

Environment-Specific Overlays with fleet.yaml

The fleet.yaml file within your manifests repository controls how Fleet applies resources to different targets. This file enables Kustomize-style overlays without requiring a separate Kustomize pipeline:

manifests/fleet.yaml
defaultNamespace: webapp
helm:
  releaseName: webapp
  values:
    replicaCount: 2
    image:
      tag: v1.4.2
targetCustomizations:
  - name: production
    clusterGroup: production
    helm:
      values:
        replicaCount: 5
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
  - name: staging
    clusterGroup: staging
    helm:
      values:
        replicaCount: 2
        ingress:
          host: staging.webapp.example.com

The targetCustomizations array overrides default values per cluster group. Production clusters receive higher replica counts and resource allocations, while staging uses a different ingress hostname. Fleet merges these customizations at deployment time, keeping environment differences declarative and auditable. This declarative approach eliminates the error-prone practice of maintaining separate manifest files for each environment—changes to base configurations automatically propagate everywhere, while environment-specific overrides remain explicit and version-controlled.

💡 Pro Tip: Structure your fleet.yaml to minimize duplication. Define sensible defaults at the top level and use targetCustomizations only for genuine environment differences. This reduces configuration drift and simplifies troubleshooting.

Drift Correction and Sync Behavior

Fleet continuously reconciles cluster state against Git. When someone manually modifies a resource—whether accidentally or through emergency intervention—Fleet detects the drift and restores the declared state:

fleet-gitrepo-with-correction.yaml
spec:
  correctDrift:
    enabled: true
    force: false
    keepFailHistory: true

Setting force: false prevents Fleet from overwriting resources with conflicting ownership annotations, avoiding fights with other controllers. The keepFailHistory flag preserves failed deployment attempts for debugging rather than immediately retrying into the same failure. This historical record proves invaluable when troubleshooting intermittent issues or understanding why a particular deployment failed across specific clusters.

For multi-cluster deployments, Fleet’s dependency ordering ensures resources deploy in the correct sequence. Database migrations complete before application pods start, and shared services propagate to all clusters before dependent workloads reference them. You can express these dependencies explicitly with the dependsOn field in fleet.yaml, which orders one bundle after another and creates deployment chains that respect infrastructure prerequisites.
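
A minimal sketch of such an ordering; the bundle name is an assumption (Fleet derives bundle names from the GitRepo name and path, so confirm them with kubectl get bundles -n fleet-default before relying on them):

services/webapp/fleet.yaml (illustrative)
defaultNamespace: webapp
# Wait for the shared-ingress bundle to be ready before deploying this one
dependsOn:
  - name: platform-services-shared-ingress
helm:
  releaseName: webapp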

Fleet also supports pausing synchronization for specific GitRepos—useful during maintenance windows or when investigating production issues. Set spec.paused: true temporarily to halt reconciliation without removing the GitRepo entirely, giving your team time to diagnose problems without Fleet continuously reverting manual fixes.
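
For example, toggling the pause flag on the GitRepo from earlier:

Terminal window
# Pause reconciliation during a maintenance window
kubectl patch gitrepo webapp-deployment -n fleet-default \
  --type merge -p '{"spec":{"paused":true}}'
# Resume once the investigation is finished
kubectl patch gitrepo webapp-deployment -n fleet-default \
  --type merge -p '{"spec":{"paused":false}}'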

With Fleet watching your manifest repositories and targeting the appropriate clusters, the remaining piece is triggering deployments through GitLab CI when application code changes.

Building the GitLab CI Pipeline

With the GitLab Agent and Rancher Fleet configured, the final piece is a CI pipeline that builds your application, publishes container images, and commits updated manifests to trigger Fleet synchronization. This pipeline embraces the pull-based model: instead of pushing deployments directly to clusters, it updates Git—the single source of truth—and lets Fleet handle the rest.

Pipeline Structure

The pipeline follows four stages that separate concerns and enforce quality gates:

.gitlab-ci.yml
stages:
  - build
  - test
  - publish
  - deploy

variables:
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
  MANIFEST_REPO: gitlab.com/platform-team/fleet-manifests.git

build:
  stage: build
  image: docker:24.0
  services:
    - docker:24.0-dind
  script:
    # Authenticate to the project registry with the predefined CI credentials
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t $IMAGE_TAG .
    - docker push $IMAGE_TAG
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - if: $CI_MERGE_REQUEST_ID

test:
  stage: test
  image: $IMAGE_TAG
  script:
    - npm run test:unit
    - npm run test:integration
  coverage: '/Coverage: \d+\.\d+%/'
  artifacts:
    reports:
      junit: test-results.xml

publish-manifests:
  stage: publish
  image: alpine/git:2.43
  before_script:
    - apk add --no-cache yq
    - git config --global user.email "[email protected]"
    - git config --global user.name "GitLab CI"
  script:
    - git clone https://oauth2:${MANIFEST_TOKEN}@${MANIFEST_REPO} manifests
    - cd manifests
    - yq -i '.spec.template.spec.containers[0].image = strenv(IMAGE_TAG)' apps/api/deployment.yaml
    - git add .
    - git commit -m "Deploy api ${CI_COMMIT_SHORT_SHA} [skip ci]"
    - git push origin main
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

deploy-production:
  stage: deploy
  script:
    - echo "Production deployment triggered via Fleet"
  environment:
    name: production
    url: https://api.example.com
  when: manual
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

Leveraging GitLab Container Registry

The pipeline uses GitLab’s built-in container registry, eliminating external dependencies and simplifying authentication. The $CI_REGISTRY_IMAGE variable automatically resolves to your project’s registry path, and the build job logs in with the predefined $CI_REGISTRY_USER and $CI_REGISTRY_PASSWORD variables. This tight integration means no additional credential management—there are no long-lived registry secrets to create or rotate.

Tagging images with $CI_COMMIT_SHORT_SHA creates an immutable reference—every commit produces a unique, traceable image. This practice simplifies rollbacks and audit trails since you can map any running container directly to its source commit. When investigating production issues, you can immediately identify which code is running and compare it against any previous version.

Consider implementing a retention policy for your container registry to prevent unbounded storage growth. GitLab allows you to configure cleanup policies that automatically remove images older than a specified threshold while preserving tagged releases.

Triggering Fleet Through Manifest Commits

The publish-manifests job is where push-based CI meets pull-based deployment. Rather than executing kubectl apply, the pipeline:

  1. Clones the dedicated manifest repository using a scoped access token
  2. Updates the image reference using yq for precise YAML manipulation
  3. Commits and pushes the change with a descriptive message

Fleet detects this commit within seconds and reconciles the cluster state. The [skip ci] suffix prevents infinite pipeline loops when the manifest repository has its own CI configuration. This separation of concerns—application code in one repository, deployment manifests in another—enables independent versioning and access control for each concern.

💡 Pro Tip: Store the MANIFEST_TOKEN as a protected, masked CI/CD variable. Use a project access token with minimal permissions—only write_repository scope on the manifest repository. Rotate this token periodically and audit its usage through GitLab’s token activity logs.

Implementing Approval Gates

The deploy-production job uses GitLab’s when: manual directive combined with protected environments. Protect the production environment under Settings > CI/CD > Protected environments and require approval from designated maintainers. This creates an auditable gate: the pipeline runs automatically through staging, but production deployments require explicit human authorization. GitLab records who approved each deployment and when, providing a complete audit trail for compliance requirements.

For multi-cluster scenarios, extend this pattern with environment-specific jobs that enforce promotion workflows:

.gitlab-ci.yml (continued)
deploy-staging:
  stage: deploy
  script:
    - echo "Staging deployment triggered via Fleet"
  environment:
    name: staging
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

deploy-production:
  stage: deploy
  script:
    - echo "Production deployment triggered via Fleet"
  environment:
    name: production
  needs: [deploy-staging]
  when: manual
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

The needs keyword ensures production deployments only become available after staging succeeds, enforcing your promotion workflow at the pipeline level. Combined with environment-specific approval requirements, this pattern scales to support complex deployment topologies—staging clusters for validation, regional production clusters for gradual rollouts, and disaster recovery clusters for resilience.

You can further enhance this workflow by integrating deployment freezes during critical business periods and scheduled deployment windows that align with your change management policies.

With manifests flowing through Git and Fleet synchronizing clusters automatically, one challenge remains: securely managing the secrets and sensitive configuration that your applications require.

Handling Secrets and Sensitive Configuration

GitOps principles demand that your Git repository serves as the single source of truth for infrastructure state. Secrets create an inherent tension with this principle—storing credentials in Git, even encrypted, introduces risks that compound over time. Repository access changes, encryption keys rotate, and commit history preserves secrets indefinitely. The solution is to separate secret management from manifest storage entirely.

Why Git-Based Secret Storage Fails

Encrypted secrets in Git create operational burden without eliminating risk. SOPS, sealed-secrets, and similar tools require key distribution across clusters, introduce decryption dependencies during deployment, and leave encrypted blobs in your commit history forever. When a key is compromised, you face the impossible task of rotating every secret that was ever encrypted with it.

A pull-based architecture naturally supports external secret injection. Since Fleet controllers run inside your clusters, they can authenticate directly with secret management systems without exposing credentials to your CI pipeline.

Integrating HashiCorp Vault with Fleet

Fleet’s Helm integration works cleanly with Vault’s Agent Injector pattern. With the injector running in your clusters, use fleet.yaml to pass the pod annotations that request secret injection for your workloads:

fleet.yaml
helm:
  releaseName: api-service
  values:
    vault:
      enabled: true
      role: api-service-role
      secretPath: secret/data/production/api-service
    podAnnotations:
      vault.hashicorp.com/agent-inject: "true"
      vault.hashicorp.com/role: "api-service-role"
      vault.hashicorp.com/agent-inject-secret-config: "secret/data/production/api-service"

The Vault Agent runs as an init container, authenticating via Kubernetes service account tokens and writing secrets to an in-memory volume before your application starts.

External Secrets Operator as an Alternative

For teams already invested in cloud provider secret managers, the External Secrets Operator provides a Kubernetes-native abstraction. Define an ExternalSecret resource that Fleet deploys alongside your application:

external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: api-credentials
  data:
    - secretKey: database-url
      remoteRef:
        key: production/api-service
        property: DATABASE_URL

The operator continuously reconciles external secrets with Kubernetes Secret objects, handling rotation automatically.
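
The ClusterSecretStore referenced above must already exist on each cluster (Fleet can deploy it like any other manifest). A minimal sketch for AWS Secrets Manager, assuming IRSA; the service account name and namespace are placeholders:

cluster-secret-store.yaml (illustrative)
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          # IAM access is assumed via the operator's IRSA-annotated service account
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets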

Build-Time vs. Runtime Secrets

Reserve GitLab CI/CD variables for build-time secrets only—container registry credentials, artifact signing keys, and security scanning tokens. These never touch your Kubernetes clusters. Runtime secrets flow exclusively through Vault or External Secrets, maintaining clear separation between pipeline execution and workload configuration.

💡 Pro Tip: Configure Vault’s Kubernetes auth method with bound service account names matching your workload identities. This eliminates static credentials entirely—pods authenticate using their existing service account tokens.
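
A sketch of that binding on the Vault side; the role, policy, service account, and namespace names are placeholders matching the earlier Vault example:

Terminal window
# Bind the workload's Kubernetes service account to a Vault role and policy
vault write auth/kubernetes/role/api-service-role \
  bound_service_account_names=api-service \
  bound_service_account_namespaces=production \
  policies=api-service-read \
  ttl=1h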

With secrets flowing securely into your clusters, you need visibility into the entire pipeline. Effective monitoring and troubleshooting practices ensure you catch configuration drift before it impacts production.

Monitoring and Troubleshooting the Pipeline

A GitOps pipeline is only as reliable as your ability to observe it. With deployments spanning multiple clusters through Fleet and GitLab Agents, you need visibility into every layer of the stack.

Fleet Dashboard: Your Deployment Command Center

Rancher’s Fleet dashboard provides a consolidated view of GitRepo synchronization status across all managed clusters. Navigate to Continuous Delivery → GitRepos to see sync state, last update timestamps, and any resources that failed to apply. Each GitRepo entry shows which clusters received the deployment and whether they’re in sync with HEAD.

Pay attention to the Bundle status within each GitRepo. A GitRepo can show as “Active” while individual bundles targeting specific clusters report errors. Drill into bundle details to see per-resource apply status and error messages from the downstream clusters.

GitLab Agent Health Monitoring

The GitLab Agent exposes Prometheus metrics on port 8080 by default. Key metrics to track include gitops_sync_duration_seconds for sync performance, grpc_client_started_total for connection activity, and gitops_resource_apply_errors_total for deployment failures.

In the GitLab UI, navigate to Operate > Kubernetes clusters to view agent connection status. A disconnected agent means GitLab loses visibility into that cluster and CI jobs that rely on the agent’s context start failing; Fleet keeps reconciling from Git on its own, but you are effectively blind from the GitLab side until the connection restores.

Common Failure Modes

Manifest validation failures occur when Fleet successfully pulls from the repository but Kubernetes rejects the resource. Check Fleet’s GitRepo status for apply errors, then validate manifests locally with kubectl apply --dry-run=server.

Agent connectivity issues typically stem from network policies blocking egress to GitLab or expired agent tokens. Verify the kas endpoint is reachable from the cluster and regenerate tokens if authentication fails.

Drift between clusters happens when manual changes bypass GitOps. Fleet detects this during reconciliation—resources modified outside the pipeline get reverted. Enable drift detection alerts in Fleet to catch these incidents.

💡 Pro Tip: Configure Prometheus alerts for gitops_resource_apply_errors_total > 0 sustained over 5 minutes. This catches failed deployments before they become incidents.
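
A sketch of that rule (metric name as exposed by the agent; tune labels and routing to your alerting setup):

gitops-alerts.yaml (illustrative Prometheus rule)
groups:
  - name: gitops-pipeline
    rules:
      - alert: GitOpsApplyErrors
        expr: gitops_resource_apply_errors_total > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "GitLab Agent is reporting Kubernetes resource apply errors"
          description: "Apply errors have been non-zero for 5 minutes; check agent logs and Fleet bundle status."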

With observability in place, your pull-based pipeline becomes a self-documenting system where every deployment is traceable from commit to cluster state.

Key Takeaways

  • Replace kubectl commands in CI pipelines with manifest commits that trigger Fleet sync—eliminating credential sprawl and inbound network dependencies
  • Use the GitLab Agent’s outbound-only connection model to reach clusters behind firewalls without exposing Kubernetes APIs
  • Define cluster groups and targetCustomizations in Fleet so environment differences live declaratively in Git rather than in pipeline variables
  • Store deployment manifests in a separate repository from application code to enable independent release cycles and clear audit trails