What is Cluster API Add-on Provider for Fleet (CAAPF)?
Cluster API Add-on Provider for Fleet (CAAPF) is a Cluster API (CAPI) provider that integrates with Fleet to enable easy deployment of applications to CAPI-provisioned clusters.
It provides the following functionality:
- The addon provider automatically installs Fleet in your management cluster.
- The provider registers a newly provisioned CAPI cluster with Fleet so that applications can be automatically deployed to the created cluster via GitOps, Bundle, or HelmOp.
- The provider automatically creates a Fleet ClusterGroup for every CAPI ClusterClass. This enables you to deploy the same applications to all clusters created from the same ClusterClass.
- CAPI Cluster and ControlPlane resources are automatically added to the Fleet Cluster resource templates, allowing per-cluster configuration templating for Helm-based installations.
Installation
Clusterctl
To install the provider with clusterctl:
- Install clusterctl.
- Run clusterctl init --addon rancher-fleet
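Optionally, you can verify that the provider is running before continuing. The namespace below assumes the default used by the provider manifests and may differ in your setup:
kubectl get pods -n caapf-system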
Cluster API Operator
You can install a production instance of CAAPF in your cluster with the CAPI Operator.
We need to install cert-manager as a prerequisite for the CAPI Operator, if it is not already installed:
kubectl apply -f https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.yaml
To install the CAPI Operator, the Docker infrastructure provider, and the Fleet addon together:
helm repo add capi-operator https://kubernetes-sigs.github.io/cluster-api-operator
helm repo update
helm upgrade --install capi-operator capi-operator/cluster-api-operator \
--create-namespace -n capi-operator-system \
--set infrastructure.docker.enabled=true --set addon.rancher-fleet.enabled=true
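Once the chart is installed, you can optionally confirm that the operator reconciled the requested providers; the resource names below assume a standard CAPI Operator installation:
kubectl get coreproviders,infrastructureproviders,addonproviders -A
kubectl get pods -A | grep -E 'capi|fleet'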
Demo
Calico CNI installation demo
Motivation
Currently, in the CAPI ecosystem, several solutions exist for deploying applications as add-ons on clusters provisioned by CAPI. However, this idea and its alternatives have not been actively explored upstream, particularly in the GitOps space. The need to address this gap was raised in the Cluster API Addon Orchestration proposal.
One of the projects involved in deploying Helm charts on CAPI-provisioned clusters is the CAPI Addon Provider Helm (CAAPH). This solution enables users to automatically install Helm charts on provisioned clusters using the HelmChartProxy resource.
Fleet also supports deploying Helm charts via the (experimental) HelmOp resource, which offers similar capabilities to HelmChartProxy. However, Fleet primarily focuses on providing GitOps capabilities for managing CAPI clusters and application states within these clusters.
Out of the box, Fleet allows users to deploy and maintain the state of arbitrary templates on child clusters using the Fleet Bundle resource. This approach addresses the need for alternatives to ClusterResourceSet while offering full application lifecycle management.
CAAPF is designed to streamline and enhance native Fleet integration with CAPI. It functions as a separate Addon provider that can be installed via clusterctl or the CAPI Operator.
User Stories
User Story 1
As an infrastructure provider, I want to deploy my provisioning application to every provisioned child cluster so that I can provide immediate functionality during and after cluster bootstrap.
User Story 2
As a DevOps engineer, I want to use GitOps practices to deploy CAPI clusters and applications centrally so that I can manage all cluster configurations and deployed applications from a single location.
User Story 3
As a user, I want to deploy applications into my CAPI clusters and configure those applications based on the cluster infrastructure templates so that they are correctly provisioned for the cluster environment.
User Story 4
As a cluster operator, I want to streamline the provisioning of Cluster API child clusters so that they can be successfully provisioned and become Ready from a template without manual intervention.
User Story 5
As a cluster operator, I want to facilitate the provisioning of Cluster API child clusters located behind NAT so that they can be successfully provisioned and establish connectivity with the management cluster.
Getting Started
This section contains guides on how to get started with CAAPF and Fleet
Installation
Clusterctl
To install the provider with clusterctl:
- Install clusterctl.
- Run clusterctl init --addon rancher-fleet
Cluster API Operator
You can install a production instance of CAAPF in your cluster with the CAPI Operator.
We need to install cert-manager as a prerequisite for the CAPI Operator, if it is not already installed:
kubectl apply -f https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.yaml
To install the CAPI Operator, the Docker infrastructure provider, and the Fleet addon together:
helm repo add capi-operator https://kubernetes-sigs.github.io/cluster-api-operator
helm repo update
helm upgrade --install capi-operator capi-operator/cluster-api-operator \
--create-namespace -n capi-operator-system \
--set infrastructure.docker.enabled=true --set addon.rancher-fleet.enabled=true
Configuration
Installing Fleet
By default, CAAPF expects your cluster to have the Fleet helm chart pre-installed and configured, but it can manage the Fleet installation via the FleetAddonConfig resource named fleet-addon-config. To install the Fleet helm chart with the latest stable Fleet version:
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    server:
      inferLocal: true # Uses the default `kubernetes` endpoint and secret for the APIServerURL configuration
  install:
    followLatest: true
Alternatively, a specific version can be provided in the spec.install.version field:
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    server:
      inferLocal: true # Uses the default `kubernetes` endpoint and secret for the APIServerURL configuration
  install:
    version: v0.12.0 # Pins the Fleet helm chart to a specific version
Fleet Public URL and Certificate setup
The Fleet agent requires direct access to the Fleet server instance running in the management cluster. When provisioning the Fleet agent on a downstream cluster using the default manager-initiated registration, the public API server URL and certificates are taken from the current Fleet server configuration.
If a user is installing Fleet via the FleetAddonConfig resource, there are fields that allow configuring these settings.
The config.server field allows specifying settings for the Fleet server configuration, such as apiServerURL and certificates.
Setting inferLocal: true uses the default kubernetes endpoint and CA secret to configure the Fleet instance.
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    server:
      inferLocal: true # Uses the default `kubernetes` endpoint and secret for the APIServerURL configuration
  install:
    followLatest: true
This scenario works well in a test setup using the CAPI Docker provider and Docker clusters.
Here is an example of a manual API server URL configuration with a reference to a certificate ConfigMap or Secret containing a ca.crt data key for the Fleet helm chart:
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    server:
      apiServerUrl: "https://public-url.io"
      apiServerCaConfigRef:
        apiVersion: v1
        kind: ConfigMap
        name: kube-root-ca.crt
        namespace: default
  install:
    followLatest: true # Installs the current latest version of Fleet from https://github.com/rancher/fleet-helm-charts
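To confirm which values were applied to the Fleet installation, you can inspect the Fleet controller configuration; this check assumes Fleet's default cattle-fleet-system namespace:
kubectl get configmap fleet-controller -n cattle-fleet-system -o jsonpath='{.data.config}'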
Cluster Import Strategy
Fleet Feature Flags
Fleet includes experimental features that can be enabled or disabled using feature gates in the FleetAddonConfig resource. These flags are configured under .spec.config.featureGates.
To enable experimental features such as OCI storage support and HelmOp support, update the FleetAddonConfig as follows:
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    featureGates:
      experimentalOciStorage: true # Enables experimental OCI storage support
      experimentalHelmOps: true # Enables experimental Helm operations support
By default, if the featureGates field is not present, these feature gates are enabled. To disable them, they need to be explicitly set to false.
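For example, to explicitly disable both experimental features:
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    featureGates:
      experimentalOciStorage: false # Disables experimental OCI storage support
      experimentalHelmOps: false # Disables experimental Helm operations support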
Optionally, the featureGates flags can be synced to a ConfigMap object.
This is useful when Fleet is installed and managed by Rancher.
When a ConfigMap reference is defined, the controller will just sync the featureGates to it, without making any changes to the Fleet helm chart.
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: FleetAddonConfig
metadata:
  name: fleet-addon-config
spec:
  config:
    featureGates:
      experimentalOciStorage: true # Enables experimental OCI storage support
      experimentalHelmOps: true # Enables experimental Helm operations support
      configMap:
        ref:
          apiVersion: v1
          kind: ConfigMap
          name: rancher-config
          namespace: cattle-system
Tutorials
This section contains tutorials, such as quick-start, installation, application deployments and operator guides.
Prerequisites
Requirements
- helm
- CAPI management cluster.
  - Features EXP_CLUSTER_RESOURCE_SET and CLUSTER_TOPOLOGY must be enabled.
- clusterctl.
Create your local cluster
NOTE: if you prefer to opt for a one-command installation, you can refer to the notes on how to use just and the project's justfile here.
- Start by adding the helm repositories that are required to proceed with the installation.
helm repo add fleet https://rancher.github.io/fleet-helm-charts/
helm repo update
- Create the local cluster
kind create cluster --config testdata/kind-config.yaml
- Install fleet and specify the API_SERVER_URL and CA.
# We start by retrieving the CA data from the cluster
kubectl config view -o json --raw | jq -r '.clusters[] | select(.name=="kind-dev").cluster["certificate-authority-data"]' | base64 -d > _out/ca.pem
# Set the API server URL
API_SERVER_URL=`kubectl config view -o json --raw | jq -r '.clusters[] | select(.name=="kind-dev").cluster["server"]'`
# And proceed with the installation via helm
helm -n cattle-fleet-system install --version v0.12.0 --create-namespace --wait fleet-crd fleet/fleet-crd
helm install --create-namespace --version v0.12.0 -n cattle-fleet-system --set apiServerURL=$API_SERVER_URL --set-file apiServerCA=_out/ca.pem fleet fleet/fleet --wait
- Install CAPI with the required experimental features enabled and initialize the Docker provider for testing.
EXP_CLUSTER_RESOURCE_SET=true CLUSTER_TOPOLOGY=true clusterctl init -i docker --addon rancher-fleet
Wait for all pods to become ready and your cluster should be ready to use CAAPF!
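One way to wait for everything to settle is to watch all deployments become available; the timeout value here is only an example:
kubectl wait --for=condition=Available deployment --all --all-namespaces --timeout=10m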
Create your downstream cluster
In order to initiate CAAPF autoimport, a CAPI Cluster needs to be created.
To create one, we can either follow the quickstart documentation or create a cluster from an existing template.
kubectl apply -f testdata/capi-quickstart.yaml
For more advanced cluster import strategy, check the configuration section.
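To confirm that the cluster was auto-imported, check for the corresponding Fleet Cluster (and, if a ClusterClass is used, ClusterGroup) resources; the output depends on the template you applied:
kubectl get clusters.fleet.cattle.io -A
kubectl get clustergroups.fleet.cattle.io -A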
Remember that you can follow along with the video demo to install the provider and get started quickly.
Installing Kindnet CNI using resource Bundle
This section describes the steps to install the kindnet CNI solution on a CAPI cluster using the Fleet Bundle resource.
Deploying Kindnet
We will use a Fleet Bundle resource to deploy kindnet on the Docker cluster.
> kubectl get clusters
NAME CLUSTERCLASS PHASE AGE VERSION
docker-demo quick-start Provisioned 35h v1.29.2
First, let's review our targets for the kindnet bundle. They should match labels on the cluster, or the name of the cluster, as in this instance:
targets:
- clusterName: docker-demo
We will apply the following resource from testdata/cni.yaml:
kind: Bundle
apiVersion: fleet.cattle.io/v1alpha1
metadata:
  name: kindnet-cni
spec:
  resources:
    # List of all resources that will be deployed
    - content: |-
        # kindnetd networking manifest
        ---
        kind: ClusterRole
        apiVersion: rbac.authorization.k8s.io/v1
        metadata:
          name: kindnet
        rules:
          - apiGroups:
              - ""
            resources:
              - nodes
            verbs:
              - list
              - watch
              - patch
          - apiGroups:
              - ""
            resources:
              - configmaps
            verbs:
              - get
        ---
        kind: ClusterRoleBinding
        apiVersion: rbac.authorization.k8s.io/v1
        metadata:
          name: kindnet
        roleRef:
          apiGroup: rbac.authorization.k8s.io
          kind: ClusterRole
          name: kindnet
        subjects:
          - kind: ServiceAccount
            name: kindnet
            namespace: kube-system
        ---
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: kindnet
          namespace: kube-system
        ---
        apiVersion: apps/v1
        kind: DaemonSet
        metadata:
          name: kindnet
          namespace: kube-system
          labels:
            tier: node
            app: kindnet
            k8s-app: kindnet
        spec:
          selector:
            matchLabels:
              app: kindnet
          template:
            metadata:
              labels:
                tier: node
                app: kindnet
                k8s-app: kindnet
            spec:
              hostNetwork: true
              tolerations:
                - operator: Exists
                  effect: NoSchedule
              serviceAccountName: kindnet
              containers:
                - name: kindnet-cni
                  image: kindest/kindnetd:v20230511-dc714da8
                  env:
                    - name: HOST_IP
                      valueFrom:
                        fieldRef:
                          fieldPath: status.hostIP
                    - name: POD_IP
                      valueFrom:
                        fieldRef:
                          fieldPath: status.podIP
                    - name: POD_SUBNET
                      value: '10.1.0.0/16'
                  volumeMounts:
                    - name: cni-cfg
                      mountPath: /etc/cni/net.d
                    - name: xtables-lock
                      mountPath: /run/xtables.lock
                      readOnly: false
                    - name: lib-modules
                      mountPath: /lib/modules
                      readOnly: true
                  resources:
                    requests:
                      cpu: "100m"
                      memory: "50Mi"
                    limits:
                      cpu: "100m"
                      memory: "50Mi"
                  securityContext:
                    privileged: false
                    capabilities:
                      add: ["NET_RAW", "NET_ADMIN"]
              volumes:
                - name: cni-bin
                  hostPath:
                    path: /opt/cni/bin
                    type: DirectoryOrCreate
                - name: cni-cfg
                  hostPath:
                    path: /etc/cni/net.d
                    type: DirectoryOrCreate
                - name: xtables-lock
                  hostPath:
                    path: /run/xtables.lock
                    type: FileOrCreate
                - name: lib-modules
                  hostPath:
                    path: /lib/modules
      name: kindnet.yaml
  targets:
    - clusterName: docker-demo
> kubectl apply -f testdata/cni.yaml
bundle.fleet.cattle.io/kindnet-cni configured
After some time we should see the resource in a ready state:
> kubectl get bundles kindnet-cni
NAME BUNDLEDEPLOYMENTS-READY STATUS
kindnet-cni 1/1
This should result in kindnet running on the matching cluster:
> kubectl get pods --context docker-demo -A | grep kindnet
kube-system kindnet-dqzwh 1/1 Running 0 2m11s
kube-system kindnet-jbkjq 1/1 Running 0 2m11s
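If the bundle does not become ready, the underlying BundleDeployment in the Fleet cluster namespace is a good place to look; the grep below is just one way to narrow the output:
kubectl get bundledeployments -A | grep kindnet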
Demo
Installing Calico CNI using HelmOp
Note: For this setup to work, you need to install the Fleet and Fleet CRDs charts via the
FleetAddonConfig resource. Both need to have version >= v0.12.0,
which provides support for the HelmOp resource.
In this tutorial we will deploy the Calico CNI using the HelmOp resource and the Fleet cluster substitution mechanism.
Deploying Calico CNI
Here's an example of how a HelmOp resource can be used in combination with templateValues to deploy an application consistently on any matching cluster.
In this scenario we are matching the cluster directly by name using a clusterName reference, but a clusterGroup or a label-based selection can be used instead of, or together with, clusterName:
targets:
- clusterName: docker-demo
We are deploying the HelmOp resource in the default namespace. It should be the same namespace as the CAPI Cluster for Fleet to locate it.
apiVersion: fleet.cattle.io/v1alpha1
kind: HelmOp
metadata:
  name: calico
spec:
  helm:
    releaseName: projectcalico
    repo: https://docs.tigera.io/calico/charts
    chart: tigera-operator
    templateValues:
      installation: |-
        cni:
          type: Calico
          ipam:
            type: HostLocal
        calicoNetwork:
          bgp: Disabled
          mtu: 1350
          ipPools:
          ${- range $cidr := .ClusterValues.Cluster.spec.clusterNetwork.pods.cidrBlocks }
          - cidr: "${ $cidr }"
            encapsulation: None
            natOutgoing: Enabled
            nodeSelector: all()${- end}
  insecureSkipTLSVerify: true
  targets:
    - clusterName: docker-demo
    - clusterGroup: quick-start.clusterclass
HelmOp supports Fleet templating options that are otherwise available exclusively in the fleet.yaml configuration, which is stored in the git repository contents and applied via the GitRepo resource.
In this example we are using values from the Cluster.spec.clusterNetwork.pods.cidrBlocks list to define ipPools for the calicoNetwork. These chart settings will be unique to each matching cluster and based on the observed cluster state at any given moment.
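As an illustration, for a matching cluster whose spec.clusterNetwork.pods.cidrBlocks contains a single entry 192.168.0.0/16, the installation template above would render roughly as:
cni:
  type: Calico
  ipam:
    type: HostLocal
calicoNetwork:
  bgp: Disabled
  mtu: 1350
  ipPools:
  - cidr: "192.168.0.0/16"
    encapsulation: None
    natOutgoing: Enabled
    nodeSelector: all()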
After applying the resource, we can observe the app rollout:
> kubectl apply -f testdata/helm.yaml
helmop.fleet.cattle.io/calico created
> kubectl get helmop
NAME REPO CHART VERSION BUNDLEDEPLOYMENTS-READY STATUS
calico https://docs.tigera.io/calico/charts tigera-operator v3.29.2 0/1 NotReady(1) [Bundle calico]; apiserver.operator.tigera.io default [progressing]
# After some time
> kubectl get helmop
NAME REPO CHART VERSION BUNDLEDEPLOYMENTS-READY STATUS
calico https://docs.tigera.io/calico/charts tigera-operator v3.29.2 1/1
> kubectl get pods -n calico-system --context capi-quickstart
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-9cd68cb75-p46pz 1/1 Running 0 53s
calico-node-bx5b6 1/1 Running 0 53s
calico-node-hftwd 1/1 Running 0 53s
calico-typha-6d9fb6bcb4-qz6kt 1/1 Running 0 53s
csi-node-driver-88jqc 2/2 Running 0 53s
csi-node-driver-mjwxc 2/2 Running 0 53s
Demo
You can follow along with the demo to verify that your deployment matches the expected result:
Installing Calico CNI using GitRepo
Note: For this setup to work, you need to have the Fleet and Fleet CRDs charts installed
with version >= v0.12.0.
In this tutorial we will deploy the Calico CNI using a GitRepo resource on an RKE2-based Docker cluster.
Deploying RKE2 docker cluster
We will first need to create an RKE2-based Docker cluster from templates:
> kubectl apply -f testdata/cluster_docker_rke2.yaml
dockercluster.infrastructure.cluster.x-k8s.io/docker-demo created
cluster.cluster.x-k8s.io/docker-demo created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/docker-demo-control-plane created
rke2controlplane.controlplane.cluster.x-k8s.io/docker-demo-control-plane created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/docker-demo-md-0 created
rke2configtemplate.bootstrap.cluster.x-k8s.io/docker-demo-md-0 created
machinedeployment.cluster.x-k8s.io/docker-demo-md-0 created
configmap/docker-demo-lb-config created
In this scenario the cluster is located in the default namespace, where the rest of the Fleet objects will go.
The cluster is labeled with cni: calico in order for the GitRepo to match on it.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: docker-demo
  labels:
    cni: calico
Now that the cluster is created, the GitRepo can be applied; it will be evaluated asynchronously.
Deploying Calico CNI via GitRepo
We will first review the content of our fleet.yaml file:
helm:
  releaseName: projectcalico
  repo: https://docs.tigera.io/calico/charts
  chart: tigera-operator
  templateValues:
    installation: |-
      cni:
        type: Calico
        ipam:
          type: HostLocal
      calicoNetwork:
        bgp: Disabled
        mtu: 1350
        ipPools:
        ${- range $cidr := .ClusterValues.Cluster.spec.clusterNetwork.pods.cidrBlocks }
        - cidr: "${ $cidr }"
          encapsulation: None
          natOutgoing: Enabled
          nodeSelector: all()${- end}
diff:
  comparePatches:
    - apiVersion: operator.tigera.io/v1
      kind: Installation
      name: default
      operations:
        - {"op": "remove", "path": "/spec/kubernetesProvider"}
In this scenario we are using a helm definition that is consistent with the HelmOp spec from the previous guide and defines the same templating rules.
We also need to resolve conflicts that happen due to in-place modification of some resources by the Calico controllers. For that, the diff section is used, where we remove the conflicting fields from comparison.
Once everything is ready, we need to apply our GitRepo in the default namespace. In our case, we will match clusters labeled with the cni: calico label:
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: calico
spec:
  branch: main
  paths:
    - /fleet/applications/calico
  repo: https://github.com/rancher/cluster-api-addon-provider-fleet.git
  targets:
    - clusterSelector:
        matchLabels:
          cni: calico
> kubectl apply -f testdata/gitrepo-calico.yaml
gitrepo.fleet.cattle.io/calico created
# After some time
> kubectl get gitrepo
NAME REPO COMMIT BUNDLEDEPLOYMENTS-READY STATUS
calico https://github.com/rancher/cluster-api-addon-provider-fleet.git 62b4fe6944687e02afb331b9e1839e33c539f0c7 1/1
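You can also list the Bundles that Fleet generated from the GitRepo contents; the exact bundle name is derived from the GitRepo name and path, so it may differ:
kubectl get bundles | grep calico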
Now our cluster has Calico installed, and all nodes are marked as Ready:
# exec into one of the CP node containers
> docker exec -it fef3427009f6 /bin/bash
root@docker-demo-control-plane-krtnt:/#
root@docker-demo-control-plane-krtnt:/# kubectl get pods -n calico-system --kubeconfig /var/lib/rancher/rke2/server/cred/api-server.kubeconfig
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-55cbcc7467-j5bbd 1/1 Running 0 3m30s
calico-node-mbrqg 1/1 Running 0 3m30s
calico-node-wlbwn 1/1 Running 0 3m30s
calico-typha-f48c7ddf7-kbq6d 1/1 Running 0 3m30s
csi-node-driver-87tlx 2/2 Running 0 3m30s
csi-node-driver-99pqw 2/2 Running 0 3m30s
Demo
You can follow along with the demo to verify that your deployment matches the expected result:
Reference
This section contains reference guides and information about the main CAAPF features and how to use them.
Import Strategy
CAAPF follows a simple import strategy for CAPI clusters:
- Each CAPI cluster has a corresponding Fleet Cluster object.
- Each CAPI ClusterClass has a corresponding Fleet ClusterGroup object.
- When a CAPI Cluster references a ClusterClass in a different namespace, a ClusterGroup is created in the Cluster namespace. This ClusterGroup targets all clusters in this namespace that reference the same ClusterClass. See the configuration section for details.
- If at least one CAPI Cluster references a ClusterClass in a different namespace, a BundleNamespaceMapping is created in the ClusterClass namespace. This allows Fleet Cluster resources to use application sources such as Bundles, HelmOps, or GitRepos from the ClusterClass namespace as if they were deployed in the Cluster namespace. See the configuration section for details.
By default, CAAPF imports all CAPI clusters under Fleet management. See the configuration section for details.
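A quick way to see the result of this strategy on a management cluster is to list the Fleet resources that CAAPF maintains; the output depends on your clusters and classes:
kubectl get clusters.fleet.cattle.io,clustergroups.fleet.cattle.io -A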
Label Synchronization
Fleet relies on Cluster labels, Cluster names, and ClusterGroups for target matching when deploying applications or referenced repository content. To ensure consistency, CAAPF synchronizes resource labels:
- From the CAPI ClusterClass to the imported Fleet Cluster resource.
- From the CAPI ClusterClass to the imported Fleet ClusterGroup resource.
When a CAPI Cluster references a ClusterClass, CAAPF applies two specific labels to both the Cluster and ClusterGroup resources:
- clusterclass-name.fleet.addons.cluster.x-k8s.io: <class-name>
- clusterclass-namespace.fleet.addons.cluster.x-k8s.io: <class-ns>
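As an illustration, a Fleet Cluster imported for a CAPI Cluster that references the quick-start ClusterClass in the default namespace (as in the tutorials above) would carry labels similar to:
apiVersion: fleet.cattle.io/v1alpha1
kind: Cluster
metadata:
  name: docker-demo
  namespace: default
  labels:
    clusterclass-name.fleet.addons.cluster.x-k8s.io: quick-start
    clusterclass-namespace.fleet.addons.cluster.x-k8s.io: default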
Templating strategy
The Cluster API Addon Provider Fleet automates application templating for imported CAPI clusters based on matching cluster state.
Functionality
The Addon Provider Fleet ensures that the state of a CAPI cluster and its resources is always up-to-date in the spec.templateValues.ClusterValues field of the Fleet cluster resource. This allows users to:
- Reference specific parts of the CAPI cluster directly, or via Helm substitution patterns referencing .ClusterValues.Cluster data.
- Substitute based on the state of the control plane resource via the .ClusterValues.ControlPlane field.
- Substitute based on the state of the infrastructure cluster resource via the .ClusterValues.InfrastructureCluster field.
- Maintain a consistent application state across different clusters.
- Use the same template for multiple matching clusters to simplify deployment and management.
Example - templating within HelmOp
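A minimal sketch, reusing the Calico HelmOp from the tutorials above: the template substitutes the pod CIDRs of the matching CAPI cluster via .ClusterValues.Cluster, so each target cluster gets its own ipPools:
apiVersion: fleet.cattle.io/v1alpha1
kind: HelmOp
metadata:
  name: calico
spec:
  helm:
    releaseName: projectcalico
    repo: https://docs.tigera.io/calico/charts
    chart: tigera-operator
    templateValues:
      installation: |-
        calicoNetwork:
          ipPools:
          ${- range $cidr := .ClusterValues.Cluster.spec.clusterNetwork.pods.cidrBlocks }
          - cidr: "${ $cidr }"
          ${- end}
  targets:
    - clusterName: docker-demo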
FleetAddonConfig Reference
The FleetAddonConfig Custom Resource Definition (CRD) is used to configure the behavior of the Cluster API Addon Provider for Fleet.
Spec
The spec field of the FleetAddonConfig CRD contains the configuration options.
It is a required field and provides a config for fleet addon functionality.
- config
  - Description: An object that holds various configuration settings.
  - Type: object
  - Optional: Yes

  - config.bootstrapLocalCluster
    - Description: Enable auto-installation of a fleet agent in the local cluster.
    - Type: boolean
    - Optional: Yes

    When set to true, the provider will automatically install a Fleet agent in the cluster where the provider is running. This is useful for bootstrapping a local development or management cluster to be managed by Fleet.

    Example:

    spec:
      config:
        bootstrapLocalCluster: true

  - config.featureGates
    - Description: Feature gates controlling experimental features.
    - Type: object
    - Optional: Yes

    This section allows enabling or disabling experimental features within the provider.

    - config.featureGates.configMap
      - Description: References a ConfigMap where to apply the feature flags. If a ConfigMap is referenced, the controller will update it instead of upgrading the Fleet chart.
      - Type: object (ObjectReference)
      - Optional: Yes

      Example:

      spec:
        config:
          featureGates:
            configMap:
              ref:
                apiVersion: v1
                kind: ConfigMap
                name: fleet-feature-flags
                namespace: fleet-system

    - config.featureGates.experimentalHelmOps
      - Description: Enables experimental Helm operations support.
      - Type: boolean
      - Optional: No (Required within featureGates)

      Example:

      spec:
        config:
          featureGates:
            experimentalHelmOps: true

    - config.featureGates.experimentalOciStorage
      - Description: Enables experimental OCI storage support.
      - Type: boolean
      - Optional: No (Required within featureGates)

      Example:

      spec:
        config:
          featureGates:
            experimentalOciStorage: true
  - config.server
    - Description: Fleet server URL configuration options.
    - Type: object (oneOf inferLocal or custom)
    - Optional: Yes

    This section configures how the provider connects to the Fleet server. You must specify either inferLocal or custom.

    - config.server.inferLocal
      - Description: Infer the local cluster's API server URL as the Fleet server URL.
      - Type: boolean
      - Optional: No (Required if custom is not set)

      Example:

      spec:
        config:
          server:
            inferLocal: true

    - config.server.custom
      - Description: Custom configuration for the Fleet server URL.
      - Type: object
      - Optional: No (Required if inferLocal is not set)

      - config.server.custom.apiServerCaConfigRef
        - Description: Reference to a ConfigMap containing the CA certificate for the API server.
        - Type: object (ObjectReference)
        - Optional: Yes

        Example:

        spec:
          config:
            server:
              custom:
                apiServerCaConfigRef:
                  apiVersion: v1
                  kind: ConfigMap
                  name: fleet-server-ca
                  namespace: fleet-system

      - config.server.custom.apiServerUrl
        - Description: The custom URL for the Fleet API server.
        - Type: string
        - Optional: Yes

        Example:

        spec:
          config:
            server:
              custom:
                apiServerUrl: https://fleet.example.com
- cluster
  - Description: Enable Cluster config functionality. This will create a Fleet Cluster for each Cluster with the same name. In case the cluster specifies topology.class, the name of the ClusterClass will be added to the Fleet Cluster labels.
  - Type: object
  - Optional: Yes

  This section configures the behavior for creating Fleet Clusters from Cluster API Clusters.

  - cluster.agentEnvVars
    - Description: Extra environment variables to be added to the agent deployment.
    - Type: array of object (EnvVar)
    - Optional: Yes

    Example:

    spec:
      cluster:
        agentEnvVars:
          - name: HTTP_PROXY
            value: http://proxy.example.com:8080
          - name: NO_PROXY
            value: localhost,127.0.0.1,.svc

  - cluster.agentNamespace
    - Description: Namespace selection for the fleet agent.
    - Type: string
    - Optional: Yes

    Example:

    spec:
      cluster:
        agentNamespace: fleet-agents

  - cluster.agentTolerations
    - Description: Agent taint toleration settings for every cluster.
    - Type: array of object (Toleration)
    - Optional: Yes

    Example:

    spec:
      cluster:
        agentTolerations:
          - key: "node.kubernetes.io/unreachable"
            operator: "Exists"
            effect: "NoExecute"
            tolerationSeconds: 600
          - key: "node.kubernetes.io/not-ready"
            operator: "Exists"
            effect: "NoExecute"
            tolerationSeconds: 600

  - cluster.applyClassGroup
    - Description: Apply a ClusterGroup for a ClusterClass referenced from a different namespace.
    - Type: boolean
    - Optional: Yes

    When a CAPI Cluster references a ClusterClass in a different namespace, a corresponding ClusterGroup is created in the Cluster namespace. This ensures that all clusters within the namespace that share the same ClusterClass from another namespace are grouped together.

    This ClusterGroup inherits ClusterClass labels and applies two CAAPF-specific labels to uniquely identify the group within the cluster scope:
    - clusterclass-name.fleet.addons.cluster.x-k8s.io: <class-name>
    - clusterclass-namespace.fleet.addons.cluster.x-k8s.io: <class-ns>

    Additionally, this configuration enables the creation of a BundleNamespaceMapping. This mapping selects all available bundles and establishes a link between the namespace of the Cluster and the namespace of the referenced ClusterClass. This allows the Fleet Cluster to be evaluated as a target for application sources such as Bundles, HelmOps, or GitRepos from the ClusterClass namespace.

    When all CAPI Cluster resources referencing the same ClusterClass are removed, both the ClusterGroup and the BundleNamespaceMapping are cleaned up.

    Note: If the cluster field is not set, this setting is enabled by default.

    Example:

    spec:
      cluster:
        applyClassGroup: true

  - cluster.hostNetwork
    - Description: Allows deploying the agent configuration with the hostNetwork: true setting, which removes the dependency on the CNI configuration of the cluster.
    - Type: boolean
    - Optional: Yes

    Example:

    spec:
      cluster:
        hostNetwork: true

  - cluster.namespaceSelector
    - Description: Namespace label selector. If set, only clusters in namespaces matching the label selector will be imported. This configuration defines how to select namespaces based on specific labels. The namespaceSelector field ensures that the import strategy applies only to namespaces that have the label import: "true". This is useful for scoping automatic import to specific namespaces rather than applying it cluster-wide.
    - Type: object (LabelSelector)
    - Optional: No (Required within cluster)

    Example:

    apiVersion: addons.cluster.x-k8s.io/v1alpha1
    kind: FleetAddonConfig
    metadata:
      name: fleet-addon-config
    spec:
      cluster:
        namespaceSelector:
          matchLabels:
            import: "true"

  - cluster.naming
    - Description: Naming settings for the fleet cluster.
    - Type: object
    - Optional: Yes

    This section allows customizing the name of the created Fleet Cluster resource.

    - cluster.naming.prefix
      - Description: Specify a prefix for the Cluster name, applied to the created Fleet cluster.
      - Type: string
      - Optional: Yes

      Example:

      spec:
        cluster:
          naming:
            prefix: capi-

    - cluster.naming.suffix
      - Description: Specify a suffix for the Cluster name, applied to the created Fleet cluster.
      - Type: string
      - Optional: Yes

      Example:

      spec:
        cluster:
          naming:
            suffix: -fleet

  - cluster.patchResource
    - Description: Allow patching resources, maintaining the desired state. If not set, resources will only be re-created in case of removal.
    - Type: boolean
    - Optional: Yes

    Example:

    spec:
      cluster:
        patchResource: true

  - cluster.selector
    - Description: Cluster label selector. If set, only clusters matching the label selector will be imported. This configuration filters clusters based on labels, ensuring that the FleetAddonConfig applies only to clusters with the label import: "true". This allows more granular per-cluster selection across the cluster scope.
    - Type: object (LabelSelector)
    - Optional: No (Required within cluster)

    Example:

    apiVersion: addons.cluster.x-k8s.io/v1alpha1
    kind: FleetAddonConfig
    metadata:
      name: fleet-addon-config
    spec:
      cluster:
        selector:
          matchLabels:
            import: "true"

  - cluster.setOwnerReferences
    - Description: Setting to disable setting owner references on the created resources.
    - Type: boolean
    - Optional: Yes

    Example:

    spec:
      cluster:
        setOwnerReferences: false
- clusterClass
  - Description: Enable clusterClass controller functionality. This will create Fleet ClusterGroups for each ClusterClass with the same name.
  - Type: object
  - Optional: Yes

  This section configures the behavior for creating Fleet ClusterGroups from Cluster API ClusterClasses.

  - clusterClass.patchResource
    - Description: Allow patching resources, maintaining the desired state. If not set, resources will only be re-created in case of removal.
    - Type: boolean
    - Optional: Yes

    Example:

    spec:
      clusterClass:
        patchResource: true

  - clusterClass.setOwnerReferences
    - Description: Setting to disable setting owner references on the created resources.
    - Type: boolean
    - Optional: Yes

    Example:

    spec:
      clusterClass:
        setOwnerReferences: false
- install
  - Description: Configuration for installing the Fleet chart.
  - Type: object (oneOf followLatest or version)
  - Optional: Yes

  This section configures how the Fleet chart is installed. You must specify either followLatest or version.

  - install.followLatest
    - Description: Follow the latest version of the chart on install.
    - Type: boolean
    - Optional: No (Required if version is not set)

    Example:

    spec:
      install:
        followLatest: true

  - install.version
    - Description: Use a specific version to install.
    - Type: string
    - Optional: No (Required if followLatest is not set)

    Example:

    spec:
      install:
        version: 0.12.0
Developers
This section contains developer-oriented guides.
CAAPF Releases
Release Cadence
- New versions are usually released every 2-4 weeks.
Release Process
- Clone the repository locally:
git clone git@github.com:rancher/cluster-api-addon-provider-fleet.git
- Depending on whether you are cutting a minor/major or patch release, the process varies.

  - If you are cutting a new minor/major release:
    Create a new release branch (i.e. release-X) and push it to the upstream repository.

    # Note: `upstream` must be the remote pointing to `github.com/rancher/cluster-api-addon-provider-fleet`.
    git checkout -b release-0.4
    git push -u upstream release-0.4
    # Export the tag of the minor/major release to be cut, e.g.:
    export RELEASE_TAG=v0.4.0

  - If you are cutting a patch release from an existing release branch:
    Use the existing release branch.

    # Note: `upstream` must be the remote pointing to `github.com/rancher/cluster-api-addon-provider-fleet`
    git checkout upstream/release-0.4
    # Export the tag of the patch release to be cut, e.g.:
    export RELEASE_TAG=v0.4.1

- Create a signed/annotated tag and push it:
# Create tags locally
git tag -s -a ${RELEASE_TAG} -m ${RELEASE_TAG}
# Push tags
git push upstream ${RELEASE_TAG}
This will trigger a release GitHub action that creates a release with CAAPF components.
- Wait for the update metadata workflow to pass successfully.
This workflow will update the metadata.yaml file in the root of the repository, preparing it for the next release. It will open a PR, which needs to be merged before the next minor version release can be cut.
WARNING: An out-of-date published metadata.yaml file will cause upstream installation via clusterctl to fail.
- Perform Downstream Build
Perform the downstream build for the release tag using the CAAPF GitHub action. Specific steps and references for this process can be found by asking in the #team-rancher-highlander channel.
Versioning
CAAPF follows semantic versioning specification.
Example versions:
- Pre-release: v0.4.0-alpha.1
- Minor release: v0.4.0
- Patch release: v0.4.1
- Major release: v1.0.0
With the v0 release of our codebase, we provide the following guarantees:
-
A (minor) release CAN include:
- Introduction of new API versions, or new Kinds.
- Compatible API changes like field additions, deprecation notices, etc.
- Breaking API changes for deprecated APIs, fields, or code.
- Features, promotion or removal of feature gates.
- And more!
-
A (patch) release SHOULD only include a backwards compatible set of bugfixes.
Backporting
Any backport MUST not be breaking for either API or behavioral changes.
It is generally not accepted to submit pull requests directly against release branches (release-X). However, backports of fixes or changes that have already been merged into the main branch may be accepted to all supported branches:
- Critical bug fixes, security issue fixes, or fixes for bugs without easy workarounds.
- Dependency bumps for CVE (usually limited to CVE resolution; backports of non-CVE related version bumps are considered exceptions to be evaluated case by case)
- Cert-manager version bumps (to avoid having releases with cert-manager versions that are out of support, when possible)
- Changes required to support new Kubernetes versions, when possible. See supported Kubernetes versions for more details.
- Changes to use the latest Go patch version to build controller images.
- Improvements to existing docs (the latest supported branch hosts the current version of the book)
Branches
CAAPF has two types of branches: the main and release-X branches.
The main branch is where development happens. All the latest and greatest code, including breaking changes, happens on main.
The release-X branches contain stable, backwards compatible code. On every major or minor release, a new branch is created. It is from these branches that minor and patch releases are tagged. In some cases, it may be necessary to open PRs for bugfixes directly against stable branches, but this should generally not be the case.
Support and guarantees
CAAPF maintains the most recent release/releases for all supported APIs. Support for this section refers to the ability to backport and release patch versions; backport policy is defined above.
- The API version is determined from the GroupVersion defined in the #[kube(...)] derive macro inside ./src/api.
- For the current stable API version (v1alpha1) we support the two most recent minor releases; older minor releases are immediately unsupported when a new major/minor release is available.
Development
Development setup
Prerequisites
Alternatively:
To enter the environment with prerequisites:
nix-shell
Common prerequisite
Create a local development environment
- Clone the CAAPF repository locally.
- The project provides an easy way of starting your own development environment. You can take some time to study the justfile that includes a number of pre-configured commands to set up and build your own CAPI management cluster and install the addon provider for Fleet.
- Run the following:
just start-dev
This command will create a kind cluster and manage the installation of the fleet provider and all dependencies.
- Once the installation is complete, you can inspect the current state of your development cluster.
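For example (what you see will depend on the justfile defaults):
kubectl get pods -A
kubectl get fleetaddonconfigs,clusters.fleet.cattle.io -A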
E2E Test Failure Investigation Guide
This guide provides a structured approach to investigating end-to-end (e2e) test failures in the cluster-api-addon-provider-fleet project.
Understanding E2E Tests
Our CI pipeline runs several e2e tests to validate functionality across different Kubernetes versions:
- Cluster Class Import Tests: Validate the cluster class import functionality
- Import Tests: Validate the general import functionality
- Import RKE2 Tests: Validate import functionality specific to RKE2 clusters
Each test runs on multiple Kubernetes versions (stable and latest) to ensure compatibility.
Accessing Test Artifacts
When e2e tests fail, the CI pipeline automatically collects and uploads artifacts containing valuable debugging information. These artifacts are created using crust-gather, a tool that captures the state of Kubernetes clusters.
Finding the Artifact URL
- Navigate to the failed GitHub Actions workflow run
- Scroll down to the "Artifacts" section
- Find the artifact corresponding to the failed test (e.g., artifacts-cluster-class-import-stable)
- Copy the artifact URL (right-click on the artifact link and copy the URL)
Using the serve-artifact.sh Script
The serve-artifact.sh script allows you to download and serve the test artifacts locally, providing access to the Kubernetes contexts from the test environment.
Prerequisites
- A GitHub token with repo read permissions (set as the GITHUB_TOKEN environment variable)
- kubectl installed, krew installed.
- crust-gather installed. Can be replicated with nix, if available.
Serving Artifacts
Fetch the serve-artifact.sh script from the crust-gather GitHub repository:
curl -L https://raw.githubusercontent.com/crust-gather/crust-gather/refs/heads/main/serve-artifact.sh -o serve-artifact.sh && chmod +x serve-artifact.sh
# Using the full artifact URL
./serve-artifact.sh -u https://github.com/rancher/cluster-api-addon-provider-fleet/actions/runs/15737662078/artifacts/3356068059 -s 0.0.0.0:9095
# OR using individual components
./serve-artifact.sh -o rancher -r cluster-api-addon-provider-fleet -a 3356068059 -s 0.0.0.0:9095
This will:
- Download the artifact from GitHub
- Extract its contents
- Start a local server that provides access to the Kubernetes contexts from the test environment
Investigating Failures
Once the artifact server is running, you can use various tools to investigate the failure:
Using k9s
k9s provides a terminal UI to interact with Kubernetes clusters:
- Open a new terminal
- Run k9s
- Press : to open the command prompt
- Type ctx and press Enter
- Select the context from the test environment (there may be multiple contexts); use dev for the e2e tests.
- Navigate through resources to identify issues:
- Check pods for crash loops or errors
- Examine events for warnings or errors
- Review logs from relevant components
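If you prefer plain kubectl over k9s, the same checks can be run against the served contexts; the controller deployment name below is an assumption and may differ in your environment:
kubectl --context dev get events -A --sort-by=.lastTimestamp
kubectl --context dev -n caapf-system logs deploy/caapf-controller-manager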
Common Investigation Paths
- Check Fleet Resources:
  - FleetAddonConfig resources
  - Fleet Cluster resource
  - Fleet ClusterGroup resources
  - Ensure all relevant labels are present on the above.
  - Check that the created Fleet namespace cluster-<ns>-<cluster name>-<random-prefix> is consistent with the namespace in Cluster.status.namespace.
  - Check for ClusterRegistrationToken in the cluster namespace.
  - Check for BundleNamespaceMapping in the ClusterClass namespace if a cluster references a ClusterClass in a different namespace.
- Check CAPI Resources:
  - Cluster resource
  - Check for the ControlPlaneInitialized condition to be true
  - ClusterClass resources: these are present and have status.observedGeneration consistent with the metadata.generation
  - Continue on a per-cluster basis
- Check Controller Logs:
  - Look for error messages or warnings in the controller logs in the caapf-system namespace.
  - Check for reconciliation failures in the manager container. In case of an upstream installation, check the helm-manager container logs.
- Check Kubernetes Events:
  - Events often contain information about failures. CAAPF publishes events for each resource applied from a CAPI Cluster, including the Fleet Cluster in the cluster namespace, and the ClusterGroup and BundleNamespaceMapping in the ClusterClass namespace. These events are created by the caapf-controller component.
Common Failure Patterns
Import Failures
- Symptom: Fleet Cluster not created or in an error state
- Investigation: Check the controller logs in the cattle-fleet-system namespace for errors during import processing. Check for errors in the CAAPF logs for a missing cluster definition.
- Common causes:
  - The Fleet cluster import process is serial, and a hot loop in another cluster's import blocks further cluster imports. Fleet issue.
  - The CAPI Cluster is not ready and does not have the ControlPlaneInitialized condition. Issue with CAPI, or the cluster requires more time to become ready.
  - Otherwise, a CAAPF issue.
Cluster Class Failures
- Symptom: ClusterClass not properly imported or is not evaluated as a target.
- Investigation: Check for the BundleNamespaceMapping in the ClusterClass namespace named after the Cluster resource. Check the controller logs in the caapf-system namespace for errors during ClusterClass processing. Check the ClusterGroup resource in the Cluster namespace.
- Common causes:
  - Check for a Cluster referencing a ClusterClass in a different namespace.
  - In the event of missing resources, a CAAPF-related error.