Cluster API Provider RKE2
What is Cluster API Provider RKE2?
The Cluster API brings declarative, Kubernetes-style APIs to cluster creation, configuration and management.
Cluster API Provider RKE2 (CAPRKE2) is a combination of two provider types: a Cluster API Control Plane Provider for provisioning Kubernetes control plane nodes, and a Cluster API Bootstrap Provider for bootstrapping Kubernetes on a machine where RKE2 is used as the Kubernetes distribution.
Getting Started
Follow our getting started guide to start creating RKE2 clusters with CAPI.
Developer Guide
Check our developer guide for instructions on how to set up your dev environment in order to contribute to this project.
Get in contact
You can get in contact with us via the #capbr channel on the Rancher Users Slack.
User guide
This section contains a getting started guide to help new users utilise CAPRKE2.
Getting Started
Cluster API Provider RKE2 is compliant with the clusterctl contract, which means that clusterctl simplifies its deployment to the CAPI Management Cluster. In this Getting Started guide, we will be using the RKE2 Provider with the docker provider (also called CAPD).
Prerequisites
- clusterctl to handle the lifecycle of a Cluster API management cluster
- kubectl to apply the workload cluster manifests that clusterctl generates
- kind and docker to create a local Cluster API management cluster
Management Cluster
In order to use this provider, you need to have a management cluster available to you and have your current KUBECONFIG context set to talk to that cluster. If you do not have a cluster available to you, you can create a kind cluster. These are the steps needed to achieve that:
- Ensure kind is installed (https://kind.sigs.k8s.io/docs/user/quick-start/#installation)
- Create a special kind configuration file if you intend to use the Docker infrastructure provider:
cat > kind-cluster-with-extramounts.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: capi-test
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /var/run/docker.sock
    containerPath: /var/run/docker.sock
EOF
- Run the following command to create a local kind cluster:
kind create cluster --config kind-cluster-with-extramounts.yaml
- Check your newly created kind cluster:
kubectl cluster-info
and get a result similar to this:
Kubernetes control plane is running at https://127.0.0.1:40819
CoreDNS is running at https://127.0.0.1:40819/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Setting up clusterctl
CAPI >= v1.6.0
No additional steps are required and you can install the RKE2 provider with clusterctl directly:
clusterctl init --core cluster-api:v1.8.5 --bootstrap rke2:v0.8.0 --control-plane rke2:v0.8.0 --infrastructure docker:v1.8.5
Next, you can proceed to creating a workload cluster.
CAPI < v1.6.0
With CAPI & clusterctl versions older than v1.6.0 you need a specific configuration. To do this, create a file called clusterctl.yaml in the $HOME/.cluster-api folder with the following content (substitute ${VERSION} with a valid semver specification, e.g. v0.5.0, from releases):
providers:
- name: "rke2"
  url: "https://github.com/rancher/cluster-api-provider-rke2/releases/${VERSION}/bootstrap-components.yaml"
  type: "BootstrapProvider"
- name: "rke2"
  url: "https://github.com/rancher/cluster-api-provider-rke2/releases/${VERSION}/control-plane-components.yaml"
  type: "ControlPlaneProvider"
NOTE: Due to an issue related to how CAPD creates Load Balancer healthchecks, it is necessary to use a fork of CAPD by adding the following to the above configuration file:
- name: "docker"
url: "https://github.com/belgaied2/cluster-api/releases/v1.3.3-cabpr-fix/infrastructure-components.yaml"
type: "InfrastructureProvider"
This configuration tells clusterctl where to look for provider manifests in order to deploy provider components in the management cluster.
The next step is to run the clusterctl init command:
clusterctl init --bootstrap rke2 --control-plane rke2 --infrastructure docker:v1.3.3-cabpr-fix
This should output something similar to the following:
Fetching providers
Installing cert-manager Version="v1.10.1"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.3.3" TargetNamespace="capi-system"
Installing Provider="bootstrap-rke2" Version="v0.1.0-alpha.1" TargetNamespace="rke2-bootstrap-system"
Installing Provider="control-plane-rke2" Version="v0.1.0-alpha.1" TargetNamespace="rke2-control-plane-system"
Your management cluster has been initialized successfully!
You can now create your first workload cluster by running the following:
clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -
Create a workload cluster
There are some sample cluster templates available under the examples folder. This section assumes you are using CAPI v1.6.0 or higher.
For this Getting Started section, we will be using the docker samples available under the examples/docker/online-default folder. This folder contains a YAML template file called cluster-template.yaml, which contains environment variable placeholders that can be substituted using the envsubst tool. We will use clusterctl to generate the manifests from these template files.
Set the following environment variables:
- CABPR_NAMESPACE
- CLUSTER_NAME
- CABPR_CP_REPLICAS
- CABPR_WK_REPLICAS
- KUBERNETES_VERSION
- KIND_IMAGE_VERSION
for example:
export CABPR_NAMESPACE=example
export CLUSTER_NAME=capd-rke2-test
export CABPR_CP_REPLICAS=3
export CABPR_WK_REPLICAS=2
export KUBERNETES_VERSION=v1.30.3
export KIND_IMAGE_VERSION=v1.30.3
The next step is to substitute the values in the YAML using the following commands:
cd examples/docker/online-default
cat cluster-template.yaml | clusterctl generate yaml > rke2-docker-example.yaml
At this moment, you can take some time to study the resulting YAML, then you can apply it to the management cluster:
kubectl apply -f rke2-docker-example.yaml
and see the following output:
namespace/example created
cluster.cluster.x-k8s.io/capd-rke2-test created
dockercluster.infrastructure.cluster.x-k8s.io/capd-rke2-test created
rke2controlplane.controlplane.cluster.x-k8s.io/capd-rke2-test-control-plane created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/controlplane created
machinedeployment.cluster.x-k8s.io/worker-md-0 created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/worker created
rke2configtemplate.bootstrap.cluster.x-k8s.io/capd-rke2-test-agent created
configmap/capd-rke2-test-lb-config created
Checking the workload cluster
After waiting several minutes, you can check the state of the CAPI machines by running the following command:
kubectl get machine -n example
and you should see output similar to the following:
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
capd-rke2-test-control-plane-9fw9t capd-rke2-test capd-rke2-test-control-plane-9fw9t docker:////capd-rke2-test-control-plane-9fw9t Running 35m v1.27.3+rke2r1
capd-rke2-test-control-plane-m2sdk capd-rke2-test capd-rke2-test-control-plane-m2sdk docker:////capd-rke2-test-control-plane-m2sdk Running 12m v1.27.3+rke2r1
capd-rke2-test-control-plane-zk2xb capd-rke2-test capd-rke2-test-control-plane-zk2xb docker:////capd-rke2-test-control-plane-zk2xb Running 27m v1.27.3+rke2r1
worker-md-0-fhxrw-crn5g capd-rke2-test capd-rke2-test-worker-md-0-fhxrw-crn5g docker:////capd-rke2-test-worker-md-0-fhxrw-crn5g Running 36m v1.27.3+rke2r1
worker-md-0-fhxrw-qsk7n capd-rke2-test capd-rke2-test-worker-md-0-fhxrw-qsk7n docker:////capd-rke2-test-worker-md-0-fhxrw-qsk7n Running 36m v1.27.3+rke2r1
Accessing the workload cluster
Once the cluster is fully provisioned, you can check its status with:
kubectl get cluster -n example
and see an output similar to this:
NAMESPACE NAME CLUSTERCLASS PHASE AGE VERSION
example capd-rke2-test Provisioned 31m
You can also get an “at a glance” view of the cluster and its resources by running:
clusterctl describe cluster capd-rke2-test -n example
This should produce output similar to this:
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/capd-rke2-test True 2m56s
├─ClusterInfrastructure - DockerCluster/capd-rke2-test True 31m
├─ControlPlane - RKE2ControlPlane/capd-rke2-test-control-plane True 2m56s
│ └─3 Machines... True 28m See capd-rke2-test-control-plane-9fw9t, capd-rke2-test-control-plane-m2sdk, ...
└─Workers
└─MachineDeployment/worker-md-0 True 15m
└─2 Machines... True 25m See worker-md-0-fhxrw-crn5g, worker-md-0-fhxrw-qsk7n
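To access the workload cluster itself, you can retrieve its kubeconfig with clusterctl and use it with kubectl. This is a minimal sketch; the output file name is arbitrary:
clusterctl get kubeconfig capd-rke2-test -n example > capd-rke2-test.kubeconfig
kubectl --kubeconfig capd-rke2-test.kubeconfig get nodes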
🎉 CONGRATULATIONS! 🎉 You created your first RKE2 cluster with CAPD as an infrastructure provider.
Using ClusterClass for cluster creation
This provider supports using ClusterClass, a Cluster API feature that implements an extra level of abstraction on top of the existing Cluster API functionality. The ClusterClass object is used to define a collection of template resources (control plane and machine deployment) which are used to generate one or more clusters of the same flavor.
If you are interested in leveraging this functionality, you can refer to the examples here:
- clusterclass-quick-start.yaml: creates a sample ClusterClass and necessary resources.
- rke2-sample.yaml: creates a workload cluster using the ClusterClass.
As with other sample templates, you will need to set a number of environment variables:
- CLUSTER_NAME
- CABPR_CP_REPLICAS
- CABPR_WK_REPLICAS
- KUBERNETES_VERSION
- KIND_IP
for example:
export CLUSTER_NAME=capd-rke2-clusterclass
export CABPR_CP_REPLICAS=3
export CABPR_WK_REPLICAS=2
export KUBERNETES_VERSION=v1.30.3
export KIND_IP=192.168.20.20
Remember that, since we are using kind, the value of KIND_IP must be an IP address in the range of the kind network.
You can check the range Docker assigns to this network by inspecting it:
docker network inspect kind
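If you only need the subnet, you can narrow the output with a format flag (a sketch assuming Docker's default IPAM configuration for the kind network):
docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}' kind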
The next step is to substitute the values in the YAML using the following commands:
cat clusterclass-quick-start.yaml | clusterctl generate yaml > clusterclass-example.yaml
At this moment, you can take some time to study the resulting YAML, then you can apply it to the management cluster:
kubectl apply -f clusterclass-example.yaml
This will create a new ClusterClass template that can be used to provision one or multiple workload clusters of the same flavor.
To do so, you can follow the same procedure and substitute the values in the YAML for the cluster definition:
cat rke2-sample.yaml | clusterctl generate yaml > rke2-clusterclass-example.yaml
And then apply the resulting YAML file to create a cluster from the existing ClusterClass.
kubectl apply -f rke2-clusterclass-example.yaml
Known Issues
When using CAPD < v1.6.0 unmodified, cluster creation is stuck after the first node and the API is not reachable
If you use docker as your infrastructure provider without any modification, cluster creation will stall after provisioning the first node, and the API will not be available using the LB address. This is caused by the Load Balancer configuration used in CAPD, which is not compatible with RKE2. Therefore, it is necessary to use our own fork of v1.3.3 by using a specific clusterctl configuration.
Topics
This section contains more detailed information about the features that CAPRKE2 offers and how to use them.
Air-Gapped Cluster Deployment
Introduction
By default, this provider deploys RKE2 using the online installation method. This method needs access to Rancher servers and the Docker.io registry to download the scripts, RKE2 packages and container images necessary for the installation of RKE2.
Some users might prefer an air-gapped installation for a number of reasons, such as deployment in particularly secure environments, sporadic access issues (for example, deployments to edge locations) or bandwidth preservation.
RKE2 supports air-gapped installation using:
- 2 methods for node preparation: Tarball on the node, or Container Image Registry
- 2 methods for the actual RKE2 installation after the node is prepared: Manual deployment, or using install.sh from https://get.rke2.io.
Methods supported by CABPR (Cluster API Provider RKE2)
Among the RKE2 air-gapped cluster creation modes above, CABPR has chosen the best tradeoff in terms of simplicity, usability and limiting dependencies.
Node preparation
The method that is supported by CABPR is the Tarball on the node, using custom images. The reasons behind this choice include:
- No dependency on the environment's network infrastructure and image registry; besides, the registry approach does not remove the need for a custom image anyway.
- CAPI's philosophy is to accept custom-defined base images for infrastructure providers, which makes it easy to build the RKE2 pre-requisites (for a specific RKE2 version) into a custom image to be used for all deployments.
RKE2 deployment
The method that is supported by CABPR for RKE2 deployment is the install.sh approach, described here. This approach is used because it automates a number of tasks needed for RKE2 to be deployed, like creating the file hierarchy, unpacking the tarball, and creating systemd service units.
Since these tasks might change in the future, we prefer to rely on the upstream script from RKE2, available in its latest valid version at: https://get.rke2.io.
Pre-requisites on base image
Considering the above tradeoffs, base images used for air-gapped deployments need to comply with some pre-requisites in order to work with CABPR. This section lists these pre-requisites:
- Support and presence of cloud-init (ignition bootstrapping is also on the roadmap)
- Presence of systemd (because RKE2's installation relies on systemd to start RKE2)
- Presence of the folders /opt and /opt/rke2-artifacts with the following files inside these folders:
  - install.sh in /opt (this file has the content of the script available at https://get.rke2.io). One way to create it at build time is by using curl -sfL https://get.rke2.io > /opt/install.sh with a Linux user that has write permissions to the /opt folder.
  - rke2-images.linux-amd64.tar.zst, rke2.linux-amd64.tar.gz and sha256sum-amd64.txt in the /opt/rke2-artifacts folder. These files can be downloaded for a specific version of RKE2 from its release page, for instance Release v1.23.16+rke2r1 · rancher/rke2 · GitHub for version v1.23.16+rke2r1. The files can be found under the Assets section of the page.
These pre-requisites should be built into a machine image, for instance a container image for CAPD or an AMI for AWS EC2. Each infrastructure provider has its own way of defining machine images.
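As an illustration, a minimal sketch of the image-build steps (for example in a Dockerfile RUN instruction or a Packer shell provisioner; the exact download URLs depend on the RKE2 release you target):
mkdir -p /opt/rke2-artifacts
curl -sfL https://get.rke2.io > /opt/install.sh
# Download rke2-images.linux-amd64.tar.zst, rke2.linux-amd64.tar.gz and sha256sum-amd64.txt
# from the Assets section of the chosen RKE2 release (e.g. v1.23.16+rke2r1)
# and place them in /opt/rke2-artifacts/.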
Configuration of CABPR for Air-Gapped use
In order to deploy RKE2 clusters in air-gapped mode using CABPR, you need to set the field spec.agentConfig.airGapped on the RKE2ControlPlane object and the field spec.template.spec.agentConfig.airGapped on the RKE2ConfigTemplate object to true.
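A minimal sketch of the relevant fields (object names, API versions and surrounding fields are illustrative placeholders, not a complete manifest):
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: RKE2ControlPlane
metadata:
  name: my-cluster-control-plane
spec:
  agentConfig:
    airGapped: true
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: RKE2ConfigTemplate
metadata:
  name: my-cluster-agent
spec:
  template:
    spec:
      agentConfig:
        airGapped: true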
You can check a reference implementation for CAPD here, including configuration for a CAPD custom image.
Node Registration Methods
The provider supports multiple methods for registering a new node into the cluster.
Usage
The method to use is specified on the RKE2ControlPlane within the spec. If no method is supplied, then the default method of internal-first will be used.
You cannot change the registration method after creation.
An example of using a different method:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: RKE2ControlPlane
metadata:
  name: test1-control-plane
  namespace: default
spec:
  agentConfig:
    version: v1.26.4+rke2r1
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerMachineTemplate
    name: controlplane
  nodeDrainTimeout: 2m
  replicas: 3
  serverConfig:
    cni: calico
  registrationMethod: "address"
  registrationAddress: "172.19.0.3"
Registration Methods
internal-first
For each CAPI Machine that is used for the control plane, we take the internal IP address from Machine.status.addresses if it exists. If there is no internal IP for a machine, we use an external address instead. The IP address found for a machine is added to RKE2ControlPlane.status.availableServerIPs.
The first IP address listed in RKE2ControlPlane.status.availableServerIPs is then used for the join.
internal-only-ips
For each CAPI Machine that is used for the control plane, we take the internal IP address from Machine.status.addresses if it exists and add it to RKE2ControlPlane.status.availableServerIPs.
The first IP address listed in RKE2ControlPlane.status.availableServerIPs is then used for the join.
external-only-ips
For each CAPI Machine that is used for the control plane, we take the external IP address from Machine.status.addresses if it exists and add it to RKE2ControlPlane.status.availableServerIPs.
The first IP address listed in RKE2ControlPlane.status.availableServerIPs is then used for the join.
address
For this method you must supply an address in the control plane spec (i.e. RKE2ControlPlane.spec.registrationAddress). This address is then used for the join.
With this method, it is expected that you have a load balancer / VIP solution sitting in front of all the control plane machines, and that all join requests will be routed via it.
CIS and Pod Security Admission
In order to set a custom Pod Security Admission policy when the CIS profile is selected, it is required to create a secret with the policy content and set the appropriate field on the RKE2ControlPlane object:
apiVersion: v1
kind: Secret
metadata:
  name: pod-security-admission-config
data:
  pod-security-admission-config.yaml: |
    apiVersion: apiserver.config.k8s.io/v1
    kind: AdmissionConfiguration
    plugins:
    - name: PodSecurity
      configuration:
        apiVersion: pod-security.admission.config.k8s.io/v1beta1
        kind: PodSecurityConfiguration
        defaults:
          enforce: "restricted"
          enforce-version: "latest"
          audit: "restricted"
          audit-version: "latest"
          warn: "restricted"
          warn-version: "latest"
        exemptions:
          usernames: []
          runtimeClasses: []
          namespaces: [kube-system, cis-operator-system, tigera-operator]
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: RKE2ControlPlane
metadata:
  ...
spec:
  ...
  files:
  - path: /path/to/pod-security-admission-config.yaml
    contentFrom:
      secret:
        name: pod-security-admission-config
        key: pod-security-admission-config.yaml
  agentConfig:
    profile: cis
    podSecurityAdmissionConfigFile: /path/to/pod-security-admission-config.yaml
  ...
Example of a PSA configuration that allows Rancher components to run in the cluster:
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [cattle-alerting,
                   cattle-fleet-local-system,
                   cattle-fleet-system,
                   cattle-global-data,
                   cattle-impersonation-system,
                   cattle-monitoring-system,
                   cattle-prometheus,
                   cattle-resources-system,
                   cattle-system,
                   cattle-ui-plugin-system,
                   cert-manager,
                   cis-operator-system,
                   fleet-default,
                   ingress-nginx,
                   kube-node-lease,
                   kube-public,
                   kube-system,
                   rancher-alerting-drivers]
Examples
This section contains examples of how to use CAPRKE2 with different cloud providers and platforms.
Setting up the Management Cluster
Make sure you set up a Management Cluster to use with Cluster API; you can follow the instructions from the Cluster API book.
Cluster API AWS Infrastructure Provider
Installing the AWS provider
Refer to the Cluster API book for configuring AWS credentials and setting up the AWS infrastructure provider.
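For example, if you have already bootstrapped your AWS account as described there, the credentials can typically be encoded with clusterawsadm (a sketch based on the Cluster API AWS provider documentation; adjust it to your setup):
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)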
The next step is to run the clusterctl init command, making sure to provide valid AWS credentials via the AWS_B64ENCODED_CREDENTIALS environment variable. CAPRKE2 can be deployed with clusterctl:
clusterctl init --bootstrap rke2 --control-plane rke2 --infrastructure aws
Create a workload cluster
Before creating a workload cluster, you need to build an AMI for the RKE2 version that is going to be installed on the cluster. You can follow the steps in the image-builder README to build the AMI.
You will need to set the following environment variables:
export CONTROL_PLANE_MACHINE_COUNT=3
export WORKER_MACHINE_COUNT=1
export RKE2_VERSION=v1.30.2+rke2r1
export AWS_NODE_MACHINE_TYPE=t3a.large
export AWS_CONTROL_PLANE_MACHINE_TYPE=t3a.large
export AWS_SSH_KEY_NAME="aws-ssh-key"
export AWS_REGION="aws-region"
export AWS_AMI_ID="ami-id"
Now, we can generate the YAML files from the templates using the clusterctl generate command:
clusterctl generate cluster --from https://github.com/rancher/cluster-api-provider-rke2/blob/main/examples/aws/cluster-template.yaml -n example-aws rke2-aws > aws-rke2-clusterctl.yaml
After examining the resulting YAML file, you can apply it to the management cluster using:
kubectl apply -f aws-rke2-clusterctl.yaml
Checking the workload cluster
After a while, you should be able to check the functionality of the workload cluster using clusterctl:
clusterctl describe cluster -n example-aws rke2-aws
and once the cluster is provisioned, it should look similar to the following:
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/rke2-aws True 16m
├─ClusterInfrastructure - AWSCluster/rke2-aws True 25m
├─ControlPlane - RKE2ControlPlane/rke2-aws-control-plane True 16m
│ └─3 Machines... True 19m See rke2-aws-control-plane-8wsfm, rke2-aws-control-plane-qgwr7, ...
└─Workers
└─MachineDeployment/rke2-aws-md-0 True 18m
└─2 Machines... True 19m See rke2-aws-md-0-6d47bf584d-g2ljz, rke2-aws-md-0-6d47bf584d-m9z8h
Ignition based bootstrap
Note: the ignition template is currently outdated.
Make sure that the BootstrapFormatIgnition feature gate is enabled for the CAPA manager. You can do it by changing the flag in the CAPA manager deployment:
containers:
- args:
  - --feature-gates=EKS=true,EKSEnableIAM=false,EKSAllowAddRoles=false,EKSFargate=false,MachinePool=false,EventBridgeInstanceState=false,AutoControllerIdentityCreator=true,BootstrapFormatIgnition=true,ExternalResourceGC=false
  ...
  name: manager
or by setting the following environment variable before installing CAPA with clusterctl:
export BOOTSTRAP_FORMAT_IGNITION=true
For the Ignition based bootstrap, you will also need to set the following environment variables:
export AWS_S3_BUCKET_NAME=<YOUR_AWS_S3_BUCKET_NAME>
Now you can generate manifests from the cluster template:
clusterctl generate cluster --from https://github.com/rancher/cluster-api-provider-rke2/blob/main/examples/aws/ignition/cluster-template-ignition.yaml -n example-aws rke2-aws > aws-rke2-clusterctl.yaml
Cluster API vSphere Infrastructure Provider
Installing the vSphere provider and creating a workload cluster
This configuration includes a kube-vip load balancer on the control plane nodes. The VIP of the load balancer for the Kubernetes API is set by the CONTROL_PLANE_ENDPOINT_IP variable.
Prerequisites:
- The VM template to be used for the cluster machines should be present in the vSphere environment.
- If an air-gapped environment is required, then the VM template should already include the RKE2 binaries as described in the docs. CAPRKE2 uses the tarball method to install RKE2 on the machines. Any additional images, like the vSphere CPI image, should be present in the local environment too.
To initialize Cluster API Provider vSphere, clusterctl requires the following variables, which should be set in ~/.cluster-api/clusterctl.yaml as follows:
## -- Controller settings -- ##
VSPHERE_USERNAME: "<username>" # The username used to access the remote vSphere endpoint
VSPHERE_PASSWORD: "<password>" # The password used to access the remote vSphere endpoint
## -- Required workload cluster default settings -- ##
VSPHERE_SERVER: "10.0.0.1" # The vCenter server IP or FQDN
VSPHERE_DATACENTER: "SDDC-Datacenter" # The vSphere datacenter to deploy the management cluster on
VSPHERE_DATASTORE: "DefaultDatastore" # The vSphere datastore to deploy the management cluster on
VSPHERE_NETWORK: "VM Network" # The VM network to deploy the management cluster on
VSPHERE_RESOURCE_POOL: "*/Resources" # The vSphere resource pool for your VMs
VSPHERE_FOLDER: "vm" # The VM folder for your VMs. Set to "" to use the root vSphere folder
VSPHERE_TEMPLATE: "ubuntu-1804-kube-v1.17.3" # The VM template to use for your management cluster.
CONTROL_PLANE_ENDPOINT_IP: "192.168.9.230" # the IP that kube-vip is going to use as a control plane endpoint
VSPHERE_TLS_THUMBPRINT: "..." # sha256 thumbprint of the vcenter certificate: openssl x509 -sha256 -fingerprint -in ca.crt -noout
EXP_CLUSTER_RESOURCE_SET: "true" # This enables the ClusterResourceSet feature that we are using to deploy CSI
VSPHERE_SSH_AUTHORIZED_KEY: "ssh-rsa AAAAB3N..." # The public ssh authorized key on all machines in this cluster.
# Set to "" if you don't want to enable SSH, or are using another solution.
"CPI_IMAGE_K8S_VERSION": "v1.30.0" # The version of the vSphere CPI image to be used by the CPI workloads
# Keep this close to the minimum Kubernetes version of the cluster being created.
Then run the following command to generate the RKE2 cluster manifests:
clusterctl generate cluster --from https://github.com/rancher/cluster-api-provider-rke2/blob/main/examples/vmware/cluster-template.yaml -n example-vsphere rke2-vsphere > vsphere-rke2-clusterctl.yaml
kubectl apply -f vsphere-rke2-clusterctl.yaml
Cluster API Docker Infrastructure Provider
This page focuses on using the RKE2 provider with the Docker Infrastructure provider.
Setting up the Management Cluster
Make sure you set up a Management Cluster to use with Cluster API; you can follow the instructions from the Cluster API book.
Create a workload cluster
Before creating a workload cluster, you need to set the following environment variables:
export CONTROL_PLANE_MACHINE_COUNT=3
export WORKER_MACHINE_COUNT=1
export RKE2_VERSION=v1.30.2+rke2r1
export KIND_IMAGE_VERSION=v1.30.0
Now, we can generate the YAML files from the templates using the clusterctl generate command:
clusterctl generate cluster --from https://github.com/rancher/cluster-api-provider-rke2/blob/main/examples/docker/online-default/cluster-template.yaml -n example-docker rke2-docker > docker-rke2-clusterctl.yaml
After examining the resulting YAML file, you can apply it to the management cluster using:
kubectl apply -f docker-rke2-clusterctl.yaml
Developer Guide
This section describes the workflow for regular developer tasks, such as:
- Development guide
- Releasing a new version of CAPRKE2
Development
The following instructions are for development purposes.
- Clone the Cluster API Repo into the GOPATH
Why clone into the GOPATH? There have been historic issues with code generation tools when they are run outside the GOPATH.
- Fork the Cluster API Provider RKE2 repo
- Clone your new repo into the GOPATH (i.e. ~/go/src/github.com/yourname/cluster-api-provider-rke2)
- Ensure Tilt and kind are installed
- Create a tilt-settings.json file in the root of your forked/cloned cluster-api directory.
- Add the following contents to the file (replace "yourname" with your GitHub account name):
{
"default_registry": "ghcr.io/yourname",
"provider_repos": ["../../github.com/yourname/cluster-api-provider-rke2"],
"enable_providers": ["docker", "rke2-bootstrap", "rke2-control-plane"],
"kustomize_substitutions": {
"EXP_MACHINE_POOL": "true",
"EXP_CLUSTER_RESOURCE_SET": "true"
},
"extra_args": {
"rke2-bootstrap": ["--v=4"],
"rke2-control-plane": ["--v=4"],
"core": ["--v=4"]
},
"debug": {
"rke2-bootstrap": {
"continue": true,
"port": 30001
},
"rke2-control-plane": {
"continue": true,
"port": 30002
}
}
}
NOTE: Until this bug is merged in CAPI, you will have to make the changes locally in your clone of CAPI.
- Open another terminal (or pane) and go to the cluster-api directory.
- Run the following to create a configuration for kind:
cat > kind-cluster-with-extramounts.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: capi-test
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /var/run/docker.sock
    containerPath: /var/run/docker.sock
EOF
NOTE: if you are using Docker Desktop v4.13 or above, you will encounter issues from here. Until a permanent solution is found, it is recommended that you use v4.12.
- Run the following command to create a local kind cluster:
kind create cluster --config kind-cluster-with-extramounts.yaml
- Now start tilt by running the following:
tilt up
- Press the space key to see the Tilt web UI and check that everything goes green.
CAPRKE2 Releases
Release Cadence
- CAPRKE2 minor versions (v0.2.0 versus v0.1.0) are released every 1-2 months.
- CAPRKE2 patch versions (v0.2.2 versus v0.2.1) are released as often as weekly or bi-weekly.
Release Process
- Clone the repository locally:
git clone git@github.com:rancher/cluster-api-provider-rke2.git
- Depending on whether you are cutting a minor or patch release, the process varies.
  - If you are cutting a new minor release:
    Create a new release branch (i.e. release-X) and push it to the upstream repository.
    # Note: `upstream` must be the remote pointing to `github.com:rancher/cluster-api-provider-rke2`.
    git checkout -b release-0.2
    git push -u upstream release-0.2
    # Export the tag of the minor release to be cut, e.g.:
    export RELEASE_TAG=v0.2.0
  - If you are cutting a patch release from an existing release branch:
    Use the existing release branch.
    git checkout upstream/release-0.2
    # Export the tag of the patch release to be cut, e.g.:
    export RELEASE_TAG=v0.2.1
- Create a signed/annotated tag and push it:
# Create tags locally
git tag -s -a ${RELEASE_TAG} -m ${RELEASE_TAG}
# Push tags
git push upstream ${RELEASE_TAG}
This will trigger a release GitHub action that creates a release with RKE2 provider components.
- Mark release as ready.
Published releases are initially marked as draft. If the published version is supposed to be latest, mark it so on the release page, while editing the release. Please note that we are using semantic versioning while choosing the latest version.
- Perform mandatory post-release activities, which will ensure the contract metadata.yaml file is up-to-date in case of a future minor/major version change.
Prepare main branch for development of the new release
The goal of this task is to bump the versions on the main branch so that the upcoming release version is used for e.g. local development and e2e tests. We also modify tests so that they are testing the previous release.
This comes down to changing occurrences of the old version to the new version, e.g. v1.5 to v1.6, and preparing metadata.yaml for a future release version:
1. Update E2E tests
Existing E2E tests that point to a specific version need to be updated to use the new version instead.
- Add a future release to the list of providers in test/e2e/config/e2e_conf.yaml following the format used for previous versions. This will be used as a fake provider version for testing the current state of the repository instead of the actual GitHub release.
- Update bootstrap/control plane versions* inside the function initUpgradableBootstrapCluster in test/e2e/e2e_suite_test.go.
- Edit the upgrade test* in test/e2e/e2e_upgrade_test.go.
*To keep the upgrade test concise and clean, and avoid a growing list of versions, it is required to maintain the N-1 minor as the starting version (e.g. if releasing version v4.x, the starting version is v3.x and the upgrade is as follows: v3.x -> v4.x).
2. Add the future version to metadata.yaml. For example, if v0.5 was just released, we add v0.6 to the list of releaseSeries:
apiVersion: clusterctl.cluster.x-k8s.io/v1alpha3
kind: Metadata
releaseSeries:
- major: 0
  minor: 1
  contract: v1beta1
- major: 0
  minor: 2
  contract: v1beta1
...
- major: x
  minor: x
  contract: x
Versioning
Cluster API Provider RKE2 follows the semantic versioning specification.
Example versions:
- Pre-release: v0.2.0-alpha.1
- Minor release: v0.2.0
- Patch release: v0.2.1
- Major release: v2.0.0
With the v0 release of our codebase, we provide the following guarantees:
- A (minor) release CAN include:
  - Introduction of new API versions, or new Kinds.
  - Compatible API changes like field additions, deprecation notices, etc.
  - Breaking API changes for deprecated APIs, fields, or code.
  - Features, promotion or removal of feature gates.
  - And more!
- A (patch) release SHOULD only include a backwards compatible set of bugfixes.
Backporting
Any backport MUST NOT introduce breaking changes to either the API or behavior.
It is generally not accepted to submit pull requests directly against release branches (release-X). However, backports of fixes or changes that have already been merged into the main branch may be accepted to all supported branches:
- Critical bugs fixes, security issue fixes, or fixes for bugs without easy workarounds.
- Dependency bumps for CVE (usually limited to CVE resolution; backports of non-CVE related version bumps are considered exceptions to be evaluated case by case)
- Cert-manager version bumps (to avoid having releases with cert-manager versions that are out of support, when possible)
- Changes required to support new Kubernetes versions, when possible. See supported Kubernetes versions for more details.
- Changes to use the latest Go patch version to build controller images.
- Improvements to existing docs (the latest supported branch hosts the current version of the book)
Note: We generally do not accept backports to Cluster API Provider RKE2 release branches that are out of support.
Branches
Cluster API Provider RKE2 has two types of branches: the main branch and release-X branches.
The main branch is where development happens. All the latest and greatest code, including breaking changes, happens on main.
The release-X branches contain stable, backwards compatible code. On every major or minor release, a new branch is created. It is from these branches that minor and patch releases are tagged. In some cases, it may be necessary to open PRs for bugfixes directly against stable branches, but this should generally not be the case.
Support and guarantees
Cluster API Provider RKE2 maintains the most recent release/releases for all supported APIs. "Support" in this section refers to the ability to backport and release patch versions; the backport policy is defined above.
- The API version is determined from the GroupVersion defined in the top-level bootstrap/api/ and controlplane/api/ packages.
- For the current stable API version (v1beta1) we support the two most recent minor releases; older minor releases are immediately unsupported when a new major/minor release is available.
Reference
This section contains reference documentation for CAPRKE2 API types.