Enable multiple control plane classes #1

Draft
Dhairya-Arora01 wants to merge 35 commits into syself-1.10.7 from dhairya/controlPlaneClass

@Dhairya-Arora01 Dhairya-Arora01 commented Feb 25, 2026

What this PR does / why we need it:
CAPI's ClusterClass currently supports only a single control plane definition under spec.controlPlane. This means all clusters using a ClusterClass are locked to one type of machine infrastructure for the control plane.

With Hetzner disabling certain locations for HCloud machines for some customers, we need the ability to spin up bare-metal-based control planes as an alternative, but the current design doesn't allow this.

When we tried to set the spec.controlPlane.machineInfrastructure.ref to HetznerBareMetalMachineTemplate, the ClusterClass was rejected by the admission webhook:

admission webhook "validation.clusterclass.cluster.x-k8s.io" denied the request:
spec.patches[7].definitions[0].selector.matchResources.controlPlane: Invalid value: true:
selector is enabled but matches neither the controlPlane ref nor the controlPlane machineInfrastructure ref

This happened because existing patches for HCloud machines use controlPlane: true selectors with kind: HCloudMachineTemplate. When we changed spec.controlPlane to reference HetznerBareMetalMachineTemplate instead, those HCloud patches no longer matched any ref in spec.controlPlane, causing validation to fail.
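
The failure can be illustrated with a minimal sketch of the kind of check the webhook performs. This is not the actual CAPI validation code: the types `Ref` and `Selector` and the function `checkControlPlaneSelector` are hypothetical, and the sketch assumes a selector with controlPlane: true must match the apiVersion and kind of either the control plane ref or its machineInfrastructure ref.

```go
package main

import (
	"errors"
	"fmt"
)

// Ref and Selector are simplified, hypothetical stand-ins for the CAPI types.
type Ref struct {
	APIVersion string
	Kind       string
}

type Selector struct {
	APIVersion   string
	Kind         string
	ControlPlane bool // corresponds to matchResources.controlPlane
}

// errNoMatch reuses the wording of the webhook error quoted above.
var errNoMatch = errors.New("selector is enabled but matches neither the controlPlane ref nor the controlPlane machineInfrastructure ref")

// sameType compares only apiVersion and kind (an assumption of this sketch).
func sameType(s Selector, r Ref) bool {
	return s.APIVersion == r.APIVersion && s.Kind == r.Kind
}

// checkControlPlaneSelector rejects an enabled controlPlane selector that
// matches neither of the two refs under spec.controlPlane.
func checkControlPlaneSelector(s Selector, cpRef, infraRef Ref) error {
	if !s.ControlPlane {
		return nil
	}
	if sameType(s, cpRef) || sameType(s, infraRef) {
		return nil
	}
	return errNoMatch
}

func main() {
	const infraAPI = "infrastructure.cluster.x-k8s.io/v1beta1"
	cp := Ref{APIVersion: "controlplane.cluster.x-k8s.io/v1beta1", Kind: "KubeadmControlPlaneTemplate"}
	hcloudPatch := Selector{APIVersion: infraAPI, Kind: "HCloudMachineTemplate", ControlPlane: true}

	// spec.controlPlane still points at HCloud: the patch selector is accepted.
	fmt.Println(checkControlPlaneSelector(hcloudPatch, cp, Ref{APIVersion: infraAPI, Kind: "HCloudMachineTemplate"}))
	// spec.controlPlane switched to bare metal: the same selector is rejected.
	fmt.Println(checkControlPlaneSelector(hcloudPatch, cp, Ref{APIVersion: infraAPI, Kind: "HetznerBareMetalMachineTemplate"}))
}
```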

Solution
This PR introduces multiple control plane classes, mirroring how workers already support multiple MachineDeploymentClass entries. Different machine infrastructure types can then coexist in one ClusterClass, each targeted by its own patches.

Key changes:

  • ClusterClass now supports multiple classes for the control plane, similar to workers.

  • Cluster topology now includes a "class" field for the control plane, which references one of the control plane classes defined in the ClusterClass. If class is empty, we fall back to the old way, i.e. clusterclass.spec.controlPlane is used.

  • clusterclass.spec.controlPlane (old way) stays populated with the HCloud ref.

  • clusterclass.spec.controlPlaneClasses (new list) is populated with both hcloud and baremetal as classes.

  • Patches that apply to the old way keep using controlPlane: true.

  • Patches that are infrastructure-specific use controlPlaneClass.names.
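
The fallback described in the list above can be sketched as follows. This is a simplified illustration, not the code in this PR: the types `ControlPlaneClass` and `ClusterClassSpec` are hypothetical reductions of the real API types, the function name `resolveClass` is made up, and returning an error for an unknown class name is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// ControlPlaneClass is a simplified, hypothetical stand-in for the real type.
type ControlPlaneClass struct {
	Class     string // empty for the inline spec.controlPlane definition
	InfraKind string // kind of the machineInfrastructure ref
}

// ClusterClassSpec models only the two fields discussed above.
type ClusterClassSpec struct {
	ControlPlane        ControlPlaneClass   // old way: spec.controlPlane
	ControlPlaneClasses []ControlPlaneClass // new way: spec.controlPlaneClasses
}

var errClassNotFound = errors.New("control plane class not found")

// resolveClass returns the inline definition when no class is requested,
// otherwise the entry of ControlPlaneClasses with the matching name.
func resolveClass(spec ClusterClassSpec, class string) (ControlPlaneClass, error) {
	if class == "" {
		return spec.ControlPlane, nil
	}
	for _, c := range spec.ControlPlaneClasses {
		if c.Class == class {
			return c, nil
		}
	}
	// Assumption: an unknown class name is treated as an error.
	return ControlPlaneClass{}, fmt.Errorf("%w: %q", errClassNotFound, class)
}

func main() {
	spec := ClusterClassSpec{
		ControlPlane: ControlPlaneClass{InfraKind: "HCloudMachineTemplate"},
		ControlPlaneClasses: []ControlPlaneClass{
			{Class: "hcloud", InfraKind: "HCloudMachineTemplate"},
			{Class: "baremetal", InfraKind: "HetznerBareMetalMachineTemplate"},
		},
	}
	for _, name := range []string{"", "baremetal", "unknown"} {
		c, err := resolveClass(spec, name)
		fmt.Printf("class=%q kind=%q err=%v\n", name, c.InfraKind, err)
	}
}
```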

Example ClusterClass:

spec:
  # Old way
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: my-control-plane
    machineInfrastructure:
      ref:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: HCloudMachineTemplate
        name: my-machinetemplate-hcloud

  # New way
  controlPlaneClasses:
    - class: hcloud
      ref:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        name: my-control-plane
      machineInfrastructure:
        ref:
          apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
          kind: HCloudMachineTemplate
          name: my-machinetemplate-hcloud
    - class: baremetal
      ref:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        name: my-control-plane
      machineInfrastructure:
        ref:
          apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
          kind: HetznerBareMetalMachineTemplate
          name: my-machinetemplate-baremetal

Example of Patches:

- name: KubeadmControlPlaneTemplateOIDC
  enabledIf: '{{ if not (empty .oidcIssuerUrl) }}true{{ end }}'
  definitions:
    - selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true  # for backward compatibility
      jsonPatches:
    - selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlaneClass:
            names:
              - "*"  # applies the patch to all classes
      jsonPatches:

For an infrastructure-specific patch (i.e. a patch that targets only one of our classes):

- name: HetznerBareMetalMachineTemplateControlPlaneImage
  definitions:
    - selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: HetznerBareMetalMachineTemplate
        matchResources:
          controlPlaneClass:         #  new way
            names:
              - baremetal
      jsonPatches:
        - op: replace
          path: "/spec/template/spec/installImage/image"
          value:
            name: "Ubuntu-2404-noble-amd64-v1.31.4"
            path: ""
            url: "https://example.com/image.tar.gz"
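
How a controlPlaneClass.names selector might be evaluated can be sketched as below. The function `classSelectorMatches` is hypothetical and covers only exact names and the "*" wildcard used in the examples above. It also assumes that the inline spec.controlPlane definition (empty class) is never matched by a controlPlaneClass selector, which is why the OIDC example keeps a separate controlPlane: true definition.

```go
package main

import "fmt"

// classSelectorMatches reports whether a controlPlaneClass.names list
// selects the given control plane class.
func classSelectorMatches(names []string, class string) bool {
	// Assumption: the inline spec.controlPlane (no class) is only reachable
	// through controlPlane: true, never through controlPlaneClass.names.
	if class == "" {
		return false
	}
	for _, n := range names {
		if n == "*" || n == class {
			return true
		}
	}
	return false
}

func main() {
	for _, class := range []string{"hcloud", "baremetal", ""} {
		fmt.Printf("class=%q names=[baremetal]:%v names=[*]:%v\n",
			class,
			classSelectorMatches([]string{"baremetal"}, class),
			classSelectorMatches([]string{"*"}, class),
		)
	}
}
```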

For Clusters:

If topology.controlPlane.class is not set, the cluster uses the inline clusterclass.spec.controlPlane definition, i.e. hcloud by default.

Otherwise, specify one of the classes from clusterclass.spec.controlPlaneClasses.

spec:
  topology:
    controlPlane:
      class: baremetal    #  pick the baremetal class
      replicas: 3
      variables:
        overrides:
          - name: controlPlaneHostSelectorBareMetal
            value:
              matchLabels:
                name: bm-1

For the baremetal class, matchLabels is also required to select a particular HetznerBareMetalHost (hbmh).

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

- ClusterClass now supports multiple classes for control-plane - similar
to workers.
- Cluster topology now includes a field "class" for control-plane which
references a control plane class.

Signed-off-by: Dhairya Arora <dhairya.arora@syself.com>
// ClusterClass.spec.controlPlane.classes.
// Otherwise, the inline ClusterClass.spec.controlPlane is used.
func resolveControlPlaneClass(cluster *clusterv1.Cluster, clusterClass *clusterv1.ClusterClass) (*clusterv1.ControlPlaneClass, error) {
// If the topology doesn't specify a class, use the inline definition.

How can we specify a default? We need to ensure that every cluster using a newer clusterclass/clusterstack uses a default without having to specify one in the Cluster object.


guettli commented Feb 25, 2026

@Dhairya-Arora01 please add syself to every part we modify, so that it is easier to understand.

Dhairya-Arora01 and others added 9 commits February 26, 2026 16:37
In order to fix the error on applying the updated CRDs

```
failed to patch provider object: CustomResourceDefinition.apiextensions.k8s.io \"clusterclasses.cluster.x-k8s.io\" is invalid: spec.versions[1].schema.openAPIV3Schema.properties[spec].properties[controlPlaneClasses].items.properties[class].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property

```

Signed-off-by: Dhairya Arora <dhairya.arora@syself.com>
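
The error above says that a field used as a list-map key needs a default or must be required. One possible way to satisfy this with kubebuilder markers is sketched below. The type names are hypothetical and the actual fix in this commit may differ (for example, it could set a default instead); only the field names class and controlPlaneClasses come from the error message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ControlPlaneClassEntry is a hypothetical, reduced list item type.
type ControlPlaneClassEntry struct {
	// Class is the list-map key, so it is marked required and is
	// serialized without omitempty.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	Class string `json:"class"`
}

// ClusterClassSpecFragment shows only the list that uses Class as its key.
type ClusterClassSpecFragment struct {
	// +optional
	// +listType=map
	// +listMapKey=class
	ControlPlaneClasses []ControlPlaneClassEntry `json:"controlPlaneClasses,omitempty"`
}

// render serializes the fragment to JSON, showing that the key is always emitted.
func render(spec ClusterClassSpecFragment) string {
	out, err := json.Marshal(spec)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(render(ClusterClassSpecFragment{
		ControlPlaneClasses: []ControlPlaneClassEntry{{Class: "hcloud"}, {Class: "baremetal"}},
	}))
}
```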
"currentKind", s.Current.ControlPlane.InfrastructureMachineTemplate.GetKind(),
"desiredKind", s.Desired.ControlPlane.InfrastructureMachineTemplate.GetKind(),
)
currentCPInfraMachineTemplate = nil

Can you add a comment explaining why you set it to nil? This is not clear to me.

Author

added
