migrations: add migrations for prefer-closest-numa-nodes and max-allowable-numa-nodes #4778

Open
piyush-jena wants to merge 3 commits into bottlerocket-os:develop from piyush-jena:add-k8s-settings

piyush-jena commented Mar 3, 2026

Issue number:

Closes #4750

Description of changes:
Add settings and migrations for two kubelet topology manager policy options:

  1. max-allowable-numa-nodes (GA in Kubernetes 1.35+)
  2. prefer-closest-numa-nodes (GA in Kubernetes 1.32+)
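
The GA versions listed above can be written down as a small lookup. This is only an illustrative sketch: the `GA_SINCE` table and the `ga_options` function are hypothetical names, not part of Bottlerocket or the settings SDK, and the sketch does not claim that Bottlerocket gates these settings by Kubernetes version.

```rust
/// Policy options from the description, paired with the Kubernetes 1.x
/// minor version at which the description lists them as GA.
/// (Hypothetical table for illustration only.)
const GA_SINCE: &[(&str, u32)] = &[
    ("max-allowable-numa-nodes", 35),
    ("prefer-closest-numa-nodes", 32),
];

/// Returns the options listed as GA for the given Kubernetes 1.x minor version.
fn ga_options(minor: u32) -> Vec<&'static str> {
    GA_SINCE
        .iter()
        .filter(|(_, since)| minor >= *since)
        .map(|(name, _)| *name)
        .collect()
}
```

For example, `ga_options(32)` would contain only `prefer-closest-numa-nodes`, while `ga_options(35)` would contain both options.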

Testing done:
Migration testing:

  1. Before upgrade
[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "36151b8b",
    "pretty_name": "Bottlerocket OS 1.56.0 (aws-k8s-1.35)",
    "variant_id": "aws-k8s-1.35",
    "version_id": "1.56.0"
  }
}
[ssm-user@control]$ apiclient set \
  kubernetes.cpu-manager-policy=static \
  kubernetes.topology-manager-policy="best-effort" \
  kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes="true"
Failed to change settings: Failed PATCH request to '/settings/keypair?tx=apiclient-set-KXBvywfwgeVYcZdS': Status 400 when PATCHing /settings/keypair?tx=apiclient-set-KXBvywfwgeVYcZdS: Unable to match your input to the data model.  We may not have enough type information.  Please try the --json input form.  Cause: Error during deserialization: unknown field `topology-manager-policy-options`, expected one of `cluster-name`, `cluster-certificate`, `api-server`, `node-labels`, `node-taints`, `static-pods`, `authentication-mode`, `bootstrap-token`, `standalone-mode`, `eviction-hard`, `eviction-soft`, `eviction-soft-grace-period`, `eviction-max-pod-grace-period`, `kube-reserved`, `system-reserved`, `allowed-unsafe-sysctls`, `server-tls-bootstrap`, `cloud-provider`, `registry-qps`, `registry-burst`, `event-qps`, `event-burst`, `kube-api-qps`, `kube-api-burst`, `container-log-max-size`, `container-log-max-files`, `container-log-max-workers`, `container-log-monitor-interval`, `cpu-cfs-quota-enforced`, `cpu-manager-policy`, `cpu-manager-reconcile-period`, `cpu-manager-policy-options`, `topology-manager-scope`, `topology-manager-policy`, `pod-pids-limit`, `image-gc-high-threshold-percent`, `image-gc-low-threshold-percent`, `image-minimum-gc-age`, `image-maximum-gc-age`, `provider-id`, `log-level`, `credential-providers`, `server-certificate`, `server-key`, `shutdown-grace-period`, `shutdown-grace-period-for-critical-pods`, `memory-manager-reserved-memory`, `memory-manager-policy`, `reserved-cpus`, `memory-swap-behavior`, `hostname-override-source`, `seccomp-default`, `device-ownership-from-security-context`, `single-process-oom-kill`, `static-pods-enabled`, `max-pods`, `cluster-dns-ip`, `cluster-domain`, `node-ip`, `pod-infra-container-image`, `hostname-override`, `ids-per-pod`, `max-parallel-image-pulls` at line 1 column 118

bash-5.2# updog check-update -a --json
[
  {
    "variant": "aws-k8s-1.35",
    "arch": "x86_64",
    "version": "1.57.0",
    "max_version": "1.57.0",
    "waves": {
      "0": "2026-03-09T23:16:35.592575499Z",
      "20": "2026-03-10T02:16:35.592575499Z",
      "102": "2026-03-10T22:16:35.592575499Z",
      "307": "2026-03-11T22:16:35.592575499Z",
      "819": "2026-03-13T22:16:35.592575499Z",
      "1228": "2026-03-14T22:16:35.592575499Z",
      "1843": "2026-03-15T22:16:35.592575499Z"
    },
    "images": {
      "boot": "bottlerocket-aws-k8s-1.35-x86_64-1.57.0-54e01036-boot.ext4.lz4",
      "root": "bottlerocket-aws-k8s-1.35-x86_64-1.57.0-54e01036-root.ext4.lz4",
      "hash": "bottlerocket-aws-k8s-1.35-x86_64-1.57.0-54e01036-root.verity.lz4"
    }
  }
]

  2. After upgrading to v1.57.0
[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "54e01036",
    "pretty_name": "Bottlerocket OS 1.57.0 (aws-k8s-1.35)",
    "variant_id": "aws-k8s-1.35",
    "version_id": "1.57.0"
  }
}
[ssm-user@control]$ apiclient set \
  kubernetes.cpu-manager-policy=static \
  kubernetes.topology-manager-policy="best-effort" \
  kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes="true"
[ssm-user@control]$ apiclient get settings.kubernetes
{
  "settings": {
    "kubernetes": {
      "authentication-mode": "aws",
      "cloud-provider": "external",
      "cluster-dns-ip": "10.100.0.10",
      "cluster-domain": "cluster.local",
      "cpu-manager-policy": "static",
      "credential-providers": {
        "ecr-credential-provider": {
          "cache-duration": "12h",
          "enabled": true,
          "image-patterns": [
            "*.dkr.ecr.*.amazonaws.com",
            "*.dkr.ecr.*.amazonaws.com.cn",
            "*.dkr.ecr.*.amazonaws.eu",
            "*.dkr-ecr.*.on.aws",
            "*.dkr-ecr.*.on.amazonwebservices.com.cn",
            "*.dkr.ecr-fips.*.amazonaws.com",
            "*.dkr.ecr-fips.*.amazonaws.eu",
            "*.dkr.ecr.*.cloud.adc-e.uk",
            "*.dkr.ecr-fips.*.cloud.adc-e.uk",
            "*.dkr.ecr.*.c2s.ic.gov",
            "*.dkr.ecr-fips.*.c2s.ic.gov",
            "*.dkr.ecr.*.sc2s.sgov.gov",
            "*.dkr.ecr-fips.*.sc2s.sgov.gov",
            "*.dkr.ecr.*.csp.hci.ic.gov",
            "*.dkr.ecr-fips.*.csp.hci.ic.gov",
            "public.ecr.aws"
          ]
        }
      },
      "device-ownership-from-security-context": true,
      "hostname-override": "ip-172-31-10-220.us-west-2.compute.internal",
      "hostname-override-source": "private-dns-name",
      "max-pods": 29,
      "node-ip": "172.31.10.220",
      "provider-id": "aws:///us-west-2c/i-0fce061d5c684b9a8",
      "seccomp-default": true,
      "server-tls-bootstrap": true,
      "shutdown-grace-period": "150s",
      "shutdown-grace-period-for-critical-pods": "30s",
      "standalone-mode": false,
      "topology-manager-policy": "best-effort",
      "topology-manager-policy-options": {
        "prefer-closest-numa-nodes": true
      }
    }
  }
}

bash-5.2# cat /etc/kubernetes/kubelet/config
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/pki/ca.crt"
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
clusterDomain: cluster.local
clusterDNS:
- 10.100.0.10
kubeReserved:
  cpu: "70m"
  memory: "574Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/runtime"
cpuCFSQuota: true
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
topologyManagerPolicyOptions:
  prefer-closest-numa-nodes: "true"
podPidsLimit: 1048576
providerID: aws:///us-west-2c/i-0fce061d5c684b9a8
resolvConf: "/run/netdog/resolv.conf"
hairpinMode: hairpin-veth
readOnlyPort: 0
cgroupDriver: systemd
cgroupRoot: "/"
runtimeRequestTimeout: 15m
protectKernelDefaults: true
serializeImagePulls: false
seccompDefault: true
serverTLSBootstrap: true
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
volumePluginDir: "/var/lib/kubelet/plugins/volume/exec"
maxPods: 29
staticPodPath: "/etc/kubernetes/static-pods/"
shutdownGracePeriod: 150s
shutdownGracePeriodCriticalPods: 30s
failSwapOn: false
failCgroupV1: false
featureGates:
  DynamicResourceAllocation: true
  MutableCSINodeAllocatableCount: true
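
One detail worth noting in the output above: the API reports `"prefer-closest-numa-nodes": true` as a JSON boolean, while the rendered kubelet config carries the quoted string `"true"`, since `topologyManagerPolicyOptions` is a string-to-string map in the kubelet configuration. The following is a minimal sketch of that rendering step, not the actual Bottlerocket template. `OptionValue` and `render_policy_options` are hypothetical names, and treating `max-allowable-numa-nodes` as an integer setting is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical value type for a policy option as stored in settings.
enum OptionValue {
    Bool(bool),
    Int(u32), // assumed type for max-allowable-numa-nodes
}

/// Renders the options as a `topologyManagerPolicyOptions` YAML fragment.
/// Every value is quoted so that kubelet reads it as a string.
fn render_policy_options(options: &BTreeMap<&str, OptionValue>) -> String {
    if options.is_empty() {
        // With no options set, the key is omitted from the config entirely.
        return String::new();
    }
    let mut out = String::from("topologyManagerPolicyOptions:\n");
    for (name, value) in options {
        let rendered = match value {
            OptionValue::Bool(b) => b.to_string(),
            OptionValue::Int(n) => n.to_string(),
        };
        out.push_str(&format!("  {name}: \"{rendered}\"\n"));
    }
    out
}
```

With only `prefer-closest-numa-nodes` set to `true`, this sketch produces the same two lines that appear in the kubelet config above.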

  3. After downgrading back to v1.56.0
[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "36151b8b",
    "pretty_name": "Bottlerocket OS 1.56.0 (aws-k8s-1.35)",
    "variant_id": "aws-k8s-1.35",
    "version_id": "1.56.0"
  }
}
[ssm-user@control]$ apiclient get settings.kubernetes
{
  "settings": {
    "kubernetes": {
      "authentication-mode": "aws",
      "cloud-provider": "external",
      "cluster-dns-ip": "10.100.0.10",
      "cluster-domain": "cluster.local",
      "cpu-manager-policy": "static",
      "credential-providers": {
        "ecr-credential-provider": {
          "cache-duration": "12h",
          "enabled": true,
          "image-patterns": [
            "*.dkr.ecr.*.amazonaws.com",
            "*.dkr.ecr.*.amazonaws.com.cn",
            "*.dkr.ecr.*.amazonaws.eu",
            "*.dkr-ecr.*.on.aws",
            "*.dkr-ecr.*.on.amazonwebservices.com.cn",
            "*.dkr.ecr-fips.*.amazonaws.com",
            "*.dkr.ecr-fips.*.amazonaws.eu",
            "*.dkr.ecr.*.cloud.adc-e.uk",
            "*.dkr.ecr-fips.*.cloud.adc-e.uk",
            "*.dkr.ecr.*.c2s.ic.gov",
            "*.dkr.ecr-fips.*.c2s.ic.gov",
            "*.dkr.ecr.*.sc2s.sgov.gov",
            "*.dkr.ecr-fips.*.sc2s.sgov.gov",
            "*.dkr.ecr.*.csp.hci.ic.gov",
            "*.dkr.ecr-fips.*.csp.hci.ic.gov",
            "public.ecr.aws"
          ]
        }
      },
      "device-ownership-from-security-context": true,
      "hostname-override": "ip-172-31-10-220.us-west-2.compute.internal",
      "hostname-override-source": "private-dns-name",
      "max-pods": 29,
      "node-ip": "172.31.10.220",
      "provider-id": "aws:///us-west-2c/i-0fce061d5c684b9a8",
      "seccomp-default": true,
      "server-tls-bootstrap": true,
      "shutdown-grace-period": "150s",
      "shutdown-grace-period-for-critical-pods": "30s",
      "standalone-mode": false,
      "topology-manager-policy": "best-effort"
    }
  }
}

bash-5.2# cat /etc/kubernetes/kubelet/config
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/pki/ca.crt"
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
clusterDomain: cluster.local
clusterDNS:
- 10.100.0.10
kubeReserved:
  cpu: "70m"
  memory: "574Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/runtime"
cpuCFSQuota: true
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
podPidsLimit: 1048576
providerID: aws:///us-west-2c/i-0fce061d5c684b9a8
resolvConf: "/run/netdog/resolv.conf"
hairpinMode: hairpin-veth
readOnlyPort: 0
cgroupDriver: systemd
cgroupRoot: "/"
runtimeRequestTimeout: 15m
protectKernelDefaults: true
serializeImagePulls: false
seccompDefault: true
serverTLSBootstrap: true
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
volumePluginDir: "/var/lib/kubelet/plugins/volume/exec"
maxPods: 29
staticPodPath: "/etc/kubernetes/static-pods/"
shutdownGracePeriod: 150s
shutdownGracePeriodCriticalPods: 30s
failSwapOn: false
failCgroupV1: false
featureGates:
  DynamicResourceAllocation: true
  MutableCSINodeAllocatableCount: true

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

Update all five settings-sdk dependency blocks to use the new
v0.22.0 tag, which includes topology-manager-policy-options.

Add AddSettingsMigration for:
- settings.kubernetes.topology-manager-policy-options
- settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes
- settings.kubernetes.topology-manager-policy-options.max-allowable-numa-nodes
Signed-off-by: Piyush Jena <jepiyush@amazon.com>
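
The testing above is consistent with an add-settings migration that does nothing on upgrade and removes the new keys on downgrade. The sketch below models that behavior on a flat key/value map. It is a simplified stand-in, not the real `AddSettingsMigration` helper: the `forward` and `backward` functions and the flat `BTreeMap<String, String>` datastore are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Settings introduced by this change, as listed in the commit message.
const ADDED_SETTINGS: &[&str] = &[
    "settings.kubernetes.topology-manager-policy-options",
    "settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes",
    "settings.kubernetes.topology-manager-policy-options.max-allowable-numa-nodes",
];

/// Upgrade direction: the new version understands the keys, so data is unchanged.
fn forward(data: BTreeMap<String, String>) -> BTreeMap<String, String> {
    data
}

/// Downgrade direction: drop each added key, plus anything nested under it,
/// so the older version never sees a setting it cannot deserialize.
fn backward(mut data: BTreeMap<String, String>) -> BTreeMap<String, String> {
    data.retain(|key, _| {
        !ADDED_SETTINGS.iter().any(|added| {
            key.as_str() == *added || key.starts_with(&format!("{added}."))
        })
    });
    data
}
```

In this model, `settings.kubernetes.topology-manager-policy` survives a downgrade because it is not one of the added keys and is not nested under one, which matches the v1.56.0 output after downgrading.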

Development

Successfully merging this pull request may close these issues.

Implement Bottlerocket support for kubelet "Topology manager policy options"
