- About the Kernel Module Management Operator
- Installing the Kernel Module Management Operator
- Installing the Kernel Module Management Operator using the web console
- Installing the Kernel Module Management Operator by using the CLI
- Installing the Kernel Module Management Operator on earlier versions of OpenShift Container Platform
- Configuring the Kernel Module Management Operator
- Unloading the kernel module
- Setting the kernel firmware search path
- Uninstalling the Kernel Module Management Operator
- Uninstalling a Red Hat catalog installation
- Uninstalling a CLI installation
- Kernel module deployment
- The Module custom resource definition
- Set soft dependencies between kernel modules
- Security and permissions
- ServiceAccounts and SecurityContextConstraints
- Pod security standards
- Replacing in-tree modules with out-of-tree modules
- Example Module CR
- Symbolic links for in-tree dependencies
- Creating a kmod image
- Running depmod
- Building in the cluster
- Using the Driver Toolkit
- Using signing with Kernel Module Management (KMM)
- Adding the keys for secureboot
- Checking the keys
- Signing kmods in a pre-built image
- Building and signing a kmod image
- KMM hub and spoke
- KMM-Hub
- Installing KMM-Hub
- Using the ManagedClusterModule CRD
- Running KMM on the spoke
- Customizing upgrades for kernel modules
- Day 1 kernel module loading
- Day 1 supported use cases
- OOT kernel module loading flow
- The kernel module image
- In-tree module replacement
- MCO yaml creation
- The MachineConfigPool
- Debugging and troubleshooting
- KMM firmware support
- Configuring the lookup path on nodes
- Building a kmod image
- Tuning the Module resource
- Day 0 through Day 2 kmod installation
- Layering background
- Lifecycle management
- Troubleshooting KMM
- Reading Operator logs
- Observing events
- Using the must-gather tool
Learn about the Kernel Module Management (KMM) Operator and how you can use it to deploy out-of-tree kernel modules and device plugins on OpenShift Container Platform clusters.
About the Kernel Module Management Operator
The Kernel Module Management (KMM) Operator manages, builds, signs, and deploys out-of-tree kernel modules and device plugins on OpenShift Container Platform clusters.
KMM adds a new Module CRD, which describes an out-of-tree kernel module and its associated device plugin. You can use Module resources to configure how to load the module, define ModuleLoader images for kernel versions, and include instructions for building and signing modules for specific kernel versions.
KMM is designed to accommodate multiple kernel versions at once for any kernel module, allowing for seamless node upgrades and reduced application downtime.
Installing the Kernel Module Management Operator
As a cluster administrator, you can install the Kernel Module Management (KMM) Operator by using the OpenShift CLI or the web console.
The KMM Operator is supported on OpenShift Container Platform 4.12 and later. Installing KMM on version 4.11 does not require specific additional steps. For details on installing KMM on version 4.10 and earlier, see the section "Installing the Kernel Module Management Operator on earlier versions of OpenShift Container Platform".
Installing the Kernel Module Management Operator using the web console
As a cluster administrator, you can install the Kernel Module Management (KMM) Operator using the OpenShift Container Platform web console.
Procedure
Log in to the OpenShift Container Platform web console.
Install the Kernel Module Management Operator:
In the OpenShift Container Platform web console, click Operators → OperatorHub.
Select Kernel Module Management Operator from the list of available Operators, and then click Install.
From the Installed Namespace list, select the openshift-kmm namespace. Click Install.
Verification
To verify that the KMM Operator installed successfully:
Navigate to the Operators → Installed Operators page.
Ensure that Kernel Module Management Operator is listed in the openshift-kmm project with a Status of InstallSucceeded.
During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
Troubleshooting
To troubleshoot issues with Operator installation:
Navigate to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
Navigate to the Workloads → Pods page and check the logs for pods in the openshift-kmm project.
Installing the Kernel Module Management Operator by using the CLI
As a cluster administrator, you can install the Kernel Module Management (KMM) Operator by using the OpenShift CLI.
Prerequisites
You have a running OpenShift Container Platform cluster.
You installed the OpenShift CLI (oc).
You are logged into the OpenShift CLI as a user with cluster-admin privileges.
Procedure
Install KMM in the openshift-kmm namespace:

Create the following Namespace CR and save the YAML file, for example, kmm-namespace.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kmm

Create the following OperatorGroup CR and save the YAML file, for example, kmm-op-group.yaml:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kernel-module-management
  namespace: openshift-kmm

Create the following Subscription CR and save the YAML file, for example, kmm-sub.yaml:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: kernel-module-management
  namespace: openshift-kmm
spec:
  channel: release-1.0
  installPlanApproval: Automatic
  name: kernel-module-management
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: kernel-module-management.v1.0.0
Create the subscription object by running the following command:
$ oc create -f kmm-sub.yaml
Verification
To verify that the Operator deployment is successful, run the following command:
$ oc get -n openshift-kmm deployments.apps kmm-operator-controller
Example output
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
kmm-operator-controller   1/1     1            1           97s
The Operator is available.
Installing the Kernel Module Management Operator on earlier versions of OpenShift Container Platform
The KMM Operator is supported on OpenShift Container Platform 4.12 and later. For version 4.10 and earlier, you must create a new SecurityContextConstraint object and bind it to the Operator's ServiceAccount. As a cluster administrator, you can install the Kernel Module Management (KMM) Operator by using the OpenShift CLI.
Prerequisites
You have a running OpenShift Container Platform cluster.
You installed the OpenShift CLI (oc).
You are logged into the OpenShift CLI as a user with cluster-admin privileges.
Procedure
Install KMM in the openshift-kmm namespace:

Create the following Namespace CR and save the YAML file, for example, kmm-namespace.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kmm

Create the following SecurityContextConstraint object and save the YAML file, for example, kmm-security-constraint.yaml:

allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: false
allowPrivilegedContainer: false
allowedCapabilities:
  - NET_BIND_SERVICE
apiVersion: security.openshift.io/v1
defaultAddCapabilities: null
fsGroup:
  type: MustRunAs
groups: []
kind: SecurityContextConstraints
metadata:
  name: kmm-security-constraint
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
  - ALL
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
seccompProfiles:
  - runtime/default
supplementalGroups:
  type: RunAsAny
users: []
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - persistentVolumeClaim
  - projected
  - secret
Bind the SecurityContextConstraint object to the Operator's ServiceAccount by running the following commands:

$ oc apply -f kmm-security-constraint.yaml
$ oc adm policy add-scc-to-user kmm-security-constraint -z kmm-operator-controller -n openshift-kmm
Create the following OperatorGroup CR and save the YAML file, for example, kmm-op-group.yaml:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kernel-module-management
  namespace: openshift-kmm

Create the following Subscription CR and save the YAML file, for example, kmm-sub.yaml:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: kernel-module-management
  namespace: openshift-kmm
spec:
  channel: release-1.0
  installPlanApproval: Automatic
  name: kernel-module-management
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: kernel-module-management.v1.0.0
Create the subscription object by running the following command:
$ oc create -f kmm-sub.yaml
Verification
To verify that the Operator deployment is successful, run the following command:
$ oc get -n openshift-kmm deployments.apps kmm-operator-controller
Example output
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
kmm-operator-controller   1/1     1            1           97s
The Operator is available.
Configuring the Kernel Module Management Operator
In most cases, the default configuration for the Kernel Module Management (KMM) Operator does not need to be modified. However, you can modify the Operator settings to suit your environment using the following procedure.
The Operator configuration is set in the kmm-operator-manager-config ConfigMap in the Operator namespace.
Procedure
To modify the settings, edit the ConfigMap data by entering the following command:

$ oc edit configmap -n "$namespace" kmm-operator-manager-config
Example output
healthProbeBindAddress: :8081
job:
  gcDelay: 1h
leaderElection:
  enabled: true
  resourceID: kmm.sigs.x-k8s.io
webhook:
  disableHTTP2: true  # CVE-2023-44487
  port: 9443
metrics:
  enableAuthnAuthz: true
  disableHTTP2: true  # CVE-2023-44487
  bindAddress: 0.0.0.0:8443
  secureServing: true
worker:
  runAsUser: 0
  seLinuxType: spc_t
  setFirmwareClassPath: /var/lib/firmware
Table 1. Operator configuration parameters

- healthProbeBindAddress: Defines the address on which the Operator monitors for kubelet health probes. The recommended value is :8081.
- job.gcDelay: Defines the duration for which successful build pods are preserved before they are deleted. There is no recommended value for this setting. For information about the valid values for this setting, see ParseDuration.
- leaderElection.enabled: Determines whether leader election is used to ensure that only one replica of the KMM Operator is running at any time. For more information, see Leases. The recommended value is true.
- leaderElection.resourceID: Determines the name of the resource that leader election uses for holding the leader lock. The recommended value is kmm.sigs.x-k8s.io.
- webhook.disableHTTP2: If true, disables HTTP/2 for the webhook server, as a mitigation for CVE-2023-44487. The recommended value is true.
- webhook.port: Defines the port on which the Operator monitors webhook requests. The recommended value is 9443.
- metrics.enableAuthnAuthz: Determines if metrics are authenticated using TokenReviews and authorized using SubjectAccessReviews with the kube-apiserver. For authentication and authorization, the controller needs a ClusterRole with the following rules:
  - apiGroups: authentication.k8s.io, resources: tokenreviews, verbs: create
  - apiGroups: authorization.k8s.io, resources: subjectaccessreviews, verbs: create
  To scrape metrics, for example, using Prometheus, the client needs a ClusterRole with the following rule:
  - nonResourceURLs: "/metrics", verbs: get
  The recommended value is true.
- metrics.disableHTTP2: If true, disables HTTP/2 for the metrics server, as a mitigation for CVE-2023-44487. The recommended value is true.
- metrics.bindAddress: Determines the bind address for the metrics server. If unspecified, the default is :8080. To disable the metrics server, set the value to 0. The recommended value is 0.0.0.0:8443.
- metrics.secureServing: Determines whether the metrics are served over HTTPS instead of HTTP. The recommended value is true.
- worker.runAsUser: Determines the value of the runAsUser field of the worker container's security context. For more information, see SecurityContext. The recommended value is 0, as shown in the example configuration.
- worker.seLinuxType: Determines the value of the seLinuxOptions.type field of the worker container's security context. For more information, see SecurityContext. The recommended value is spc_t.
- worker.setFirmwareClassPath: Sets the kernel's firmware search path into the /sys/module/firmware_class/parameters/path file on the node. The recommended value is /var/lib/firmware if you need to set that value through the worker app. Otherwise, leave it unset.

After modifying the settings, restart the controller with the following command:
$ oc delete pod -n "<namespace>" -l app.kubernetes.io/component=kmm
The value of <namespace> depends on your original installation method.
Additional resources
For more information, see Installing the Kernel Module Management Operator.
Unloading the kernel module
You must unload a kernel module when moving to a newer version of the module or if the module introduces an undesirable side effect on the node.
Procedure
To unload a module loaded with KMM from nodes, delete the corresponding Module resource. KMM then creates worker pods, where required, to run modprobe -r and unload the kernel module from the nodes.

When creating unloading worker pods, KMM needs all the resources it used when loading the kernel module. This includes the ServiceAccount referenced in the Module, as well as any RBAC defined to allow privileged KMM worker pods to run. It also includes any pull secret referenced in .spec.imageRepoSecret.

To avoid situations where KMM is unable to unload the kernel module from nodes, make sure those resources are not deleted while the Module resource is still present in the cluster in any state, including Terminating. KMM includes a validating admission webhook that rejects the deletion of namespaces that contain at least one Module resource.
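For example, for a hypothetical Module named my-kmod in the openshift-kmm namespace, deleting the resource triggers the unload on all targeted nodes:

$ oc delete module my-kmod -n openshift-kmm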
Setting the kernel firmware search path
The Linux kernel accepts the firmware_class.path parameter as a search path for firmware, as explained in Firmware search paths. KMM worker pods can set this value on nodes by writing to sysfs before attempting to load kmods.
Procedure
To define a firmware search path, set worker.setFirmwareClassPath to /var/lib/firmware in the Operator configuration.
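For reference, this is a minimal sketch of the relevant portion of the kmm-operator-manager-config ConfigMap data, matching the worker section of the configuration example in "Configuring the Kernel Module Management Operator":

worker:
  setFirmwareClassPath: /var/lib/firmware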
Additional resources
For more information about the worker.setFirmwareClassPath path, see Configuring the Kernel Module Management Operator.
Uninstalling the Kernel Module Management Operator
Use one of the following procedures to uninstall the Kernel Module Management (KMM) Operator, depending on how the KMM Operator was installed.
Uninstalling a Red Hat catalog installation
Use this procedure if KMM was installed from the Red Hat catalog.
Procedure
Use the following method to uninstall the KMM Operator:
Use the OpenShift console under Operators → Installed Operators to locate and uninstall the Operator.
Alternatively, you can delete the Subscription resource from the KMM namespace.
Uninstalling a CLI installation
Use this command if the KMM Operator was installed using the OpenShift CLI.
Procedure
Run the following command to uninstall the KMM Operator:
$ oc delete -k https://github.com/rh-ecosystem-edge/kernel-module-management/config/default
Using this command deletes the Module CRD and all Module instances in the cluster.
Kernel module deployment
Kernel Module Management (KMM) monitors Node and Module resources in the cluster to determine if a kernel module should be loaded on or unloaded from a node.
To be eligible for a module, a node must contain the following:
- Labels that match the module's .spec.selector field.
- A kernel version matching one of the items in the module's .spec.moduleLoader.container.kernelMappings field.
- If ordered upgrade is configured in the module, a label that matches its .spec.moduleLoader.container.version field.
When KMM reconciles nodes with the desired state as configured in the Module resource, it creates worker pods on the target nodes to run the necessary action. The KMM Operator monitors the outcome of the pods and records the information. The Operator uses this information to label the Node objects when the module is successfully loaded, and to run the device plugin, if configured.
Worker pods run the KMM worker binary that performs the following tasks:

- Pulls the kmod image configured in the Module resource. Kmod images are standard OCI images that contain .ko files.
- Extracts the image in the pod's filesystem.
- Runs modprobe with the specified arguments to perform the necessary action.
The Module custom resource definition
The Module custom resource definition (CRD) represents a kernel module that can be loaded on all or select nodes in the cluster, through a kmod image. A Module custom resource (CR) specifies one or more kernel versions with which it is compatible, and a node selector.

The compatible versions for a Module resource are listed under .spec.moduleLoader.container.kernelMappings. A kernel mapping can either match a literal version, or use regexp to match many of them at the same time.
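For example, the following minimal sketch pairs one literal mapping with a regexp catch-all; the registry and image names are placeholders in the style of the annotated example later in this document:

kernelMappings:
  - literal: 6.0.15-300.fc37.x86_64
    containerImage: some.registry/org/my-kmod:6.0.15-300.fc37.x86_64
  - regexp: '^.+$'
    containerImage: "some.registry/org/my-kmod:${KERNEL_FULL_VERSION}"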
The reconciliation loop for the Module resource runs the following steps:

1. List all nodes matching .spec.selector.
2. Build a set of all kernel versions running on those nodes.
3. For each kernel version:
   - Go through .spec.moduleLoader.container.kernelMappings and find the appropriate container image name. If the kernel mapping has build or sign defined and the container image does not already exist, run the build, the signing pod, or both, as needed.
   - Create a worker pod to pull the container image determined in the previous step and run modprobe.
   - If .spec.devicePlugin is defined, create a device plugin daemon set using the configuration specified under .spec.devicePlugin.container.
4. Run garbage-collect on:
   - Obsolete device plugin DaemonSets that do not target any node.
   - Successful build pods.
   - Successful signing pods.
Set soft dependencies between kernel modules
Some configurations require that several kernel modules be loaded in a specific order to work properly, even though the modules do not directly depend on each other through symbols. These are called soft dependencies. depmod is usually not aware of these dependencies, and they do not appear in the files it produces. For example, if mod_a has a soft dependency on mod_b, modprobe mod_a will not load mod_b.

You can resolve these situations by declaring soft dependencies in the Module custom resource definition (CRD) using the modulesLoadingOrder field.
# ...
spec:
  moduleLoader:
    container:
      modprobe:
        moduleName: mod_a
        dirName: /opt
        firmwarePath: /firmware
        parameters:
          - param=1
        modulesLoadingOrder:
          - mod_a
          - mod_b
In the configuration above, the worker pod first tries to unload the in-tree mod_b before loading mod_a from the kmod image. When the worker pod is terminated and mod_a is unloaded, mod_b is not loaded again.
The first value in the list, to be loaded last, must be equivalent to the moduleName.
Security and permissions
Loading kernel modules is a highly sensitive operation. After they are loaded, kernel modules have all possible permissions to do any kind of operation on the node.
ServiceAccounts and SecurityContextConstraints
Kernel Module Management (KMM) creates a privileged workload to load the kernel modules on nodes. That workload needs ServiceAccounts allowed to use the privileged SecurityContextConstraint (SCC) resource.

The authorization model for that workload depends on the namespace of the Module resource, as well as its spec.
- If the .spec.moduleLoader.serviceAccountName or .spec.devicePlugin.serviceAccountName fields are set, they are always used.
- If those fields are not set, then:
  - If the Module resource is created in the Operator's namespace (openshift-kmm by default), then KMM uses its default, powerful ServiceAccounts to run the worker and device plugin pods.
  - If the Module resource is created in any other namespace, then KMM runs the pods with the namespace's default ServiceAccount. The Module resource cannot run a privileged workload unless you manually enable it to use the privileged SCC.
When setting up RBAC permissions, remember that any user or ServiceAccount creating a Module resource in the openshift-kmm namespace results in KMM automatically running privileged workloads on potentially all the nodes in the cluster.
To allow any ServiceAccount to use the privileged SCC and run worker or device plugin pods, you can use the oc adm policy command, as in the following example:
$ oc adm policy add-scc-to-user privileged -z "${serviceAccountName}" [ -n "${namespace}" ]
Pod security standards
OpenShift runs a synchronization mechanism that sets the namespace Pod Security level automatically based on the security contexts in use. No action is needed.
Additional resources
Understanding and managing pod security admission
Replacing in-tree modules with out-of-tree modules
You can use Kernel Module Management (KMM) to build kernel modules that can be loaded or unloaded into the kernel on demand. These modules extend the functionality of the kernel without the need to reboot the system. Modules can be configured as built-in or dynamically loaded.
Dynamically loaded modules include in-tree modules and out-of-tree (OOT) modules. In-tree modules are internal to the Linux kernel tree, that is, they are already part of the kernel. Out-of-tree modules are external to the Linux kernel tree. They are generally written for development and testing purposes, such as testing the new version of a kernel module that is shipped in-tree, or to deal with incompatibilities.
Some modules that are loaded by KMM could replace in-tree modules that are already loaded on the node. To unload in-tree modules before loading your module, set the value of the .spec.moduleLoader.container.inTreeModulesToRemove field to the modules that you want to unload. The following example demonstrates module replacement for all kernel mappings:
# ...
spec:
  moduleLoader:
    container:
      modprobe:
        moduleName: mod_a
      inTreeModulesToRemove: [mod_a, mod_b]
In this example, the moduleLoader pod uses inTreeModulesToRemove to unload the in-tree mod_a and mod_b before loading mod_a from the moduleLoader image. When the moduleLoader pod is terminated and mod_a is unloaded, mod_b is not loaded again.
The following is an example of module replacement for specific kernel mappings:

# ...
spec:
  moduleLoader:
    container:
      kernelMappings:
        - literal: 6.0.15-300.fc37.x86_64
          containerImage: "some.registry/org/my-kmod:${KERNEL_FULL_VERSION}"
          inTreeModulesToRemove: [<module_name>, <module_name>]
Example Module CR
The following is an annotated Module example:
apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: <my_kmod>
spec:
  moduleLoader:
    container:
      modprobe:
        moduleName: <my_kmod> (1)
        dirName: /opt (2)
        firmwarePath: /firmware (3)
        parameters: (4)
          - param=1
      kernelMappings: (5)
        - literal: 6.0.15-300.fc37.x86_64
          containerImage: some.registry/org/my-kmod:6.0.15-300.fc37.x86_64
        - regexp: '^.+\.fc37\.x86_64$' (6)
          containerImage: "some.other.registry/org/<my_kmod>:${KERNEL_FULL_VERSION}"
        - regexp: '^.+$' (7)
          containerImage: "some.registry/org/<my_kmod>:${KERNEL_FULL_VERSION}"
          build:
            buildArgs: (8)
              - name: ARG_NAME
                value: <some_value>
            secrets:
              - name: <some_kubernetes_secret> (9)
            baseImageRegistryTLS: (10)
              insecure: false
              insecureSkipTLSVerify: false (11)
            dockerfileConfigMap: (12)
              name: <my_kmod_dockerfile>
          sign:
            certSecret:
              name: <cert_secret> (13)
            keySecret:
              name: <key_secret> (14)
            filesToSign:
              - /opt/lib/modules/${KERNEL_FULL_VERSION}/<my_kmod>.ko
          registryTLS:
            insecure: false (15)
            insecureSkipTLSVerify: false (16)
    serviceAccountName: <sa_module_loader> (17)
  devicePlugin: (18)
    container:
      image: some.registry/org/device-plugin:latest (19)
      env:
        - name: MY_DEVICE_PLUGIN_ENV_VAR
          value: SOME_VALUE
      volumeMounts: (20)
        - mountPath: /some/mountPath
          name: <device_plugin_volume>
    volumes: (21)
      - name: <device_plugin_volume>
        configMap:
          name: <some_configmap>
    serviceAccountName: <sa_device_plugin> (22)
  imageRepoSecret: (23)
    name: <secret_name>
  selector:
    node-role.kubernetes.io/worker: ""
(1) Required.
(2) Optional.
(3) Optional: Copies /firmware/* into /var/lib/firmware/ on the node.
(4) Optional.
(5) At least one kernel item is required.
(6) For each node running a kernel matching the regular expression, KMM checks if you have included a tag or a digest. If you have not specified a tag or digest in the container image, then the validation webhook returns an error and does not apply the module.
(7) For any other kernel, build the image using the Dockerfile in the my-kmod ConfigMap.
(8) Optional.
(9) Optional: A value for some-kubernetes-secret can be obtained from the build environment at /run/secrets/some-kubernetes-secret.
(10) This field has no effect. When building kmod images or signing kmods within a kmod image, you might sometimes need to pull base images from a registry that serves a certificate signed by an untrusted Certificate Authority (CA). In order for KMM to trust that CA, it must also trust the new CA by replacing the cluster's CA bundle. See "Additional resources" to learn how to replace the cluster's CA bundle.
(11) Optional: Avoid using this parameter. If set to true, the build skips any TLS server certificate validation when pulling the image in the Dockerfile FROM instruction.
(12) Required.
(13) Required: A secret holding the public secureboot key with the key 'cert'.
(14) Required: A secret holding the private secureboot key with the key 'key'.
(15) Optional: Avoid using this parameter. If set to true, KMM is allowed to check if the container image already exists using plain HTTP.
(16) Optional: Avoid using this parameter. If set to true, KMM skips any TLS server certificate validation when checking if the container image already exists.
(17) Optional.
(18) Optional.
(19) Required: If the device plugin section is present.
(20) Optional.
(21) Optional.
(22) Optional.
(23) Optional: Used to pull module loader and device plugin images.
Additional resources
Replacing the CA Bundle certificate
Symbolic links for in-tree dependencies
Some kernel modules depend on other kernel modules that are shipped with the node's operating system. To avoid copying those dependencies into the kmod image, Kernel Module Management (KMM) mounts /usr/lib/modules into both the build and the worker pod's filesystems.

By creating a symlink from /opt/usr/lib/modules/<kernel_version>/<symlink_name> to /usr/lib/modules/<kernel_version>, depmod can use the in-tree kmods on the building node's filesystem to resolve dependencies.
At runtime, the worker pod extracts the entire image, including the <symlink_name> symbolic link. That symbolic link points to /usr/lib/modules/<kernel_version> in the worker pod, which is mounted from the node's filesystem. modprobe can then follow that link and load the in-tree dependencies as needed.

In the following example, host is the symbolic link name under /opt/usr/lib/modules/<kernel_version>:
ARG DTK_AUTO
FROM ${DTK_AUTO} as builder

#
# Build steps
#

FROM ubi9/ubi
ARG KERNEL_FULL_VERSION
RUN dnf update && dnf install -y kmod
COPY --from=builder /usr/src/kernel-module-management/ci/kmm-kmod/kmm_ci_a.ko /opt/lib/modules/${KERNEL_FULL_VERSION}/
COPY --from=builder /usr/src/kernel-module-management/ci/kmm-kmod/kmm_ci_b.ko /opt/lib/modules/${KERNEL_FULL_VERSION}/

# Create the symbolic link
RUN ln -s /lib/modules/${KERNEL_FULL_VERSION} /opt/lib/modules/${KERNEL_FULL_VERSION}/host

RUN depmod -b /opt ${KERNEL_FULL_VERSION}
On the node on which KMM loads the kernel modules, the symbolic link resolves to the node's own /usr/lib/modules/<kernel_version>, because that path is mounted into the worker pod from the host filesystem.
Creating a kmod image
Kernel Module Management (KMM) works with purpose-built kmod images, which are standard OCI images that contain .ko files. The location of the .ko files must match the following pattern: <prefix>/lib/modules/[kernel-version]/.
Keep the following in mind when working with the .ko files:

- In most cases, <prefix> should be equal to /opt. This is the Module CRD's default value.
- kernel-version must not be empty and must be equal to the kernel version the kernel modules were built for.
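For example, a hypothetical module my_kmod.ko built for kernel 6.0.15-300.fc37.x86_64 and using the default /opt prefix would be placed in the image at:

/opt/lib/modules/6.0.15-300.fc37.x86_64/my_kmod.ko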
Running depmod
It is recommended to run depmod at the end of the build process to generate modules.dep and .map files. This is especially useful if your kmod image contains several kernel modules and if one of the modules depends on another module.
You must have a Red Hat subscription to download the kernel-devel package.
Procedure
Generate modules.dep and .map files for a specific kernel version by running the following command:

$ depmod -b /opt ${KERNEL_FULL_VERSION}
Example Dockerfile
If you are building your image on OpenShift Container Platform, consider using the Driver Toolkit (DTK).
For further information, see using an entitled build.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kmm-ci-dockerfile
data:
  dockerfile: |
    ARG DTK_AUTO
    FROM ${DTK_AUTO} as builder
    ARG KERNEL_FULL_VERSION
    WORKDIR /usr/src
    RUN ["git", "clone", "https://github.com/rh-ecosystem-edge/kernel-module-management.git"]
    WORKDIR /usr/src/kernel-module-management/ci/kmm-kmod
    RUN KERNEL_SRC_DIR=/lib/modules/${KERNEL_FULL_VERSION}/build make all
    FROM registry.redhat.io/ubi9/ubi-minimal
    ARG KERNEL_FULL_VERSION
    RUN microdnf install kmod
    COPY --from=builder /usr/src/kernel-module-management/ci/kmm-kmod/kmm_ci_a.ko /opt/lib/modules/${KERNEL_FULL_VERSION}/
    COPY --from=builder /usr/src/kernel-module-management/ci/kmm-kmod/kmm_ci_b.ko /opt/lib/modules/${KERNEL_FULL_VERSION}/
    RUN depmod -b /opt ${KERNEL_FULL_VERSION}
Additional resources
Driver Toolkit
Building in the cluster
KMM can build kmod images in the cluster. Follow these guidelines:

- Provide build instructions using the build section of a kernel mapping.
- Copy the Dockerfile for your container image into a ConfigMap resource, under the dockerfile key.
- Ensure that the ConfigMap is located in the same namespace as the Module.
KMM checks if the image name specified in the containerImage field exists. If it does, the build is skipped. Otherwise, KMM creates a Build resource to build your image. After the image is built, KMM proceeds with the Module reconciliation. See the following example.
# ...
- regexp: '^.+$'
  containerImage: "some.registry/org/<my_kmod>:${KERNEL_FULL_VERSION}"
  build:
    buildArgs: (1)
      - name: ARG_NAME
        value: <some_value>
    secrets: (2)
      - name: <some_kubernetes_secret> (3)
    baseImageRegistryTLS:
      insecure: false (4)
      insecureSkipTLSVerify: false (5)
    dockerfileConfigMap: (6)
      name: <my_kmod_dockerfile>
  registryTLS:
    insecure: false (7)
    insecureSkipTLSVerify: false (8)
(1) Optional.
(2) Optional.
(3) Will be mounted in the build pod as /run/secrets/some-kubernetes-secret.
(4) Optional: Avoid using this parameter. If set to true, the build is allowed to pull the image in the Dockerfile FROM instruction using plain HTTP.
(5) Optional: Avoid using this parameter. If set to true, the build skips any TLS server certificate validation when pulling the image in the Dockerfile FROM instruction.
(6) Required.
(7) Optional: Avoid using this parameter. If set to true, KMM is allowed to check if the container image already exists using plain HTTP.
(8) Optional: Avoid using this parameter. If set to true, KMM skips any TLS server certificate validation when checking if the container image already exists.
Successful build pods are garbage collected immediately, unless the job.gcDelay parameter is set in the Operator configuration. Failed build pods are always preserved and must be deleted manually by the administrator for the build to be restarted.
Additional resources
Build configuration resources
Preflight validation for Kernel Module Management (KMM) Modules
Using the Driver Toolkit
The Driver Toolkit (DTK) is a convenient base image for building kmod loader images. It contains tools and libraries for the OpenShift version currently running in the cluster.
Procedure
Use DTK as the first stage of a multi-stage Dockerfile.

Build the kernel modules.

Copy the .ko files into a smaller end-user image such as ubi-minimal.

To leverage DTK in your in-cluster build, use the DTK_AUTO build argument. The value is automatically set by KMM when creating the Build resource. See the following example.

ARG DTK_AUTO
FROM ${DTK_AUTO} as builder
ARG KERNEL_FULL_VERSION
WORKDIR /usr/src
RUN ["git", "clone", "https://github.com/rh-ecosystem-edge/kernel-module-management.git"]
WORKDIR /usr/src/kernel-module-management/ci/kmm-kmod
RUN KERNEL_SRC_DIR=/lib/modules/${KERNEL_FULL_VERSION}/build make all
FROM ubi9/ubi-minimal
ARG KERNEL_FULL_VERSION
RUN microdnf install kmod
COPY --from=builder /usr/src/kernel-module-management/ci/kmm-kmod/kmm_ci_a.ko /opt/lib/modules/${KERNEL_FULL_VERSION}/
COPY --from=builder /usr/src/kernel-module-management/ci/kmm-kmod/kmm_ci_b.ko /opt/lib/modules/${KERNEL_FULL_VERSION}/
RUN depmod -b /opt ${KERNEL_FULL_VERSION}
Additional resources
Driver Toolkit
Using signing with Kernel Module Management (KMM)
On a Secure Boot enabled system, all kernel modules (kmods) must be signed with a public/private key-pair enrolled into the Machine Owner's Key (MOK) database. Drivers distributed as part of a distribution should already be signed by the distribution's private key, but for kernel modules built out-of-tree, KMM supports signing kernel modules using the sign section of the kernel mapping.

For more details on using Secure Boot, see Generating a public and private key pair.
Prerequisites
A public/private key pair in the correct (DER) format.
At least one secure-boot enabled node with the public key enrolled in its MOK database.
Either a pre-built driver container image, or the source code and Dockerfile needed to build one in-cluster.
Adding the keys for secureboot
To use Kernel Module Management (KMM) to sign kernel modules, a certificate and private key are required. For details on how to create these, see Generating a public and private key pair.

For details on how to extract the public and private key pair, see Signing kernel modules with the private key. Use steps 1 through 4 to extract the keys into files.
Procedure
Create the my_signing_key_pub.der file that contains the certificate and the my_signing_key.priv file that contains the private key:

$ openssl req -x509 -new -nodes -utf8 -sha256 -days 36500 -batch -config configuration_file.config -outform DER -out my_signing_key_pub.der -keyout my_signing_key.priv
Add the files by using one of the following methods:
Add the files as secrets directly:
$ oc create secret generic my-signing-key --from-file=key=<my_signing_key.priv>
$ oc create secret generic my-signing-key-pub --from-file=cert=<my_signing_key_pub.der>
Add the files by base64 encoding them:
$ cat my_signing_key.priv | base64 -w 0 > my_signing_key2.base64

$ cat my_signing_key_pub.der | base64 -w 0 > my_signing_key_pub.base64
Add the encoded text to a YAML file:
apiVersion: v1
kind: Secret
metadata:
  name: my-signing-key-pub
  namespace: default (1)
type: Opaque
data:
  cert: <base64_encoded_secureboot_public_key>
---
apiVersion: v1
kind: Secret
metadata:
  name: my-signing-key
  namespace: default (1)
type: Opaque
data:
  key: <base64_encoded_secureboot_private_key>
(1) Replace default with a valid namespace.

Apply the YAML file:

$ oc apply -f <yaml_filename>
Checking the keys
After you have added the keys, you must check them to ensure they are set correctly.
Procedure
Check to ensure the public key secret is set correctly:
$ oc get secret -o yaml <certificate secret name> | awk '/cert/{print $2; exit}' | base64 -d | openssl x509 -inform der -text
This should display a certificate with a Serial Number, Issuer, Subject, and more.
Check to ensure the private key secret is set correctly:
$ oc get secret -o yaml <private key secret name> | awk '/key/{print $2; exit}' | base64 -d
This should display the key enclosed in the -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- lines.
Signing kmods in a pre-built image
Use this procedure if you have a pre-built image, such as an image either distributed by a hardware vendor or built elsewhere.
The following YAML file adds the public/private key-pair as secrets with the required key names: key for the private key, cert for the public key. The cluster then pulls down the unsignedImage image, opens it, signs the kernel modules listed in filesToSign, adds them back, and pushes the resulting image as containerImage.

KMM then loads the signed kmods onto all the nodes that match the selector. The kmods are successfully loaded on any nodes that have the public key in their MOK database, and on any nodes that are not secure-boot enabled, which ignore the signature.
Prerequisites
The keySecret and certSecret secrets have been created in the same namespace as the rest of the resources.
Procedure
Apply the YAML file:
---
apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: example-module
spec:
  moduleLoader:
    serviceAccountName: default
    container:
      modprobe:
        moduleName: '<module_name>' (1)
      kernelMappings:
        # the kmods will be deployed on all nodes in the cluster with a kernel that matches the regexp
        - regexp: '^.*\.x86_64$'
          # the container to produce containing the signed kmods
          containerImage: <image_name> (2)
          sign:
            # the image containing the unsigned kmods (we need this because we are not building the kmods within the cluster)
            unsignedImage: <image_name> (3)
            keySecret:
              # a secret holding the private secureboot key with the key 'key'
              name: <private_key_secret_name>
            certSecret:
              # a secret holding the public secureboot key with the key 'cert'
              name: <certificate_secret_name>
            filesToSign:
              # full path within the unsignedImage container to the kmod(s) to sign
              - /opt/lib/modules/4.18.0-348.2.1.el8_5.x86_64/kmm_ci_a.ko
  imageRepoSecret:
    # the name of a secret containing credentials to pull unsignedImage and push containerImage to the registry
    name: repo-pull-secret
  selector:
    kubernetes.io/arch: amd64
(1) The name of the kmod to load.
(2) The name of the container image. For example, quay.io/myuser/my-driver:<kernel_version>.
(3) The name of the unsigned image. For example, quay.io/myuser/my-driver:<kernel_version>.
Building and signing a kmod image
Use this procedure if you have source code and must build your image first.
The following YAML file builds a new container image using the source code from the repository. The image produced is saved back in the registry with a temporary name, and this temporary image is then signed using the parameters in the sign section.

The temporary image name is based on the final image name and is set to be <containerImage>:<tag>-<namespace>_<module name>_kmm_unsigned.

For example, using the following YAML file, Kernel Module Management (KMM) builds an image named example.org/repository/minimal-driver:final-default_example-module_kmm_unsigned containing the build with unsigned kmods and pushes it to the registry. Then it creates a second image named example.org/repository/minimal-driver:final that contains the signed kmods. It is this second image that is pulled by the worker pods and contains the kmods to be loaded on the cluster nodes.

After it is signed, you can safely delete the temporary image from the registry. It will be rebuilt, if needed.
Prerequisites
The keySecret and certSecret secrets have been created in the same namespace as the rest of the resources.
Procedure
Apply the YAML file:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-module-dockerfile
  namespace: <namespace> (1)
data:
  dockerfile: |
    ARG DTK_AUTO
    ARG KERNEL_VERSION
    FROM ${DTK_AUTO} as builder
    WORKDIR /build/
    RUN git clone -b main --single-branch https://github.com/rh-ecosystem-edge/kernel-module-management.git
    WORKDIR kernel-module-management/ci/kmm-kmod/
    RUN make
    FROM registry.access.redhat.com/ubi9/ubi:latest
    ARG KERNEL_VERSION
    RUN yum -y install kmod && yum clean all
    RUN mkdir -p /opt/lib/modules/${KERNEL_VERSION}
    COPY --from=builder /build/kernel-module-management/ci/kmm-kmod/*.ko /opt/lib/modules/${KERNEL_VERSION}/
    RUN /usr/sbin/depmod -b /opt
---
apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: example-module
  namespace: <namespace> (1)
spec:
  moduleLoader:
    serviceAccountName: default (2)
    container:
      modprobe:
        moduleName: simple_kmod
      kernelMappings:
        - regexp: '^.*\.x86_64$'
          containerImage: <final_driver_container_name>
          build:
            dockerfileConfigMap:
              name: example-module-dockerfile
          sign:
            keySecret:
              name: <private_key_secret_name>
            certSecret:
              name: <certificate_secret_name>
            filesToSign:
              - /opt/lib/modules/4.18.0-348.2.1.el8_5.x86_64/kmm_ci_a.ko
  imageRepoSecret: (3)
    name: repo-pull-secret
  selector: # top-level selector
    kubernetes.io/arch: amd64
(1) Replace default with a valid namespace.
(2) The default serviceAccountName does not have the required permissions to run a module that is privileged. For information on creating a service account, see "Creating service accounts" in the "Additional resources" of this section.
(3) Used as imagePullSecrets in the DaemonSet object and to pull and push for the build and sign features.
Additional resources
Creating service accounts.
KMM hub and spoke
In hub and spoke scenarios, many spoke clusters are connected to a central, powerful hub cluster. Kernel Module Management (KMM) depends on Red Hat Advanced Cluster Management (RHACM) to operate in hub and spoke environments.
KMM is compatible with hub and spoke environments through decoupling KMM features. A ManagedClusterModule custom resource definition (CRD) is provided to wrap the existing Module CRD and extend it to select spoke clusters. Also provided is KMM-Hub, a new standalone controller that builds images and signs modules on the hub cluster.

In hub and spoke setups, spokes are focused, resource-constrained clusters that are centrally managed by a hub cluster. Spokes run the single-cluster edition of KMM, with resource-intensive features disabled. To adapt KMM to this environment, you should reduce the workload running on the spokes to the minimum, while the hub takes care of the expensive tasks.

Building kernel module images and signing the .ko files should run on the hub. The scheduling of the Module Loader and Device Plugin DaemonSets can only happen on the spokes.
KMM-Hub
The KMM project provides KMM-Hub, an edition of KMM dedicated to hub clusters. KMM-Hub monitors all kernel versions running on the spokes and determines the nodes on the cluster that should receive a kernel module.
KMM-Hub runs all compute-intensive tasks such as image builds and kmod signing, and prepares the trimmed-down Module to be transferred to the spokes through RHACM.
KMM-Hub cannot be used to load kernel modules on the hub cluster. Install the regular edition of KMM to load kernel modules.
Installing KMM-Hub
You can use one of the following methods to install KMM-Hub:
With the Operator Lifecycle Manager (OLM)
Creating KMM resources
Installing KMM-Hub using the Operator Lifecycle Manager
Use the Operators section of the OpenShift console to install KMM-Hub.
Installing KMM-Hub by creating KMM resources
Procedure
If you want to install KMM-Hub programmatically, you can use the following resources to create the Namespace, OperatorGroup, and Subscription resources:
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kmm-hub
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kernel-module-management-hub
  namespace: openshift-kmm-hub
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: kernel-module-management-hub
  namespace: openshift-kmm-hub
spec:
  channel: stable
  installPlanApproval: Automatic
  name: kernel-module-management-hub
  source: redhat-operators
  sourceNamespace: openshift-marketplace
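Assuming you save the manifests above to a file (the name kmm-hub.yaml below is arbitrary), you can create all three resources with a single command:

$ oc apply -f kmm-hub.yaml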
Using the ManagedClusterModule CRD

Use the ManagedClusterModule custom resource definition (CRD) to configure the deployment of kernel modules on spoke clusters. This CRD is cluster-scoped, wraps a Module spec, and adds the following additional fields:
apiVersion: hub.kmm.sigs.x-k8s.io/v1beta1
kind: ManagedClusterModule
metadata:
  name: <my-mcm>
  # No namespace, because this resource is cluster-scoped.
spec:
  moduleSpec: (1)
    selector: (2)
      node-wants-my-mcm: 'true'
  spokeNamespace: <some-namespace> (3)
  selector: (4)
    wants-my-mcm: 'true'
(1) moduleSpec: Contains moduleLoader and devicePlugin sections, similar to a Module resource.
(2) Selects nodes within the ManagedCluster.
(3) Specifies in which namespace the Module should be created.
(4) Selects ManagedCluster objects.
If build or signing instructions are present in .spec.moduleSpec, those pods are run on the hub cluster in the operator's namespace.

When the .spec.selector matches one or more ManagedCluster resources, KMM-Hub creates a ManifestWork resource in the corresponding namespace(s). The ManifestWork contains a trimmed-down Module resource, with kernel mappings preserved but all build and sign subsections removed. containerImage fields that contain image names ending with a tag are replaced with their digest equivalent.
Running KMM on the spoke
After installing Kernel Module Management (KMM) on the spoke, no further action is required. Create a ManagedClusterModule object from the hub to deploy kernel modules on spoke clusters.
Procedure
You can install KMM on the spoke clusters through a RHACM Policy object. In addition to installing KMM from the OperatorHub and running it in a lightweight spoke mode, the Policy configures additional RBAC required for the RHACM agent to be able to manage Module resources.
Use the following RHACM policy to install KMM on spoke clusters:
---
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: install-kmm
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: install-kmm
        spec:
          severity: high
          object-templates:
            - complianceType: mustonlyhave
              objectDefinition:
                apiVersion: v1
                kind: Namespace
                metadata:
                  name: openshift-kmm
            - complianceType: mustonlyhave
              objectDefinition:
                apiVersion: operators.coreos.com/v1
                kind: OperatorGroup
                metadata:
                  name: kmm
                  namespace: openshift-kmm
                spec:
                  upgradeStrategy: Default
            - complianceType: mustonlyhave
              objectDefinition:
                apiVersion: operators.coreos.com/v1alpha1
                kind: Subscription
                metadata:
                  name: kernel-module-management
                  namespace: openshift-kmm
                spec:
                  channel: stable
                  config:
                    env:
                      - name: KMM_MANAGED (1)
                        value: "1"
                  installPlanApproval: Automatic
                  name: kernel-module-management
                  source: redhat-operators
                  sourceNamespace: openshift-marketplace
            - complianceType: mustonlyhave
              objectDefinition:
                apiVersion: rbac.authorization.k8s.io/v1
                kind: ClusterRole
                metadata:
                  name: kmm-module-manager
                rules:
                  - apiGroups: [kmm.sigs.x-k8s.io]
                    resources: [modules]
                    verbs: [create, delete, get, list, patch, update, watch]
            - complianceType: mustonlyhave
              objectDefinition:
                apiVersion: rbac.authorization.k8s.io/v1
                kind: ClusterRoleBinding
                metadata:
                  name: klusterlet-kmm
                subjects:
                  - kind: ServiceAccount
                    name: klusterlet-work-sa
                    namespace: open-cluster-management-agent
                roleRef:
                  kind: ClusterRole
                  name: kmm-module-manager
                  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: all-managed-clusters
spec:
  clusterSelector: (2)
    matchExpressions: []
---
apiVersion: policy.open-cluster-management.io/v1
kind: PlacementBinding
metadata:
  name: install-kmm
placementRef:
  apiGroup: apps.open-cluster-management.io
  kind: PlacementRule
  name: all-managed-clusters
subjects:
  - apiGroup: policy.open-cluster-management.io
    kind: Policy
    name: install-kmm
(1) This environment variable is required when running KMM on a spoke cluster.
(2) The spec.clusterSelector field can be customized to target select clusters only.
Customizing upgrades for kernel modules
Use this procedure to upgrade the kernel module while running maintenance operations on the node, including rebooting the node, if needed. To minimize the impact on the workloads running in the cluster, run the kernel upgrade process sequentially, one node at a time.
This procedure requires knowledge of the workload utilizing the kernel module and must be managed by the cluster administrator.
Prerequisites
Before upgrading, set the kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name>=$moduleVersion label on all the nodes that are used by the kernel module.
Terminate all user application workloads on the node or move them to another node.
Unload the currently loaded kernel module.
Ensure that the user workload (the application running in the cluster that is accessing the kernel module) is not running on the node prior to kernel module unloading, and that the workload is back running on the node after the new kernel module version has been loaded.
Procedure
Ensure that the device plugin managed by KMM on the node is unloaded.
Update the following fields in the Module custom resource (CR):

- containerImage (to the appropriate kernel version)
- version

The update should be atomic; that is, both the containerImage and version fields must be updated simultaneously.
Terminate any workload using the kernel module on the node being upgraded.
Remove the kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name> label on the node. Run the following command to unload the kernel module from the node:

$ oc label node/<node_name> kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name>-
If required, as the cluster administrator, perform any additional maintenance required on the node for the kernel module upgrade.
If no additional upgrading is needed, you can skip Steps 3 through 6 by updating the kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name> label value to the new $moduleVersion as set in the Module.

Run the following command to add the kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name>=$moduleVersion label to the node. The $moduleVersion must be equal to the new value of the version field in the Module CR.

$ oc label node/<node_name> kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name>=<desired_version>
Because of Kubernetes limitations in label names, the combined length of Module name and namespace must not exceed 39 characters.

Restore any workload that leverages the kernel module on the node.
Reload the device plugin managed by KMM on the node.
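To check which nodes currently carry the version label, you can list them with a label selector; the placeholders below are the same ones used in the steps above:

$ oc get nodes -l kmm.node.kubernetes.io/version-module.<module_namespace>.<module_name>=<desired_version>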
Day 1 kernel module loading
Kernel Module Management (KMM) is typically a Day 2 Operator. Kernel modules are loaded only after the complete initialization of a Linux (RHCOS) server. However, in some scenarios the kernel module must be loaded at an earlier stage. Day 1 functionality allows you to use the Machine Config Operator (MCO) to load kernel modules during the Linux systemd initialization stage.
Additional resources
Machine Config Operator
Day 1 supported use cases
The Day 1 functionality supports a limited number of use cases. The main use case is to allow loading out-of-tree (OOT) kernel modules prior to NetworkManager service initialization. It does not support loading kernel modules at the initramfs stage.
The following are the conditions needed for Day 1 functionality:
The kernel module is not loaded in the kernel.
The in-tree kernel module is loaded into the kernel, but can be unloaded and replaced by the OOT kernel module. This means that the in-tree module is not referenced by any other kernel modules.
In order for Day 1 functionality to work, the node must have a functional network interface, that is, an in-tree kernel driver for that interface. The OOT kernel module can be a network driver that will replace the functional network driver.
OOT kernel module loading flow
The loading of the out-of-tree (OOT) kernel module leverages the Machine Config Operator (MCO). The flow sequence is as follows:
Procedure
Apply a MachineConfig resource to the existing running cluster. In order to identify the necessary nodes that need to be updated, you must create an appropriate MachineConfigPool resource.

MCO applies the MachineConfig node by node, rebooting each node. On any rebooted node, two new systemd services are deployed: a pull service and a load service.

The load service is configured to run prior to the NetworkManager service. The service tries to pull a predefined kernel module image and then, using that image, to unload an in-tree module and load an OOT kernel module.

The pull service is configured to run after the NetworkManager service. The service checks if the preconfigured kernel module image is located on the node's filesystem. If it is, the service exits normally, and the server continues with the boot process. If not, it pulls the image onto the node and reboots the node afterwards.
The kernel module image
The Day 1 functionality uses the same DTK-based image leveraged by Day 2 KMM builds. The out-of-tree kernel module should be located under /opt/lib/modules/${kernelVersion}.
Additional resources
Driver Toolkit
In-tree module replacement
The Day 1 functionality always tries to replace the in-tree kernel module with the OOT version. If the in-tree kernel module is not loaded, the flow is not affected; the service proceeds and loads the OOT kernel module.
MCO yaml creation
KMM provides an API to create an MCO YAML manifest for the Day 1 functionality:
ProduceMachineConfig(machineConfigName, machineConfigPoolRef, kernelModuleImage, kernelModuleName string) (string, error)
The returned output is a string representation of the MCO YAML manifest to be applied. It is up to the customer to apply this YAML.
The parameters are:
- machineConfigName: The name of the MCO YAML manifest. This parameter is set as the name parameter of the metadata of the MCO YAML manifest.
- machineConfigPoolRef: The MachineConfigPool name used to identify the targeted nodes.
- kernelModuleImage: The name of the container image that includes the OOT kernel module.
- kernelModuleName: The name of the OOT kernel module. This parameter is used both to unload the in-tree kernel module (if loaded into the kernel) and to load the OOT kernel module.
The API is located under the pkg/mcproducer package of the KMM source code. The KMM Operator does not need to be running to use the Day 1 functionality. You only need to import the pkg/mcproducer package into your operator or utility code, call the API, and apply the produced MCO YAML to the cluster.
The MachineConfigPool
The MachineConfigPool identifies a collection of nodes that are affected by the applied MCO.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: sfc
spec:
  machineConfigSelector: (1)
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker, sfc]}
  nodeSelector: (2)
    matchLabels:
      node-role.kubernetes.io/sfc: ""
  paused: false
  maxUnavailable: 1
(1) Matches the labels in the MachineConfig.
(2) Matches the labels on the node.
There are predefined MachineConfigPools in the OCP cluster:

- worker: Targets all worker nodes in the cluster
- master: Targets all master nodes in the cluster
Define the following MachineConfig to target the master MachineConfigPool:

metadata:
  labels:
    machineconfiguration.openshift.io/role: master

Define the following MachineConfig to target the worker MachineConfigPool:

metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
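As a sketch, to add a node to the custom sfc pool from the example earlier in this section, label the node so that it matches the pool's nodeSelector; the node name is a placeholder:

$ oc label node/<node_name> node-role.kubernetes.io/sfc=""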
Debugging and troubleshooting
If the kmods in your driver container are not signed or are signed with the wrong key, then the container can enter a PostStartHookError or CrashLoopBackOff status. You can verify by running the oc describe command on your pod, which displays the following message in this scenario:
modprobe: ERROR: could not insert '<your_kmod_name>': Required key not available
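For example, assuming a hypothetical worker pod name and namespace:

$ oc describe pod <worker_pod_name> -n <namespace>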
KMM firmware support
Kernel modules sometimes need to load firmware files from the file system. KMM supports copying firmware files from the kmod image to the node’s file system.
The contents of .spec.moduleLoader.container.modprobe.firmwarePath are copied into the /var/lib/firmware path on the node before running the modprobe command to insert the kernel module.
When the pod is terminated, all files and empty directories are removed from that location before the modprobe -r command runs to unload the kernel module.
Configuring the lookup path on nodes
On OpenShift Container Platform nodes, the set of default lookup paths for firmware does not include the /var/lib/firmware path.
Procedure
Use the Machine Config Operator to create a MachineConfig custom resource (CR) that contains the /var/lib/firmware path:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker (1)
  name: 99-worker-kernel-args-firmware-path
spec:
  kernelArguments:
    - 'firmware_class.path=/var/lib/firmware'

(1) You can configure the label based on your needs. In the case of single-node OpenShift, use either control-plane or master objects.

Applying the MachineConfig CR automatically reboots the nodes.
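Assuming the manifest is saved as 99-worker-kernel-args-firmware-path.yaml (an arbitrary file name), apply it by running the following command:

$ oc apply -f 99-worker-kernel-args-firmware-path.yaml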
Additional resources
Machine Config Operator
Building a kmod image
Procedure
In addition to building the kernel module itself, include the binary firmware in the builder image:
FROM registry.redhat.io/ubi9/ubi-minimal as builder

# Build the kmod

RUN ["mkdir", "/firmware"]
RUN ["curl", "-o", "/firmware/firmware.bin", "https://artifacts.example.com/firmware.bin"]

FROM registry.redhat.io/ubi9/ubi-minimal

# Copy the kmod, install modprobe, run depmod

COPY --from=builder /firmware /firmware
Tuning the Module resource
Procedure
Set .spec.moduleLoader.container.modprobe.firmwarePath in the Module custom resource (CR):

apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: my-kmod
spec:
  moduleLoader:
    container:
      modprobe:
        moduleName: my-kmod # Required
        firmwarePath: /firmware (1)
1 Optional: Copies /firmware/* into /var/lib/firmware/ on the node.
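To confirm that the firmware files were copied after the module is loaded, you can list the target directory on the node; the node name is illustrative:

$ oc debug node/<node_name> -- chroot /host ls /var/lib/firmware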
Day 0 through Day 2 kmod installation
You can install some kernel modules (kmods) during Day 0 through Day 2 operations without Kernel Module Management (KMM). This can ease the later transition of those kmods to KMM.
Use the following criteria to determine suitable kmod installations.
- Day 0
The most basic kmods that are required for a node to become Ready in the cluster. Examples of these types of kmods include:
A storage driver that is required to mount the rootFS as part of the boot process
A network driver that is required for the machine to access machine-config-server on the bootstrap node to pull the Ignition configuration and join the cluster
- Day 1
Kmods that are not required for a node to become Ready in the cluster but that cannot be unloaded when the node is Ready.
An example of this type of kmod is an out-of-tree (OOT) network driver that replaces an outdated in-tree driver to exploit the full potential of the NIC while NetworkManager depends on it. When the node is Ready, you cannot unload the driver because of the NetworkManager dependency.
- Day 2
Kmods that can be dynamically loaded into the kernel or removed from it without interfering with the cluster infrastructure, for example, connectivity.
Examples of these types of kmods include:
GPU operators
Secondary network adapters
Field-programmable gate arrays (FPGAs)
Layering background
When a Day 0 kmod is installed in the cluster, layering is applied through the Machine Config Operator (MCO), and OpenShift Container Platform upgrades do not trigger node upgrades.
You only need to recompile the driver if you add new features to it, because the node's operating system remains the same.
Lifecycle management
You can leverage KMM to manage the Day 0 through Day 2 lifecycle of kmods without a reboot when the driver allows it.
This will not work if the upgrade requires a node reboot, for example, when rebuilding the initramfs.
Use one of the following options for lifecycle management.
Treat the kmod as an in-tree driver
Use this method when you want to upgrade the kmods. In this case, treat the kmod as an in-tree driver and create a Module in the cluster with the inTreeRemoval field to unload the old version of the driver.
Note the following characteristics of treating the kmod as an in-tree driver:
Downtime might occur as KMM tries to unload and load the kmod on all the selected nodes simultaneously.
This method works even if removing the driver makes the node lose connectivity, because KMM uses a single pod to unload and load the driver.
Use ordered upgrade
You can use ordered upgrade to create a versioned Module in the cluster that represents the kmods; see the sketch after the following list. Creating this Module has no immediate effect, because the kmods are already loaded.
Note the following characteristics of using ordered upgrade:
There is no cluster downtime because you control the pace of the upgrade and how many nodes are upgraded at the same time; therefore, an upgrade with no downtime is possible.
This method will not work if unloading the driver results in losing connection to the node, because KMM creates two different worker pods, one for unloading and another for loading. If the node loses its connection, these pods cannot be scheduled.
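The following partial Module CR is a sketch of a versioned Module with illustrative names; it assumes the version field lives under spec.moduleLoader.container, so verify it against the Module CRD for your KMM version:

apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: my-kmod
spec:
  moduleLoader:
    container:
      version: "1.0.0" # Changing this value triggers a node-by-node upgrade
      modprobe:
        moduleName: my_kmod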
Troubleshooting KMM
When troubleshooting KMM installation issues, you can monitor logs to determine at which stage issues occur. Then, retrieve diagnostic data relevant to that stage.
Reading Operator logs
You can use the oc logs
command to read Operator logs, as in the following examples.
Example command for KMM controller
$ oc logs -fn openshift-kmm deployments/kmm-operator-controller
Example command for KMM webhook server
$ oc logs -fn openshift-kmm deployments/kmm-operator-webhook-server
Example command for KMM-Hub controller
$ oc logs -fn openshift-kmm-hub deployments/kmm-operator-hub-controller
Example command for KMM-Hub webhook server
$ oc logs -fn openshift-kmm-hub deployments/kmm-operator-hub-webhook-server
Observing events
Use the following methods to view KMM events.
Build & sign
KMM publishes events whenever it starts a kmod image build or observes its outcome. These events are attached to Module objects and are available at the end of the output of the oc describe module command, as in the following example:
$ oc describe modules.kmm.sigs.x-k8s.io kmm-ci-a
[...]
Events:
  Type    Reason          Age                From  Message
  ----    ------          ----               ----  -------
  Normal  BuildCreated    2m29s              kmm   Build created for kernel 6.6.2-201.fc39.x86_64
  Normal  BuildSucceeded  63s                kmm   Build job succeeded for kernel 6.6.2-201.fc39.x86_64
  Normal  SignCreated     64s (x2 over 64s)  kmm   Sign created for kernel 6.6.2-201.fc39.x86_64
  Normal  SignSucceeded   57s                kmm   Sign job succeeded for kernel 6.6.2-201.fc39.x86_64
Module load or unload
KMM publishes events whenever it successfully loads or unloads a kernel module on a node. These events are attached to Node objects and are available at the end of the output of the oc describe node command, as in the following example:
$ oc describe node my-node
[...]
Events:
  Type    Reason          Age    From  Message
  ----    ------          ----   ----  -------
[...]
  Normal  ModuleLoaded    4m17s  kmm   Module default/kmm-ci-a loaded into the kernel
  Normal  ModuleUnloaded  2s     kmm   Module default/kmm-ci-a unloaded from the kernel
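To watch these events across the whole cluster instead of describing a single node, you can filter events by reason, for example:

$ oc get events -A --field-selector reason=ModuleLoaded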
Using the must-gather tool
The oc adm must-gather command is the preferred way to collect a support bundle and provide debugging information to Red Hat Support. Collect specific information by running the command with the appropriate arguments as described in the following sections.
Additional resources
About the must-gather tool
Gathering data for KMM
Procedure
Gather the data for the KMM Operator controller manager:
Set the MUST_GATHER_IMAGE variable:

$ export MUST_GATHER_IMAGE=$(oc get deployment -n openshift-kmm kmm-operator-controller -ojsonpath='{.spec.template.spec.containers[?(@.name=="manager")].env[?(@.name=="RELATED_IMAGE_MUST_GATHER")].value}')
Use the -n <namespace> switch to specify a namespace if you installed KMM in a custom namespace.

Run the must-gather tool:

$ oc adm must-gather --image="${MUST_GATHER_IMAGE}" -- /usr/bin/gather
View the Operator logs:
$ oc logs -fn openshift-kmm deployments/kmm-operator-controller
Example output
I0228 09:36:37.352405 1 request.go:682] Waited for 1.001998746s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/machine.openshift.io/v1beta1?timeout=32s
I0228 09:36:40.767060 1 listener.go:44] kmm/controller-runtime/metrics "msg"="Metrics server is starting to listen" "addr"="127.0.0.1:8080"
I0228 09:36:40.769483 1 main.go:234] kmm/setup "msg"="starting manager"
I0228 09:36:40.769907 1 internal.go:366] kmm "msg"="Starting server" "addr"={"IP":"127.0.0.1","Port":8080,"Zone":""} "kind"="metrics" "path"="/metrics"
I0228 09:36:40.770025 1 internal.go:366] kmm "msg"="Starting server" "addr"={"IP":"::","Port":8081,"Zone":""} "kind"="health probe"
I0228 09:36:40.770128 1 leaderelection.go:248] attempting to acquire leader lease openshift-kmm/kmm.sigs.x-k8s.io...
I0228 09:36:40.784396 1 leaderelection.go:258] successfully acquired lease openshift-kmm/kmm.sigs.x-k8s.io
I0228 09:36:40.784876 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="Module" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="Module" "source"="kind source: *v1beta1.Module"
I0228 09:36:40.784925 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="Module" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="Module" "source"="kind source: *v1.DaemonSet"
I0228 09:36:40.784968 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="Module" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="Module" "source"="kind source: *v1.Build"
I0228 09:36:40.785001 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="Module" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="Module" "source"="kind source: *v1.Job"
I0228 09:36:40.785025 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="Module" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="Module" "source"="kind source: *v1.Node"
I0228 09:36:40.785039 1 controller.go:193] kmm "msg"="Starting Controller" "controller"="Module" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="Module"
I0228 09:36:40.785458 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="PodNodeModule" "controllerGroup"="" "controllerKind"="Pod" "source"="kind source: *v1.Pod"
I0228 09:36:40.786947 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="PreflightValidation" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidation" "source"="kind source: *v1beta1.PreflightValidation"
I0228 09:36:40.787406 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="PreflightValidation" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidation" "source"="kind source: *v1.Build"
I0228 09:36:40.787474 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="PreflightValidation" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidation" "source"="kind source: *v1.Job"
I0228 09:36:40.787488 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="PreflightValidation" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidation" "source"="kind source: *v1beta1.Module"
I0228 09:36:40.787603 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="NodeKernel" "controllerGroup"="" "controllerKind"="Node" "source"="kind source: *v1.Node"
I0228 09:36:40.787634 1 controller.go:193] kmm "msg"="Starting Controller" "controller"="NodeKernel" "controllerGroup"="" "controllerKind"="Node"
I0228 09:36:40.787680 1 controller.go:193] kmm "msg"="Starting Controller" "controller"="PreflightValidation" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidation"
I0228 09:36:40.785607 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="imagestream" "controllerGroup"="image.openshift.io" "controllerKind"="ImageStream" "source"="kind source: *v1.ImageStream"
I0228 09:36:40.787822 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="preflightvalidationocp" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidationOCP" "source"="kind source: *v1beta1.PreflightValidationOCP"
I0228 09:36:40.787853 1 controller.go:193] kmm "msg"="Starting Controller" "controller"="imagestream" "controllerGroup"="image.openshift.io" "controllerKind"="ImageStream"
I0228 09:36:40.787879 1 controller.go:185] kmm "msg"="Starting EventSource" "controller"="preflightvalidationocp" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidationOCP" "source"="kind source: *v1beta1.PreflightValidation"
I0228 09:36:40.787905 1 controller.go:193] kmm "msg"="Starting Controller" "controller"="preflightvalidationocp" "controllerGroup"="kmm.sigs.x-k8s.io" "controllerKind"="PreflightValidationOCP"
I0228 09:36:40.786489 1 controller.go:193] kmm "msg"="Starting Controller" "controller"="PodNodeModule" "controllerGroup"="" "controllerKind"="Pod"
Gathering data for KMM-Hub
Procedure
Gather the data for the KMM Operator hub controller manager:
Set the MUST_GATHER_IMAGE variable:

$ export MUST_GATHER_IMAGE=$(oc get deployment -n openshift-kmm-hub kmm-operator-hub-controller -ojsonpath='{.spec.template.spec.containers[?(@.name=="manager")].env[?(@.name=="RELATED_IMAGE_MUST_GATHER")].value}')
Use the -n <namespace> switch to specify a namespace if you installed KMM in a custom namespace.

Run the must-gather tool:

$ oc adm must-gather --image="${MUST_GATHER_IMAGE}" -- /usr/bin/gather -u
View the Operator logs:
$ oc logs -fn openshift-kmm-hub deployments/kmm-operator-hub-controller
Example output
I0417 11:34:08.807472 1 request.go:682] Waited for 1.023403273s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/tuned.openshift.io/v1?timeout=32s
I0417 11:34:12.373413 1 listener.go:44] kmm-hub/controller-runtime/metrics "msg"="Metrics server is starting to listen" "addr"="127.0.0.1:8080"
I0417 11:34:12.376253 1 main.go:150] kmm-hub/setup "msg"="Adding controller" "name"="ManagedClusterModule"
I0417 11:34:12.376621 1 main.go:186] kmm-hub/setup "msg"="starting manager"
I0417 11:34:12.377690 1 leaderelection.go:248] attempting to acquire leader lease openshift-kmm-hub/kmm-hub.sigs.x-k8s.io...
I0417 11:34:12.378078 1 internal.go:366] kmm-hub "msg"="Starting server" "addr"={"IP":"127.0.0.1","Port":8080,"Zone":""} "kind"="metrics" "path"="/metrics"
I0417 11:34:12.378222 1 internal.go:366] kmm-hub "msg"="Starting server" "addr"={"IP":"::","Port":8081,"Zone":""} "kind"="health probe"
I0417 11:34:12.395703 1 leaderelection.go:258] successfully acquired lease openshift-kmm-hub/kmm-hub.sigs.x-k8s.io
I0417 11:34:12.396334 1 controller.go:185] kmm-hub "msg"="Starting EventSource" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule" "source"="kind source: *v1beta1.ManagedClusterModule"
I0417 11:34:12.396403 1 controller.go:185] kmm-hub "msg"="Starting EventSource" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule" "source"="kind source: *v1.ManifestWork"
I0417 11:34:12.396430 1 controller.go:185] kmm-hub "msg"="Starting EventSource" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule" "source"="kind source: *v1.Build"
I0417 11:34:12.396469 1 controller.go:185] kmm-hub "msg"="Starting EventSource" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule" "source"="kind source: *v1.Job"
I0417 11:34:12.396522 1 controller.go:185] kmm-hub "msg"="Starting EventSource" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule" "source"="kind source: *v1.ManagedCluster"
I0417 11:34:12.396543 1 controller.go:193] kmm-hub "msg"="Starting Controller" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule"
I0417 11:34:12.397175 1 controller.go:185] kmm-hub "msg"="Starting EventSource" "controller"="imagestream" "controllerGroup"="image.openshift.io" "controllerKind"="ImageStream" "source"="kind source: *v1.ImageStream"
I0417 11:34:12.397221 1 controller.go:193] kmm-hub "msg"="Starting Controller" "controller"="imagestream" "controllerGroup"="image.openshift.io" "controllerKind"="ImageStream"
I0417 11:34:12.498335 1 filter.go:196] kmm-hub "msg"="Listing all ManagedClusterModules" "managedcluster"="local-cluster"
I0417 11:34:12.498570 1 filter.go:205] kmm-hub "msg"="Listed ManagedClusterModules" "count"=0 "managedcluster"="local-cluster"
I0417 11:34:12.498629 1 filter.go:238] kmm-hub "msg"="Adding reconciliation requests" "count"=0 "managedcluster"="local-cluster"
I0417 11:34:12.498687 1 filter.go:196] kmm-hub "msg"="Listing all ManagedClusterModules" "managedcluster"="sno1-0"
I0417 11:34:12.498750 1 filter.go:205] kmm-hub "msg"="Listed ManagedClusterModules" "count"=0 "managedcluster"="sno1-0"
I0417 11:34:12.498801 1 filter.go:238] kmm-hub "msg"="Adding reconciliation requests" "count"=0 "managedcluster"="sno1-0"
I0417 11:34:12.501947 1 controller.go:227] kmm-hub "msg"="Starting workers" "controller"="imagestream" "controllerGroup"="image.openshift.io" "controllerKind"="ImageStream" "worker count"=1
I0417 11:34:12.501948 1 controller.go:227] kmm-hub "msg"="Starting workers" "controller"="ManagedClusterModule" "controllerGroup"="hub.kmm.sigs.x-k8s.io" "controllerKind"="ManagedClusterModule" "worker count"=1
I0417 11:34:12.502285 1 imagestream_reconciler.go:50] kmm-hub "msg"="registered imagestream info mapping" "ImageStream"={"name":"driver-toolkit","namespace":"openshift"} "controller"="imagestream" "controllerGroup"="image.openshift.io" "controllerKind"="ImageStream" "dtkImage"="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:df42b4785a7a662b30da53bdb0d206120cf4d24b45674227b16051ba4b7c3934" "name"="driver-toolkit" "namespace"="openshift" "osImageVersion"="412.86.202302211547-0" "reconcileID"="e709ff0a-5664-4007-8270-49b5dff8bae9"