OpenShift Virtualization (CNV/KubeVirt)

OpenShift Virtualization is a mature virtualization solution that combines the type-1 KVM hypervisor from Red Hat Enterprise Linux with the features and capabilities of OpenShift for managing hypervisor nodes, virtual machines, and their consumers.

Useful resources:

Example deployments

Tiny RHEL 9 VM with pod bridge network
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    app: rhel9-pod-bridge
    kubevirt.io/dynamic-credentials-support: "true"
  name: rhel9-pod-bridge
spec:
  dataVolumeTemplates:
    - apiVersion: cdi.kubevirt.io/v1beta1
      kind: DataVolume
      metadata:
        name: rhel9-pod-bridge
      spec:
        sourceRef:
          kind: DataSource
          name: rhel9
          namespace: openshift-virtualization-os-images
        storage:
          accessModes:
            - ReadWriteMany
          storageClassName: ocs-storagecluster-ceph-rbd-virtualization
          resources:
            requests:
              storage: 30Gi
  running: false
  template:
    metadata:
      annotations:
        vm.kubevirt.io/flavor: tiny
        vm.kubevirt.io/os: rhel9
        vm.kubevirt.io/workload: server
        kubevirt.io/allow-pod-bridge-network-live-migration: ""
      labels:
        kubevirt.io/domain: rhel9-pod-bridge
        kubevirt.io/size: tiny
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 1
          threads: 1
        devices:
          disks:
            - disk:
                bus: virtio
              name: rootdisk
            - disk:
                bus: virtio
              name: cloudinitdisk
          interfaces:
            - bridge: {}
              name: default
        machine:
          type: pc-q35-rhel9.2.0
        memory:
          guest: 1.5Gi
      networks:
        - name: default
          pod: {}
      terminationGracePeriodSeconds: 180
      volumes:
        - dataVolume:
            name: rhel9-pod-bridge
          name: rootdisk
        - cloudInitNoCloud:
            userData: |-
              #cloud-config
              user: cloud-user
              password: redhat
              chpasswd: { expire: False }
          name: cloudinitdisk
oc apply -f https://examples.openshift.pub/kubevirt/example/tiny-rhel-pod-bridge.yaml
Red Hat CoreOS with ignition & pod bridge network
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    app: rhcos-pod-bridge
    kubevirt.io/dynamic-credentials-support: "true"
  name: rhcos-pod-bridge
spec:
  dataVolumeTemplates:
    - apiVersion: cdi.kubevirt.io/v1beta1
      kind: DataVolume
      metadata:
        name: rhcos-pod-bridge
      spec:
        source:
          registry:
            pullMethod: node
            # openshift-install coreos print-stream-json | jq '.architectures.x86_64.images.kubevirt'
            url: docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ab118238b01765f103fe0739c0cd48ba10e745f25d5d1da202faf8c08b57fb58
        storage:
          accessModes:
            - ReadWriteMany
          storageClassName: ocs-storagecluster-ceph-rbd-virtualization
          resources:
            requests:
              storage: 30Gi
  running: false
  template:
    metadata:
      annotations:
        vm.kubevirt.io/flavor: tiny
        vm.kubevirt.io/os: rhcos
        vm.kubevirt.io/workload: server
        kubevirt.io/allow-pod-bridge-network-live-migration: ""
      labels:
        kubevirt.io/domain: rhcos-pod-bridge
        kubevirt.io/size: large
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 2
          threads: 1
        devices:
          disks:
            - disk:
                bus: virtio
              name: rootdisk
            - disk:
                bus: virtio
              name: cloudinitdisk
          interfaces:
            - bridge: {}
              name: default
        machine:
          type: pc-q35-rhel9.2.0
        memory:
          guest: 8Gi
      networks:
        - name: default
          pod: {}
      terminationGracePeriodSeconds: 180
      volumes:
        - dataVolume:
            name: rhcos-pod-bridge
          name: rootdisk
        - cloudInitConfigDrive:
            # Password hash
            #   podman run -ti --rm quay.io/coreos/mkpasswd --method=yescrypt
            # Password: redhat
            userData: |-
              {
                "ignition": {
                  "version": "3.2.0"
                },
                "passwd": {
                  "users": [
                    {
                      "name": "core",
                      "passwordHash": "$y$j9T$15cuONdoH5AKB62c9qTtD.$oOf4GqrwEnNzT7WuEFvkDuSOyv2xIx/z4EXzbQivdO0",
                      "sshAuthorizedKeys": [
                        "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAOfl+764UFbDkkxpsQYjET7ZAWoVApSf4I64L1KImoc rbohne@redhat.com"
                      ]
                    }
                  ]
                }
              }
          name: cloudinitdisk
oc apply -f https://examples.openshift.pub/kubevirt/example/rhcos-pod-bridge.yaml
Boot from ISO
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    app: beryllium
  name: beryllium
  namespace: demo-cluster-disco
spec:
  dataVolumeTemplates:
    - metadata:
        name: beryllium-root
      spec:
        storage:
          accessModes:
            - ReadWriteMany
          storageClassName: coe-netapp-nas
          resources:
            requests:
              storage: 80Gi
        source:
          blank: {}
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/domain: beryllium
    spec:
      volumes:
        - name: cdrom
          persistentVolumeClaim:
            claimName: beryllium-1-i386-hybrid
        - name: root
          dataVolume:
            name: beryllium-root
      networks:
        - name: coe
          multus:
            networkName: coe-bridge
        - name: disco
          multus:
            networkName: coe-br-vlan-69
      domain:
        cpu:
          cores: 4
        memory:
          guest: 8Gi
        resources:
          requests:
            memory: 8Gi
        devices:
          disks:
            - name: root
              bootOrder: 1
              disk:
                bus: virtio
            - name: cdrom
              bootOrder: 2
              cdrom:
                bus: sata
          interfaces:
            - bridge: {}
              # macAddress: 02:d8:6d:00:00:12
              model: virtio
              name: coe
            - bridge: {}
              # macAddress: 02:d8:6d:00:00:13
              model: virtio
              name: disco
oc apply -f https://examples.openshift.pub/kubevirt/example/boot-from-iso.yaml
Example Fedora with httpd cloud-init and network
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: fedora
spec:
  runStrategy: Always
  template:
    spec:
      domain:
        devices:
          disks:
            - disk:
                bus: virtio
              name: containerdisk
            - disk:
                bus: virtio
              name: cloudinit
          rng: {}
          interfaces:
            - bridge: {}
              model: virtio
              name: coe
        features:
          acpi: {}
          smm:
            enabled: true
        firmware:
          bootloader:
            efi:
              secureBoot: true
        resources:
          requests:
            memory: 1Gi
      terminationGracePeriodSeconds: 180
      networks:
        - multus:
            networkName: coe
          name: coe
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:41
        - name: cloudinit
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config

              users:
                - name: coe
                  lock_passwd: false
                  # redhat // mkpasswd --method=SHA-512 --rounds=4096
                  hashed_passwd: "$6$rounds=4096$kmUERoUZHwzYfQMJ$G70T2Qg24d0XUhu.GTCH7Ia1F0B/B48JqIFdzVfigeMgfG5nsxp3dEWFKokfXGmhuetFXl4l41L8t1AZgEDW0."
                  sudo: ['ALL=(ALL) NOPASSWD:ALL']
                  chpasswd: { expire: False }
                  groups: wheel
                  shell: /bin/bash
                  ssh_authorized_keys:
                    - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEQM82o2imwpHyGVO7DxCNbdE0ZWnkp6oxdawb7/MOCT coe-muc

              packages:
                - httpd

              # install puppet (and dependencies); make sure apache and postgres
              # both start at boot-time
              runcmd:
                - [ systemctl, enable, httpd.service ]
                - [ systemctl, start, httpd.service ]
oc apply -f https://examples.openshift.pub/kubevirt/networking/localnet-fedora-vm.yaml
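The `hashed_passwd` value above was generated with `mkpasswd --method=SHA-512 --rounds=4096`. If `mkpasswd` is not available, a comparable SHA-512 crypt hash can be produced with `openssl passwd` (a sketch; note that `openssl passwd -6` does not expose the `rounds=` option, so the resulting hash uses the default round count and will differ from the one shown above):

```shell
# Generate a SHA-512 crypt hash for cloud-init's hashed_passwd.
# "redhat" is the example password from this page; the salt is arbitrary.
openssl passwd -6 -salt kmUERoUZHwzYfQMJ redhat
```

The output has the form `$6$<salt>$<hash>` and can be pasted directly into the `hashed_passwd` field.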

Useful Commands

Configure a new number of CPUs for a VM

# Set the number of CPUs which you'd like to configure
export NEW_CPU=8

# This loop patches the CPU socket count of every VM in the current namespace
for VM in $(kubectl get vm -o jsonpath='{.items[*].metadata.name}'); do
    echo "Updating compute resources for VM: $VM"
    kubectl patch vm "$VM" --type='json' -p="[{'op': 'replace', 'path': '/spec/template/spec/domain/cpu/sockets', 'value': $NEW_CPU}]"
done

Set the OCP descheduler annotation to true or false

# Set the descheduler annotation value (true or false)
export DESCHEDULER=true

# Add the descheduler eviction annotation to every VM
for VM in $(kubectl get vm -o jsonpath='{.items[*].metadata.name}'); do
    echo "Updating descheduler annotation: $VM"
    kubectl patch vm "$VM" --type='json' -p="[{'op': 'add', 'path': '/spec/template/metadata/annotations/descheduler.alpha.kubernetes.io~1evict', 'value': '$DESCHEDULER'}]"
done

Create multiple VMs loop

for i in $(seq 1 15); do oc process -n openshift rhel9-server-medium -p NAME=vm${i} | oc apply -f -; done

Containerized Data Importer (CDI) / DataVolume

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: registry-image-datavolume
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    resources:
      requests:
        storage: 5Gi
  source:
    registry:
      url: docker://image-registry.openshift-image-registry.svc:5000/cnv-demo/build-vm-image-container:latest
      certConfigMap: "tls-certs"

Source: cdi-examples

OpenShift Virtualization & Container Storage

Recommended storage settings:

$ oc edit cm kubevirt-storage-class-defaults -n openshift-cnv

accessMode: ReadWriteMany
ocs-storagecluster-ceph-rbd.accessMode: ReadWriteMany
ocs-storagecluster-ceph-rbd.volumeMode: Block
ocs-storagecluster-cephfs.accessMode: ReadWriteMany
ocs-storagecluster-cephfs.volumeMode: Filesystem
volumeMode: Block

Build container image with OS disk

oc new-build --name cirros \
    --build-arg image_url=http://download.cirros-cloud.net/0.5.1/cirros-0.5.1-x86_64-disk.img \
    https://github.com/openshift-examples/cnv-container-disk-build.git

Local IIS build in my lab

qemu-img convert -f raw -O qcow2 disk.img iis.qcow2

cat - > Dockerfile <<EOF
FROM scratch
LABEL maintainer="Robert Bohne <robert.bohne@redhat.com>"
ADD iis.qcow2 /disk/rhel.qcow2
EOF

oc create is iis -n cnv

export REGISTRY=$(oc get route default-route -n openshift-image-registry -o jsonpath='{.spec.host}')
export REGISTRY_TOKEN=$(oc whoami -t)
podman login -u $(oc whoami) -p $REGISTRY_TOKEN --tls-verify=false $REGISTRY

podman build -t ${REGISTRY}/cnv/iis:latest .
podman push ${REGISTRY}/cnv/iis:latest

# Deploy template
oc apply -f https://raw.githubusercontent.com/openshift-examples/web/master/content/kubevirt/iis-template.yaml

Resource Capacity Calculation

At a high level, the process is: determine the virtualization resources needed (VM sizes, overhead, burst capacity, failover capacity); add the resources needed for cluster services (logging, metrics, ODF/ACM/ACS if hosted in the same cluster, etc.) and customer workload (hosted control planes, other Pods deployed to the hardware, etc.); then find a balance of node size versus node count.

CPU capacity calculation

Formula

(((physical_cpu_cores - odf_requirements - control_plane_requirements) * node_count * overcommitment_ratio) * (1 - ha_reserve_percent)) * (1 - spare_capacity_percent)
  • physical_cpu_cores = the number of physical cores available on the node.
  • odf_requirements = the amount of resources reserved for ODF. A value of 32 cores was used for the example architectures.
  • control_plane_requirements = the amount of CPU reserved for the control plane workload. A value of 4 cores was used for the example architectures.
  • node_count = the number of nodes with this geometry. For small, all nodes were equal. For medium, the nodes are mixed-purpose, so the previous steps would need to be repeated for each node type, taking into account the appropriate node type.
  • overcommitment_ratio = the amount of CPU overcommitment. A ratio of 4:1 was used for this document.
  • spare_capacity_percent = the amount of capacity reserved for spare/burst. A value of 10% was used for this document.
  • ha_reserve_percent = the amount of capacity reserved for recovering workload lost in the event of node failure. For the small example, a value of 25% was used, allowing for one node to fail. For the medium example, a value of 20% was used, allowing for two nodes to fail.
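
As a sketch, plugging hypothetical hardware values (64 physical cores per node, a 4-node cluster) together with the constants stated above into the formula:

```shell
# CPU capacity, per the formula above.
# Assumed (hypothetical): 64 physical cores per node, 4 nodes.
# From the text: 32 cores for ODF, 4 for the control plane,
# 4:1 overcommitment, 25% HA reserve, 10% spare capacity.
awk 'BEGIN {
  physical_cpu_cores = 64; node_count = 4
  odf_requirements = 32; control_plane_requirements = 4
  overcommitment_ratio = 4; ha_reserve = 0.25; spare_capacity = 0.10
  per_node = physical_cpu_cores - odf_requirements - control_plane_requirements
  vcpus = per_node * node_count * overcommitment_ratio * (1 - ha_reserve) * (1 - spare_capacity)
  printf "Schedulable vCPUs: %.1f\n", vcpus   # 302.4
}'
```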

Memory capacity calculation

Formula

((total_node_memory - odf_requirements - control_plane_requirements) * soft_eviction_threshold_percent * node_count) * (1 - ha_reserve_percent)
  • total_node_memory = the total physical memory installed on the node.
  • odf_requirements = the amount of memory assigned to ODF. A value of 72GiB was used for the example architectures in this document.
  • control_plane_requirements = the amount of memory reserved for the control plane functions. A value of 24GiB was used for the example architectures.
  • soft_eviction_threshold_percent = the value at which soft eviction is triggered to rebalance resource utilization on the node. Unless all nodes in the cluster exceed this value, it’s expected that the node will be below this utilization. A value of 90% was used for this document.
  • node_count = the number of nodes with this geometry. For small, all nodes were equal. For medium, the nodes are mixed-purpose, so the previous steps would need to be repeated for each node type, taking into account the appropriate node type.
  • ha_reserve_percent = the amount of capacity reserved for recovering workload lost in the event of node failure. For the small example, a value of 25% was used, allowing for one node to fail. For the medium example, a value of 20% was used, allowing for two nodes to fail.
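
With the same hypothetical cluster (4 nodes, here assumed to carry 512 GiB each) and the constants stated above, the formula works out as:

```shell
# Memory capacity, per the formula above.
# Assumed (hypothetical): 512 GiB per node, 4 nodes.
# From the text: 72 GiB for ODF, 24 GiB for the control plane,
# 90% soft-eviction threshold, 25% HA reserve.
awk 'BEGIN {
  total_node_memory = 512; node_count = 4
  odf_requirements = 72; control_plane_requirements = 24
  soft_eviction = 0.90; ha_reserve = 0.25
  per_node = total_node_memory - odf_requirements - control_plane_requirements
  mem = per_node * soft_eviction * node_count * (1 - ha_reserve)
  printf "Schedulable memory: %.1f GiB\n", mem   # 1123.2 GiB
}'
```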

ODF capacity calculation

Formula

(((disk_size * disk_count) * node_count) / replica_count) * (1 - utilization_percent)
  • disk_size = the size of the disk(s) used. 4TB and 8TB disks were used in the example architectures.
  • disk_count = the number of disks of disk_size in the node.
  • node_count = the number of nodes with this geometry. For small, all nodes were equal. For medium, the nodes are mixed-purpose, so the previous steps would need to be repeated for each node type taking into account the appropriate node type.
  • replica_count = the number of copies ODF stores of the data for protection/resiliency. A value of 3 was used for this document.
  • utilization_percent = the desired threshold of capacity used in the ODF instance. A value of 65% was used for this document.
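
Following the formula above verbatim, with an assumed hypothetical layout of 8 × 4 TB disks per node across 4 nodes and the replica count and utilization threshold stated in the text:

```shell
# ODF capacity, applying the formula above as written.
# Assumed (hypothetical): 8 disks of 4 TB per node, 4 nodes.
# From the text: replica count 3, 65% utilization threshold.
awk 'BEGIN {
  disk_size = 4; disk_count = 8; node_count = 4
  replica_count = 3; utilization = 0.65
  tb = (((disk_size * disk_count) * node_count) / replica_count) * (1 - utilization)
  printf "Capacity: %.1f TB\n", tb   # 14.9 TB
}'
```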

Resources and useful articles


Last update: 2025-11-28 · Created: 2019-10-08 · Contributors: Robert Bohne, Robert Guske