Upgrade compute nodes
Upgrade GPU driver
- Verify that the target driver is available at https://yumrepo.nrp-nautilus.io/nvidia/. If not, obtain the driver file from nvidia.com and transfer it to the pod repo in the kernel namespace, in the directory /repo/nvidia. Make sure the name of the driver file is in the format NVIDIA-Linux-<architecture>-<driver-version>.run, for example NVIDIA-Linux-x86_64-550.76.run.
- Pull the latest ansible playbook repo from https://gitlab.nrp-nautilus.io/prp/nautilus-ansible and update the nvidia_version variable. The nvidia_version variable can be set under a single host in nautilus-hosts.yaml, or in the group variable files under the group_vars directory, depending on which nodes will get the GPU driver update.
- Run the command ansible-playbook upgrade.yml -l <node> -t gpu -e netbox_token=<your-netbox-token>. With the extra variable netbox_token, a record of the node reboot due to the GPU upgrade will be generated in https://netbox-3.nrp-nautilus.io, but it is optional.
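The naming rule for the driver file in the first step is easy to get wrong when renaming a download by hand. The following is a minimal sketch of a check that could be run before transferring the file. The helper names check_driver_name and driver_version are hypothetical and are not part of the nautilus-ansible repo, and the accepted pattern is an assumption based on the format and example given above.

```shell
# Hypothetical helper: succeeds only if the file name follows
# NVIDIA-Linux-<architecture>-<driver-version>.run
check_driver_name() (
  name=$(basename "$1")
  printf '%s\n' "$name" |
    grep -Eq '^NVIDIA-Linux-[A-Za-z0-9_]+-[0-9]+(\.[0-9]+)+\.run$'
)

# Hypothetical helper: prints the version embedded in the file name,
# i.e. everything after the last "-" once the .run suffix is removed.
driver_version() (
  name=$(basename "$1" .run)
  printf '%s\n' "${name##*-}"
)
```

For example, check_driver_name NVIDIA-Linux-x86_64-550.76.run succeeds, and driver_version on the same file prints the string that would presumably be used for nvidia_version.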
Upgrade all packages
Pull the latest ansible playbook repo and run ansible-playbook upgrade.yml -l <node> -t os -e netbox_token=<your-netbox-token>.
Upgrade kubernetes
Pull the latest ansible playbook repo, and edit the file group_vars/all with the matching versions of kubernetes, kubernetes_repo and crio.
Here's an example of the configs:
The value of kubernetes_repo should include the patch version and match the version of kubernetes installed on the master node, for example, 1.26.11. The value of crio_version only applies to nodes that are running the crio runtime instead of containerd.
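To find the patch version to match, one option is to read the version the master node reports and strip the leading "v". The sketch below assumes the kubelet version reported by the node is the relevant one. The helper name to_patch_version and the placeholder <master-node> are hypothetical.

```shell
# Hypothetical helper: turns a reported version such as "v1.26.11"
# into the bare patch version "1.26.11", dropping any "-" or "+" suffix.
to_patch_version() (
  v=${1#v}
  printf '%s\n' "${v%%[-+]*}"
)

# Example (needs cluster access; <master-node> is a placeholder):
# to_patch_version "$(kubectl get node <master-node> \
#   -o jsonpath='{.status.nodeInfo.kubeletVersion}')"
```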
After updating the versions, run ansible-playbook upgrade.yml -l <node> -t kubernetes.
Upgrade Ubuntu release
Pull the latest ansible playbook repo and run ansible-playbook upgrade.yml -l <node> -t dist-upgrade -e netbox_token=<your-netbox-token>.
Note that a release upgrade requires all installed packages to be at their latest version. If it complains, upgrade all packages first.
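Because the release upgrade depends on the package upgrade, the two runs can be chained so that the second only starts if the first succeeds. This is a rough sketch, not part of the playbook repo. The function name release_upgrade and the RUN and NETBOX_TOKEN environment variables are hypothetical, and it assumes netbox_token is optional for these tags as it is for the GPU upgrade.

```shell
# Hypothetical wrapper: upgrade all packages, then the Ubuntu release,
# stopping at the first failure.
# RUN defaults to "echo", so the commands are only printed;
# set RUN="" to actually execute them.
release_upgrade() (
  node=$1
  run=${RUN-echo}
  # Pass netbox_token only when NETBOX_TOKEN is set and non-empty.
  $run ansible-playbook upgrade.yml -l "$node" -t os \
    ${NETBOX_TOKEN:+-e netbox_token=$NETBOX_TOKEN} &&
  $run ansible-playbook upgrade.yml -l "$node" -t dist-upgrade \
    ${NETBOX_TOKEN:+-e netbox_token=$NETBOX_TOKEN}
)
```

Calling release_upgrade <node> with the defaults prints both commands in order, which makes it possible to review them before running with RUN="".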
Run all of the above upgrades at once
Pull the latest ansible playbook repo and run ansible-playbook upgrade.yml -l <node> -e netbox_token=<your-netbox-token>.