Special use

Our cluster combines hardware resources from multiple universities and other organizations. By default, your pods can be scheduled only on nodes that have no taints (see the resources page of the portal).

All taints

Here is the full list of taints on the nodes. To run on a tainted node, add the matching toleration to your pod. Some tolerations are set automatically on deployed jobs by the cluster, and some can be used only by privileged users. Refer to this list and set only the ones that cluster admins have allowed you to use.

| Taint | Purpose | Who can set manually |
|---|---|---|
| nautilus.io/arm64=true:NoSchedule | ARM64 node. Make sure your image supports it. | All |
| nautilus.io/ceph=true:NoSchedule | Don't run any user jobs on Ceph storage nodes. | No |
| nautilus.io/csusb=true:NoSchedule | CSUSB reserved nodes. | Namespaces: csusb-chaseci, csusb-hpc, csusb-cousins-lab, csusb-jupyterhub, csusb-mpi, csusb-salloum, prp-dvu-csusb |
| nautilus.io/haosu=true:NoSchedule | Private Hao Su cluster. | Namespaces: ucsd-haosulab, ucsd-ravigroup, mc-lab |
| nautilus.io/iclr-ece3d-vision=true:NoSchedule | Securing resources for ICLR deadline (until 9/29). | Namespaces: ece3d-vision |
| nautilus.io/iclr=true:NoSchedule | Securing resources for ICLR deadline (until 9/29). | Namespaces: ucsd-haosulab |
| nautilus.io/large-gpu=true:NoSchedule | Node accepts 4- and 8-GPU jobs only. Set automatically. | No |
| nautilus.io/noceph=true:NoSchedule | Ceph will not mount on the node. Otherwise the node is fine. | All |
| nautilus.io/prism-center=true:NoSchedule | Reserved for PRISM Center. | No |
| nautilus.io/science-dmz=true:NoSchedule | Node can only access the science DMZ network, and not the public Internet. | All |
| nautilus.io/stashcache=true:NoSchedule | Private OSG nodes. | No |
| nautilus.io/testing=true:NoSchedule | Node is broken. | No |
| nvidia.com/gpu=Exists:PreferNoSchedule | Fence GPU nodes from CPU jobs. This is a preference only: jobs can still land on the node if there are no free CPU nodes. | No |
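
As an illustration of how a toleration from the list is attached to a pod, here is a minimal sketch that tolerates the nautilus.io/arm64 taint, which any user may set. The pod name, container name, image, and command are hypothetical placeholders and not part of the cluster documentation; the image you choose must actually provide an ARM64 build.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: arm64-example            # hypothetical name
spec:
  restartPolicy: Never
  tolerations:
  # Matches the taint nautilus.io/arm64=true:NoSchedule from the list above.
  - key: nautilus.io/arm64
    operator: Equal
    value: "true"                # taint values are strings, so quote it
    effect: NoSchedule
  containers:
  - name: main                   # hypothetical container
    image: ubuntu:22.04          # assumption: a multi-arch image with an arm64 variant
    command: ["uname", "-m"]     # prints the machine architecture of the node
```

A toleration only permits scheduling on the tainted nodes. It does not force the pod onto them, so the pod may still land on an untainted node.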

Running in group namespaces

Our cluster contains several sets of nodes dedicated to certain groups.

Users can restrict a pod to the group's nodes only by using a nodeAffinity such as:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: nautilus.io/group
            operator: In
            values:
            - group1

Use this for large jobs to avoid taking over all shared cluster resources. Optionally, a higher priority can be used for such jobs (talk to admins before using one).
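
For context, the sketch below shows where a priority class would sit in a full pod manifest alongside the group nodeAffinity. The `priorityClassName` field is a standard Kubernetes pod field, but the class name here is a placeholder: the documentation does not name one, so use only the value the admins give you. The pod name, container name, image, and command are likewise hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: group-large-job                   # hypothetical name
spec:
  restartPolicy: Never
  # Placeholder value: replace with the class name provided by cluster admins.
  priorityClassName: example-group-priority
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: nautilus.io/group
            operator: In
            values:
            - group1                      # group name as in the fragment above
  containers:
  - name: main                            # hypothetical container
    image: ubuntu:22.04                   # hypothetical image
    command: ["sh", "-c", "echo running on group nodes"]
```

If the named priority class does not exist in the cluster, the API server rejects the pod, which is another reason to confirm the name with admins first.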

Other taints

Some nodes in the cluster have no access to the public Internet and can reach only the educational network. They can still pull images from Docker Hub through a proxy.

If your workload does not use public Internet resources, you can tolerate the nautilus.io/science-dmz taint to get access to additional nodes.
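
A minimal sketch of such a toleration, written as a fragment to merge into your own pod spec. It uses the `Exists` operator, which matches the taint key regardless of its value; an `Equal` match on the value "true" would work as well.

```yaml
spec:
  tolerations:
  # Allows scheduling on nodes tainted nautilus.io/science-dmz=true:NoSchedule.
  - key: nautilus.io/science-dmz
    operator: Exists             # no value field is given with Exists
    effect: NoSchedule
```

Pods that tolerate this taint may run on nodes without public Internet access, so anything the workload downloads at run time has to be reachable from the science DMZ network.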