December 2024: Monthly Maintenance Downtime (12/20/24)

Date of downtime: FRIDAY, December 20, 2024

NOTE: The date of this downtime has been changed to accommodate the end of the semester and multiple holidays at the end of December. Thank you for your understanding!


Approximate time of outage: 7am-5pm


Resources affected by downtime:

UB-HPC cluster (all partitions) 

Faculty cluster (all partitions)

Portals: OnDemand, ColdFront (temporarily), IDM (temporarily) 


What will be done:  

  • Reboot of all cluster nodes
  • Updates of front-end login nodes (login1/2, vortex-future)
  • Infrastructure service node updates
  • Decommissioning of Skylake compute nodes (racks i05, i06, i07, i08). These nodes have reached 7 years of age and are being removed from service.
  • Updated versions of cluster-related services and packages:
    • Apptainer v1.3.5 -> v1.3.6
    • NVIDIA driver v535.216.01 -> v535.216.03
    • Slurm v24.05.4 -> v24.05.5


Effects on user workflows:

  • Removing the Skylake compute nodes will remove the MRI tag from Slurm
  • We do not anticipate that any of these updates will cause system problems. However, the NVIDIA driver update on the compute nodes may affect some users' workflows. Please report any suspected problems to CCR Help, with details, so that we may attempt to replicate them.
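
Because the MRI tag is being removed from Slurm, job scripts that still request it may need to be updated. The following is a minimal sketch, not an official CCR tool, for finding such scripts. It assumes the tag is requested as a Slurm node feature through `#SBATCH --constraint` or `#SBATCH -C` directives; the function name `lines_requesting_tag` is hypothetical and used only for illustration.

```python
import re

# Matches "#SBATCH --constraint=..." or "#SBATCH -C ..." directives.
# Assumption: the MRI tag is requested as a node feature via --constraint / -C.
CONSTRAINT_RE = re.compile(r"^\s*#SBATCH\s+(?:--constraint(?:=|\s+)|-C\s*)(\S+)")


def lines_requesting_tag(script_text, tag="MRI"):
    """Return (line_number, line) pairs whose constraint expression names `tag`."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = CONSTRAINT_RE.match(line)
        if not match:
            continue
        # Constraint expressions may combine features with & | , [ ] ( ) and *.
        expression = match.group(1).strip("\"'")
        features = re.split(r"[&|,\[\]()*]", expression)
        if tag in features:
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # Usage: python check_mri_tag.py job1.sh job2.sh ...
    for path in map(Path, sys.argv[1:]):
        text = path.read_text(errors="replace")
        for number, line in lines_requesting_tag(text):
            print(f"{path}:{number}: {line}")
```

The sketch only inspects `#SBATCH` directives inside script files. Constraints passed on the `sbatch` or `srun` command line, or set by other means, would need to be checked separately.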