January 2024: CENTERWIDE Monthly Maintenance Downtime (1/16/24)

Date of downtime: Tuesday, January 16, 2024


Approximate time of outage: 7am-5pm


Resources affected by downtime:

Lake Effect Research cloud

Globus data servers

UB-HPC cluster (all partitions) 

Faculty cluster (all partitions)

Portals: OnDemand, ColdFront, Identity Management (IDM)


What will be done:  

  • Edge switch updates will result in a short (approximately 5-minute) outage of all networking at CCR.  This affects not only the clusters but also the Lake Effect cloud and Globus services.
  • Reboot of all cluster nodes
  • Updates of front-end login nodes (vortex1/2, vortex-future) and OnDemand
  • Infrastructure services updated
  • Default software release changing to ccrsoft/2023.01

IMPORTANT:  Potentially breaking changes for cluster users:

  • Default software release changing:  The default software release on all login nodes and compute nodes of all clusters will be changing FROM ccrsoft/legacy TO ccrsoft/2023.01.  See here for more info on software environments.  If you already have your default software release set, you will not be affected.

    There may be some confusion as to whether you are using legacy software.  To be clear:
    - If you created a CCR account after August 8, 2023, your account automatically points to the latest software release, ccrsoft/2023.01.  The only way you would be using legacy software is if you changed your account to pin ccrsoft/legacy or you run "module load ccrsoft/legacy".
    - User accounts created prior to August 2023 were defaulted to the legacy software.  If you have not yet updated to ccrsoft/2023.01 and have continued to use the default on CCR systems, you WILL BE AFFECTED by the changes made during this downtime.

    Still not sure?  Check the .modulerc file in your home directory.  More info here.


  • If you wish to continue using the ccrsoft/legacy release, you must ensure the following: 
    • Load the ccrsoft/legacy module prior to loading any other modules, OR set your default release following these directions
    • Use nodes tagged LEGACY to ensure your jobs run properly.  Use --constraint=LEGACY in job scripts or interactive job requests, or specify LEGACY in the "node features" box of OnDemand apps.  Without this, your jobs will likely land on nodes where the ccrsoft/legacy environment is not supported, and they will fail.
  • The ubhpc-future reservation is being removed.  All nodes that were in this reservation will be available without special request.  Those using the latest software release can run on any compute node.  
  • GPU usage:  CCR's newest GPU nodes have the latest NVIDIA drivers installed on them and should be used with the latest software release. To ensure your job gets allocated on a newer GPU node, we recommend specifying --constraint=NOTLEGACY in your Slurm job script or in your interactive session request. If using OnDemand, you can specify NOTLEGACY in the "Node Features" box of the application forms.
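As a concrete illustration, a minimal Slurm batch script pinning a job to legacy-tagged nodes might look like the sketch below.  The job name, partition, and time limit are placeholder values, not CCR-specific guidance; adjust them for your own allocation:

```bash
#!/bin/bash
#SBATCH --job-name=legacy-example     # placeholder name
#SBATCH --partition=general-compute   # placeholder; use your usual partition
#SBATCH --constraint=LEGACY           # restrict the job to nodes tagged LEGACY
#SBATCH --time=00:10:00
#SBATCH --ntasks=1

# Load the legacy software release BEFORE loading any other modules
module load ccrsoft/legacy
module list
```

For jobs that use the latest software release and should land on the newer GPU nodes, swap the feature tag, e.g. `#SBATCH --constraint=NOTLEGACY`.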

Jobs will be held in queue during the maintenance downtime and will run after the updates are complete.  
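If you want to confirm that your jobs are still queued and waiting (rather than failed) during or after the downtime, the standard Slurm queue listing can be filtered to your own user.  The format string shown is just one example layout:

```bash
# List your own jobs; pending jobs show state PENDING and a reason code
squeue --user=$USER --format="%.10i %.20j %.8T %.20R"
```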


If you have any questions or concerns, please e-mail ccr-help_at_buffalo.edu