March 2024: Monthly Maintenance Downtime (3/26/24)

NOTE: Racks u26 and u27 of the faculty cluster remain offline for updates.  We apologize for any inconvenience.


Date of downtime: Tuesday, March 26, 2024


Approximate time of outage: 7am-5pm


Resources affected by downtime:

UB-HPC cluster (all partitions) 

Faculty cluster (all partitions)

Portals: OnDemand, ColdFront


What will be done:  

  • Reboot of all cluster nodes
  • Updates of front-end login nodes (vortex1/2, vortex-future) and OnDemand
  • Infrastructure services updated
  • Conversion of faculty racks v12, u26, u27, and u28 from the LEGACY environment to the NOTLEGACY environment.  Anyone using the nodes in these racks will need to use the latest software release.
  • Singularity/Apptainer removed from login nodes.  To use containers, please use the compile nodes or any compute node tagged NOTLEGACY.  See here for documentation.
  • Two new login nodes have been added to the vortex.ccr.buffalo.edu pool.  You may be logged into login1 or login2, just as you would previously have seen vortex1 and vortex2.  This is expected; the new nodes are configured exactly the same as vortex-future.  You may notice additional issues if you are a long-time user of CCR's systems or were still using legacy software.  Here are a few things you may run into:
    1. User accounts created before January 2023 may have a .bashrc file with outdated formatting.  We recommend the following clean .bashrc file, which goes in your home directory:


# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

if [ -f /etc/bash.bashrc ]; then
    . /etc/bash.bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
# enable color support of ls and also add handy aliases
if [ -x /usr/bin/dircolors ]; then
    test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
    alias ls='ls --color=auto'
    alias grep='grep --color=auto'
    alias fgrep='fgrep --color=auto'
    alias egrep='egrep --color=auto'
fi



After the last line, you can add lines for things like aliases or exported environment paths.  We recommend extreme caution when setting environment variables, as these could break things done automatically by CCR's profile scripts.  Incorrect syntax in a bash environment script can cause login and job submission problems.  If in doubt, please submit a ticket to CCR Help for assistance.  There are some things you especially should NOT do in your .bashrc file.  These will definitely break job submissions, OnDemand desktops and apps, and CCR's quota scripts:

- Load modules
- Use the "module use" statement.  See here for instructions on sourcing your group's modules built with EasyBuild, or run the 'module use' command AFTER you log in.
- Activate a conda environment
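
As a rough self-check for the three items above, the following sketch scans a .bashrc for lines that begin with "module load", "module use", or "conda activate".  The function name check_bashrc is hypothetical and not a CCR tool.  It only catches statements at the start of a line (commented-out lines are ignored), so treat a clean result as a hint rather than a guarantee.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by CCR): print, with line numbers, any
# line in the given file that starts with "module load", "module use", or
# "conda activate". Returns 0 if at least one such line is found.
check_bashrc() {
    grep -nE '^[[:space:]]*(module[[:space:]]+(load|use)|conda[[:space:]]+activate)' "$1"
}

# Scan your own .bashrc, if you have one.
if [ -f "$HOME/.bashrc" ] && check_bashrc "$HOME/.bashrc"; then
    echo "The lines listed above should be removed from ~/.bashrc."
fi
```

Run the module or conda commands by hand after you log in, or inside your job script, instead of placing them in .bashrc.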


2. The new login nodes do not accept RSA SSH keys.  You must update to a newer SSH key format.  See here for our most up-to-date documentation.
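
To see whether any of your existing keys are affected, a sketch like the one below looks at the public key files in ~/.ssh and reports those whose type field is "ssh-rsa".  The function name is_rsa_pubkey is hypothetical, and the Ed25519 suggestion in the final comment is an assumption on our part; the linked documentation is the authority on which key type to create.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given public key file holds an RSA key,
# that is, if the first field of its first line is "ssh-rsa".
is_rsa_pubkey() {
    [ "$(awk 'NR==1 {print $1}' "$1")" = "ssh-rsa" ]
}

# Report every RSA public key found in ~/.ssh.
for pub in "$HOME"/.ssh/*.pub; do
    [ -e "$pub" ] || continue   # no .pub files present
    if is_rsa_pubkey "$pub"; then
        echo "RSA key, needs replacing: $pub"
    fi
done

# To create a replacement key (Ed25519 is assumed here, not stated above):
#   ssh-keygen -t ed25519
```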


3. The new login nodes do not accept outdated ciphers.  We're following current security best practices for SSH ciphers, and the new login nodes support the following:

- chacha20-poly1305@openssh.com

- aes128-ctr

- aes192-ctr

- aes256-ctr

- aes128-gcm@openssh.com

- aes256-gcm@openssh.com 


These are recommended by most current Linux distributions; see here for reference.  Per recommended guidelines, all cipher block chaining (CBC) ciphers have been removed, as they are vulnerable and no longer recommended.  To fix this issue, we suggest upgrading your SSH client or switching to a newer one that supports these ciphers.
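
If you are unsure whether your client is affected, the sketch below compares the ciphers your OpenSSH client reports (via "ssh -Q cipher") with the list above.  The variable CCR_CIPHERS and the function common_ciphers are hypothetical names used only for this illustration, and the check assumes an OpenSSH client; other clients list their ciphers differently.

```shell
#!/usr/bin/env bash
# Ciphers listed in this announcement as supported by the new login nodes.
CCR_CIPHERS="chacha20-poly1305@openssh.com
aes128-ctr
aes192-ctr
aes256-ctr
aes128-gcm@openssh.com
aes256-gcm@openssh.com"

# Hypothetical helper: read cipher names on stdin (one per line) and print
# only those that exactly match an entry in CCR_CIPHERS.
common_ciphers() {
    grep -Fx "$CCR_CIPHERS"
}

# Show which of your client's ciphers the new login nodes will accept.
if command -v ssh >/dev/null 2>&1; then
    ssh -Q cipher | common_ciphers \
        || echo "No accepted ciphers found; consider upgrading your SSH client."
fi
```

If at least one cipher is printed, your client should be able to negotiate a connection without any changes.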


4. Having issues with jobs washing out of the queue?  Jobs submitted from a login node within a directory in /panasas/scratch are not working.  See here for more details.
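
Until you have read the details, a small guard like the following can warn you before you submit from an affected location.  It is only a sketch: the function name in_panasas_scratch is hypothetical, and it checks the path as your shell reports it, so a directory reached through a differently named link would not be detected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given directory (default: the current
# working directory) is /panasas/scratch or anything beneath it.
in_panasas_scratch() {
    case "${1:-$PWD}" in
        /panasas/scratch|/panasas/scratch/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn before submitting a job from an affected directory.
if in_panasas_scratch; then
    echo "Warning: you are in /panasas/scratch; jobs submitted from here may fail." >&2
fi
```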


