January 2016 Downtime Schedule

The January downtime is complete.

Please take note of the following changes and submit a ticket to CCR Help if you are experiencing problems:


What has changed for users:

  • Home directory quotas have been increased from 2GB to 5GB
  • Modules software has been updated.  This should be transparent to users, but see this knowledgebase article for more details
  • Project directories for some groups have moved.  If your project directory was in /projects, it has been moved to /projects/academic/<group_name> unless you've been notified otherwise
  • Changes to software installations: /util is no longer mounted as /gpfs/util.  If you receive errors while running code, please notify CCR Help
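If your job scripts still reference the old /gpfs/util mount point, they will need to point at /util instead. A minimal sketch of one way to update a script (the file name and contents here are hypothetical; review matches before rewriting anything in place):

```shell
# Demo in a scratch directory: create a sample job script that uses the
# old mount point, then rewrite it to the new path, keeping a .bak backup.
tmp=$(mktemp -d)
printf 'export PATH=/gpfs/util/bin:$PATH\n' > "$tmp/myjob.sh"

# Find scripts that still reference the old path...
grep -rl '/gpfs/util' "$tmp"

# ...then rewrite /gpfs/util -> /util in place (original saved as .bak).
sed -i.bak 's|/gpfs/util|/util|g' "$tmp/myjob.sh"
cat "$tmp/myjob.sh"   # now references /util/bin
```

The same grep/sed pattern works for the /projects to /projects/academic/<group_name> move; just adjust the search and replacement paths for your group.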


------------------------------------------------------------------------------------------------------------------------------------------------------------

Below you'll find the tentative schedule for the center-wide downtime, January 17-20, 2016.  More details on what this affects can be found here


Sunday, January 17, beginning at 6pm:

  • Both academic (rush) and industry (presto) cluster schedulers are stopped; all running jobs are killed and all queued jobs are removed - COMPLETED
  • SLURM scheduler database update begins - expected update run time 24 hours - COMPLETED
  • Final data sync from GPFS storage to new IFS storage begins - expected sync run time 12 hours - COMPLETED


Monday, January 18, beginning at 7am - Tuesday, January 19 (at minimum; depending on sync times & database updates, this may run into Wednesday):

  • Update of SLURM controllers - COMPLETED
  • Migrate virtual servers to IFS storage and bring online - COMPLETED
  • Update additional administrative and faculty group servers - COMPLETED
  • Update and test front-end servers - COMPLETED
  • Reboot cluster nodes & verify new storage - COMPLETED
  • Update modules software & links - COMPLETED
  • Run test jobs & stress test Isilon - COMPLETED
  • Update WebMO - COMPLETED