9/1/21: RESOLVED: Problems with Panasas scratch

9:45pm: The Panasas storage is back online and has been re-mounted on the OnDemand server.  Any jobs that were actively using Panasas at the time of the outage will have seen errors and may have ended prematurely.  Unfortunately, this is the nature of high-speed scratch systems; they are volatile.  All the jobs that were in the 'CG' state a few hours ago have cleared the queue, and new jobs are starting.  Any nodes remaining in a bad state will be addressed Thursday morning.


8pm:  We're actively working with the vendor to resolve these issues.  Thanks for your patience!


5:15pm: Panasas has been unmounted on the OnDemand server, so logins are now working again.  However, you will not be able to access /panasas/scratch until the problem is resolved.


We are aware of issues with the Panasas file system.  These are causing problems with logins to OnDemand as well as with access to /panasas/scratch on the front-end login nodes.  You will also get errors if your jobs are running in /panasas/scratch.  Nodes will likely begin hanging because they can't reach the storage, and jobs on those nodes will fail to complete.  Those jobs will show up in the 'CG' state and prevent new jobs from starting.  We will address this after the storage is brought back online.


We are currently investigating the problem and will work as quickly as possible to get the system back online.