Duration:
12/30/2024 9:00 am — 12/30/2024 6:00 pm
Group Responsible:
IT Fabric
Affected Area:
All SDCC services
Expected Impact:
No access to SDCC resources (computing, storage and services)
Maintenance Type:
Planned Maintenance/Downtime
Description:
A critical maintenance/replacement procedure on the BNL main electrical grid, scheduled for Monday, Dec. 30th, was announced to the SDCC on very short notice last week. This procedure is planned to start around 12 noon and last approximately 4 hours.

We recognize this procedure is happening during the BNL-declared "quiet period", but a postponement would incur increased costs to the Lab and potentially place this must-do procedure during the start-up period for RHIC run 25, which is deemed even less desirable than the current plan. BNL management has decided to go ahead with the Dec. 30th procedure, as planned.

This procedure requires transferring the power source from the electrical utility to the back-up generator, with a UPS to bridge the time gap (a few seconds) between utility and generator power, and then remaining on generator power for the duration of the procedure. Because there is a small risk of failure during the transfer process and in generator operations, and because of reduced staff availability during the BNL quiet period, the SDCC management has decided to quiet down the facility resources to minimize the chances of data corruption, service disruptions and hardware failures in the unlikely event that an unplanned power outage occurs.

Quieting down means: 1) draining batch jobs (HTCondor and Slurm), holding new ones from starting and stopping interactive access to SDCC CPU resources on SUNDAY (DEC. 29TH) AT 3 PM ET and 2) stopping all data read/write and movement activities (disk and tape) on MONDAY (DEC. 30TH) AT 9 AM ET.

Announcements to SDCC Liaisons and program/experimental PoCs will be made when SDCC resources are fully available again.