  • Star GPFS back up

    By SDCC News |

    Duration:
    2/12/2024 10:00 pm — 2/12/2024 10:01 pm

    Group Responsible:
    IT Services

    Affected Area:
    Star GPFS

    Expected Impact:
    access is available

    Maintenance Type:
    Information

    Description:
    Access to Star GPFS (gpfs/gpfs01) has been restored.

  • Star GPFS down

    By SDCC News |

    Duration:
    2/12/2024 7:23 am — 2/13/2024 7:27 am

    Group Responsible:
    IT Services

    Affected Area:
    Star GPFS

    Expected Impact:
    access is unavailable until issue is resolved

    Maintenance Type:
    Unplanned/Outage

    Description:
    The Star GPFS file system experienced serious errors over the weekend and is currently offline. We have opened a ticket with the vendor and are working to get this resolved. We will send updates as they become available.

  • US ATLAS dCache storage maintenance

    By SDCC News |

    Duration:
    1/22/2024 9:00 am — 1/22/2024 1:00 pm

    Group Responsible:
    Services & Tools

    Affected Area:
    US ATLAS dCache storage service

    Expected Impact:
    US ATLAS dCache storage service interruption

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    The US ATLAS storage system is scheduled for an upgrade to the latest software release, dCache 9.2. The storage system will be inaccessible during this maintenance, from 9:00 AM to 1:00 PM (EST) on January 22, 2024. We apologize in advance for any inconvenience this work may cause.

  • eicoss02 Lustre maintenance

    By SDCC News |

    Duration:
    1/5/2024 12:00 pm — 1/5/2024 5:00 pm

    Group Responsible:
    IT Services

    Affected Area:
    EIC Lustre

    Expected Impact:
    The Lustre file system for EIC won't be available.

    Maintenance Type:
    Unplanned/Outage

    Description:
    Due to hardware issues, eicoss02.sdcc.bnl.local needs to be brought offline.

  • Site wide downtime on December 18th/19th

    By SDCC News |

    Duration:
    12/18/2023 5:00 pm — 12/19/2023 9:00 pm

    Group Responsible:
    Network

    Affected Area:
    All SDCC services

    Expected Impact:
    No access to SDCC services during downtime

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    This extended downtime is necessary for two key activities: upgrading user home directories beginning on Monday (for all programs except NSLS) and network maintenance scheduled for Tuesday (affecting all customers). This maintenance marks the final phase of the multi-year migration of SDCC services to a new data center.

    Impact on Services:
    - Storage and Data Transfer Services: Access will be suspended starting Monday, December 18th, at 5 PM EST to allow users’ home directories to be cloned. Special Note for NSLS Program Users: The NFS home directory work scheduled from Monday to Tuesday does not impact the NSLS program. NSLS program users should expect regular operation during this period (until the network maintenance begins on Tuesday).
    - Computing Resources: Access to computing resources (interactive and batch), storage services (disk and tape), and collaborative tools (BNLBox, RCF email, MatterMost, web services) will be unavailable throughout the respective maintenance periods.
    - Batch Job Scheduling: Scheduling of new HTCondor and Slurm batch jobs will cease on Friday, December 15th, at 11 PM EST. Any remaining HTCondor jobs will be terminated on Monday, December 18th, at 3 PM.
    - Interactive Sessions: Open sessions through the facility SSH gateways will be terminated at 5 PM on December 18th to facilitate the cloning/copying of user home directories.
    - Email Services: The @bnl.gov email services may experience interruptions. As an alternative, users are encouraged to use the sdcc-staff-l@lists.bnl.gov mailing list to contact SDCC staff.
    - NX Service and SSH Gateway: The NX service and SSH Gateway will be unavailable. Existing cssh sessions initiated before the outage will continue to function.
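
    The batch cutoffs listed above can be turned into a quick check of whether a job is likely to finish in time. The following is only a minimal sketch: the function name `job_fits_before_downtime` is hypothetical, the two timestamps are copied from this notice, and the 3 PM termination time is assumed to be EST like the others. The notice states a termination time only for HTCondor jobs, so the check does not cover Slurm.

```python
from datetime import datetime, timedelta, timezone

# EST is UTC-5; the notice gives its times in EST.
EST = timezone(timedelta(hours=-5), "EST")

# Cutoffs copied from the notice (the 3 PM time is assumed to be EST).
SCHEDULING_STOPS = datetime(2023, 12, 15, 23, 0, tzinfo=EST)
HTCONDOR_JOBS_TERMINATED = datetime(2023, 12, 18, 15, 0, tzinfo=EST)


def job_fits_before_downtime(start: datetime, runtime: timedelta) -> bool:
    """Return True if a job starting at `start` can still be scheduled and is
    expected to finish before remaining HTCondor jobs are terminated."""
    if start >= SCHEDULING_STOPS:
        return False  # scheduling of new jobs has already ceased
    return start + runtime <= HTCONDOR_JOBS_TERMINATED
```

    For example, under these assumptions a 48-hour job starting at noon on December 15th fits, while a 72-hour job starting at 10 PM that evening does not.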

    Restoration of Services:
    Complete restoration of services is anticipated by 9 PM EST on December 19th. The SDCC will notify the community through its regular mailing lists.
    We apologize for the inconvenience this may cause and thank you for your understanding and cooperation as we complete this essential upgrade to our infrastructure.

  • SDCC-wide downtime on Tuesday, December 19th

    By SDCC News |

    Duration:
    12/19/2023 6:30 am — 12/19/2023 9:00 pm

    Group Responsible:
    Network

    Affected Area:
    All SDCC services

    Expected Impact:
    No access to SDCC services during downtime

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    On Tuesday, December 19th, from 6:30 am EST to 9 pm EST, network connectivity (both WAN and Campus) to/from the SDCC Facility and Science DMZ
    will be unavailable due to scheduled maintenance.

    This event marks the final phase of the years-long migration of SDCC services from the old data center to the new one. We anticipate the full restoration of
    network connectivity by 9 pm EST on December 19th, concluding the outage. During the intervention period, access to SDCC services, including computing
    resources (both interactive and batch), storage services (disk and tape), and collaborative tools (such as BNLBox, RCF email, MatterMost, and web services),
    will not be available.

    To prepare for this network outage, the scheduling of new HTCondor and Slurm batch jobs will cease on Friday, December 15th, at 11 pm EST. This pause
    will facilitate the smooth draining and graceful termination of ongoing computing jobs. In addition, access to storage and data transfer services will be
    temporarily halted on Monday, December 18th, at 5 pm EST.

    This outage does not affect BNL mail service (@bnl.gov domain), which will be available during this period.

    The SDCC will notify the community about the full restoration of services through its regular mailing lists.

  • SDCC Mattermost Upgrade

    By SDCC News |

    Duration:
    11/20/2023 6:00 pm — 11/20/2023 7:00 pm

    Group Responsible:
    Services & Tools

    Affected Area:
    SDCC Mattermost

    Expected Impact:
    Service will be suspended

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    The SDCC Mattermost service (chat.sdcc.bnl.gov) will be suspended on Monday, November 20, 2023, from 6 PM to 7 PM EST.

  • Reboot of SFTP and CFTP Servers on 10/13/2023

    By SDCC News |

    Duration:
    10/13/2023 12:30 pm — 10/13/2023 1:30 pm

    Group Responsible:
    IT Services

    Affected Area:
    SFTP Services

    Expected Impact:
    Access will be unavailable and current sessions will be ended

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    Due to a security vulnerability in Red Hat Linux, the SFTP and CFTP servers will be updated and rebooted between 12:30 PM EST and 1:00 PM EST today. Existing connections will be disconnected during the reboots.

  • DTN03/IC Globus maintenance on 10/03/23

    By SDCC News |

    Duration:
    10/3/2023 1:30 pm — 10/3/2023 3:30 pm

    Group Responsible:
    IT Services

    Affected Area:
    Globus

    Expected Impact:
    Access will be unavailable and current sessions will be ended

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    DTN03/IC Globus maintenance on 10/03/23: the server will be down for approximately 1-2 hours while it is relocated within the data center.

  • rcas2069-2076 Shutdown

    By SDCC News |

    Duration:
    6/6/2023 6:00 am

    Group Responsible:
    IT Fabric

    Affected Area:
    PHENIX interactive nodes

    Expected Impact:
    A portion of the PHENIX interactive nodes will be unavailable

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    A portion of the PHENIX interactive nodes, rcas2069-2076, will be shut down and repurposed/renamed from PHENIX to sPHENIX nodes on Tuesday 6/6 at 10:00 AM. Any locally stored data on these nodes (files in /home, /tmp, and /var/tmp) will be destroyed as part of the system rebuild process. Please be sure to log out and transfer any local files elsewhere before the scheduled shutdown. rcas2061-2068 will remain available.
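
    To help find files that would be lost in the rebuild, a user could list everything they own under the affected local directories before the shutdown. The following is a minimal sketch, not an SDCC-provided tool: the helper name `files_owned_by` is hypothetical, and it assumes a Linux node with Python 3.9 or newer available.

```python
import os
from pathlib import Path


def files_owned_by(uid: int, roots: list[str]) -> list[Path]:
    """Return all regular directory entries under `roots` owned by `uid`.

    Directories that cannot be read are skipped silently (os.walk default).
    """
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = Path(dirpath) / name
                try:
                    # lstat() does not follow symlinks, so links are
                    # attributed to whoever owns the link itself.
                    if path.lstat().st_uid == uid:
                        found.append(path)
                except OSError:
                    continue  # file vanished or is not accessible
    return found
```

    Calling `files_owned_by(os.getuid(), ["/tmp", "/var/tmp"])` on one of the affected nodes would return the current user's files in those two directories; the resulting list can then be reviewed and copied elsewhere with whatever transfer tool the user normally uses.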