• HPSS Upgrade

    By SDCC News | Thu, 12/01/2022 - 11:15

    Duration:
    12/19/2022 12:00 am — 12/20/2022 9:00 pm

    Group Responsible:
    IT Fabric

    Affected Area:
    HPSS Service

    Expected Impact:
    Files on tape will be inaccessible during the upgrade.

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    HPSS will be down for a system upgrade from midnight on Monday 12/19 through 9:00 PM on Tuesday 12/20. During the upgrade, it will not be possible to write files to or retrieve files from the HPSS system.

  • Gitea Upgrade

    By SDCC News | Thu, 10/20/2022 - 09:23

    Duration:
    10/26/2022 11:00 am — 10/26/2022 11:30 am

    Group Responsible:
    Services & Tools

    Affected Area:
    Gitea

    Expected Impact:
    Service Unavailable

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    The Gitea service will be unavailable while it is upgraded to v1.17.3.

  • NoMachine/NX Service Update

    By SDCC News | Fri, 10/07/2022 - 18:15

    Duration:
    10/12/2022 11:00 am — 10/12/2022 1:00 pm

    Group Responsible:
    IT Services

    Affected Area:
    NX Service

    Expected Impact:
    NX sessions on xterm and nxcampus servers will be terminated

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    The NX sessions on xterm and nxcampus servers will be terminated. Please save your work.

  • Interactive Farm Node Maintenance

    By SDCC News | Tue, 08/02/2022 - 08:49

    Duration:
    8/4/2022 10:00 am — 8/8/2022 12:00 pm

    Group Responsible:
    IT Fabric

    Affected Area:
    Processor Farm

    Expected Impact:
    Unable to log in, or sessions terminated, on affected nodes

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    The SDCC's interactive and Jupyter farm nodes will be shut down for maintenance according to the following schedule:

    Thursday 8/4 (10:00 AM - 12:00 PM)
    rcas6001-6002, rcas6005-6010
    rcas2061-2068
    eic0101-0106
    astro0101
    sphnx01
    spar0101-0104
    jupyter10-11

    Monday 8/8 (10:00 AM - 12:00 PM)
    rcas6003-6004, rcas6011-6016
    rcas2069-2076
    eic0107-0112
    astro0104
    sphnx02
    spar0105-0108
    jupyter12-13

    Please log out of the affected hosts before their scheduled shutdown.
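    To check whether a particular host appears in the schedule above, the host ranges can be expanded programmatically. The following is a minimal sketch, not an SDCC tool: the `SCHEDULE` dictionary, `expand`, and `shutdown_day` are hypothetical names, and it assumes that a range such as rcas6001-6002 denotes every zero-padded number from the first to the last value, inclusive.

```python
import re

# Host lists transcribed from the announcement; the dictionary layout is an assumption.
SCHEDULE = {
    "Thursday 8/4": "rcas6001-6002, rcas6005-6010, rcas2061-2068, eic0101-0106, "
                    "astro0101, sphnx01, spar0101-0104, jupyter10-11",
    "Monday 8/8": "rcas6003-6004, rcas6011-6016, rcas2069-2076, eic0107-0112, "
                  "astro0104, sphnx02, spar0105-0108, jupyter12-13",
}


def expand(spec):
    """Expand 'rcas6001-6002' into individual host names; single hosts pass through."""
    m = re.fullmatch(r"([a-z]+)(\d+)(?:-(\d+))?", spec.strip())
    if m is None:
        raise ValueError(f"unrecognized host spec: {spec!r}")
    prefix, first, last = m.groups()
    last = last or first          # a single host is a range of length one
    width = len(first)            # keep the zero padding of the original name
    return [f"{prefix}{n:0{width}d}" for n in range(int(first), int(last) + 1)]


def shutdown_day(host):
    """Return the scheduled day for a host, or None if it is not listed."""
    for day, specs in SCHEDULE.items():
        for spec in specs.split(","):
            if host in expand(spec):
                return day
    return None
```

    For example, `shutdown_day("rcas6007")` looks up the day on which rcas6007 is scheduled, and a host that is not listed returns `None`.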

  • Mattermost Upgrade

    By SDCC News | Wed, 07/20/2022 - 05:26

    Duration:
    7/20/2022 9:00 pm — 7/20/2022 9:30 pm

    Group Responsible:
    IT Services

    Affected Area:
    Mattermost

    Expected Impact:
    Mattermost will be unavailable during the maintenance

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    Mattermost will be upgraded to the latest version (7.1) and will be unavailable during the upgrade. Collapsed reply threads will be enabled for all users. Users can disable collapsed reply threads by navigating to Settings -> Display -> Collapsed reply threads.

  • BNL Box Upgrade

    By SDCC News | Wed, 06/01/2022 - 07:20

    Duration:
    6/1/2022 2:00 pm — 6/1/2022 4:00 pm

    Group Responsible:
    Services & Tools

    Affected Area:
    BNL Box

    Expected Impact:
    Short and intermittent service interruption

    Maintenance Type:
    Transparent Upgrade/Maintenance

    Description:
    Update to Nextcloud 24.0.1

  • New Facility Risk Assessment Document posted to sdcc.bnl.gov

    By SDCC News | Fri, 05/27/2022 - 07:06

    Duration:
    5/27/2022 11:00 am

    Group Responsible:
    IT Fabric

    Affected Area:
    Building 725

    Expected Impact:
    Occupational safety directly related to the facility

    Maintenance Type:
    Information

    Description:
    The Facility Risk Assessment (FRA) document serves as a benchmark for understanding the risks associated with working in the SDCC in Building 725. The FRA is a useful tool for planning work and for learning about the hazards associated with the SDCC in Building 725. The FRA has been posted to the Staff-Only section of the SDCC website. SDCC staff can find it here: https://www.sdcc.bnl.gov/staff-only/.

  • US-ATLAS dCache Maintenance

    By SDCC News | Fri, 05/13/2022 - 11:56

    Duration:
    5/17/2022 9:00 am — 5/17/2022 1:00 pm

    Group Responsible:
    Services & Tools

    Affected Area:
    US-ATLAS dCache

    Expected Impact:
    dCache storage not accessible

    Maintenance Type:
    Planned Maintenance/Downtime

    Description:
    The US-ATLAS dCache storage element will be upgraded to the latest dCache version (7.2.16). The ATLAS dCache service hosted at BNL will not be accessible during the upgrade. Anticipated downtime is less than 4 hours.

  • Upgrade of SDCC Globus endpoint on May 31st 2022

    By SDCC News | Tue, 05/03/2022 - 11:02

    Duration:
    5/31/2022 9:00 am — 5/31/2022 9:00 am

    Group Responsible:
    IT Services

    Affected Area:
    Globus File Transfers

    Expected Impact:
    Use of new Globus procedures

    Maintenance Type:
    Transparent Upgrade/Maintenance

    Description:
    The SDCC Globus endpoint is moving to version 5.4 and will require MFA for all file transfers. Two SDCC endpoints are currently available, but after May 31st the old SDCC endpoint (d6ae63d8-503f-11e9-a620-0a54e005f950) will be retired and deleted. Please test and use the new SDCC endpoint (12782fb1-a599-4f18-b0fb-2e849681e214) for all file transfers.

    See documentation for more details.

    https://www.sdcc.bnl.gov/information/services/globus-file-transfer-0

  • Datacenter CDCE Room Cooling Failure

    By SDCC News | Wed, 04/28/2021 - 11:36

    Tue Jun 15 16:41:34 EDT 2021

    This item has been posted to rhic-rcf-l@lists.bnl.gov, bnl-shared-tier3-l@lists.bnl.gov

    There was a major cooling failure in the CDCE room in SDCC's datacenter earlier today (6/15), starting around 12:30 PM EDT, due to an issue with the chilled water system in the building. Temperatures rose quickly, and automated monitoring software shut down the compute nodes in that room around 1:00 PM to avoid equipment damage. This affected all ATLAS T1 compute nodes and a large portion of the shared pool (all spool0XYZ systems). Parts of our RHEV system were also affected. The issue with the building chilled water circulation was repaired by approximately 3:00 PM. The farm equipment was then powered back on and opened to jobs after the room temperature stabilized at 3:30 PM.

    At this time we believe all affected services have been restored. If you continue to experience issues, please submit a ticket to RT.

    Chris Hollowell (hollowec@bnl.gov)