Documentation for various services and software provided by the SDCC

Software documentation


Git Services

The SDCC provides Git services for use by the experiments.

Astro

  • URL: https://git.racf.bnl.gov/astro/cgit/
    • Authentication is required using your BNL RHIC login account name and password; members of the astro group have read/write access.
  • Back-end server: astrogit.rcf.bnl.gov
    • Authorized users can log in via ssh to the back-end server to manage the git repositories.
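For example, an authorized user can connect to the back-end server as follows (the user name is a placeholder):

ssh <username>@astrogit.rcf.bnl.gov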

PHENIX

  • URL: 

Accessing clusters through the gateway

Steps for cluster access using SDCC gateways:

Enable SSH agent forwarding, and use your uploaded SSH key to log into either of the following SDCC gateways:

  • ssh01.sdcc.bnl.gov
  • ssh02.sdcc.bnl.gov
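For example, with agent forwarding enabled (the user name is a placeholder):

ssh -A <username>@ssh01.sdcc.bnl.gov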

Institutional Cluster and Sky Cluster access

From a gateway, SSH to either of the following institutional cluster submit nodes:

  • icsubmit01.sdcc.bnl.gov
  • icsubmit02.sdcc.bnl.gov
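For example (the user name is a placeholder):

# From a gateway:
ssh icsubmit01.sdcc.bnl.gov

# Or, from your own machine, hop through a gateway in one step with OpenSSH's ProxyJump option:
ssh -J <username>@ssh01.sdcc.bnl.gov <username>@icsubmit01.sdcc.bnl.gov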

The cluster has a debug partition intended for debugging and interactive sessions.
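For example, a minimal way to start an interactive shell on the debug partition (resource and time limits depend on your allocation):

srun -p debug --pty bash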

Obtaining cluster allocation

To use SDCC resources, you will need a valid allocation.

To obtain a valid allocation, you need one of the following:

  • Access to an allocation of computational or storage resources as a member of a project account
  • A commercial allocation, obtained by contacting CSI
  • A free allocation, obtained through PASS
    • Users wishing to access SDCC resources must first submit a proposal through the BNL PASS (Proposals, Allocation, Safety, Scheduling) system.

Sky Cluster

Cluster Information:

The cluster consists of:

  • 64 worker nodes

Worker node details:

  • Dell PowerEdge R640
  • 2 x Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
  • NUMA node0 CPU(s):0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34
  • NUMA node1 CPU(s):1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35
  • Thread(s) per core: 1
  • Core(s) per socket: 18
  • Socket(s): 2
  • NUMA node(s): 2
  • 192 GB Memory
  • InfiniBand EDR connectivity

Partitions:

KNL Cluster

Cluster Information:

The cluster consists of:

  • 142 worker nodes
  • 2 submit nodes
  • 2 master nodes

Worker node details:

  • KOI S7200AP
  • 1 Intel(R) Xeon Phi(TM) CPU 7230 @ 1.30GHz
  • NUMA node0 CPU(s): 0-255
  • Thread(s) per core: 4
  • Core(s) per socket: 64
  • Socket(s): 1
  • NUMA node(s): 1
  • 192 GB Memory
  • Dual Rail OmniPath (Gen1) connectivity

Storage:


Partitions:

Institutional Cluster

About the Institutional Cluster (IC) at the SDCC.

Prerequisites:

  • Have a valid account with the SDCC
  • Have a valid account in Slurm (see the example check after this list)
    • Your liaison should contact us with your name or user ID via the ticketing system
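Once your account has been added, one way to confirm it exists in Slurm is to run the following from a submit node (assuming the Slurm client tools are on your path):

sacctmgr show user name=$USER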

Cluster Information:

The cluster consists of:

  • 216 worker nodes
  • 2 submit nodes
  • 2 master nodes

Worker node details:

Cluster Storage

Using the Institutional Cluster (IC), including information about central storage, backups, and transfers.

Currently, central storage for the Institutional Cluster (IC) is provided by IBM's Spectrum Scale file system (GPFS), a high-performance, clustered filesystem that supports full POSIX filesystem semantics.

If you have used our previous cluster, your files from 'nano' (including /home, /work, and /gscr) are now stored at /hpcgpfs01/cfn/. Please copy those files to your current home directory.
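A minimal sketch of such a copy, assuming your old files live in a directory named after your user ID (the exact layout under /hpcgpfs01/cfn/ may differ):

# Copy old 'nano' files into a subdirectory of your home directory
rsync -av /hpcgpfs01/cfn/$USER/ $HOME/nano-files/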

Choose GPU Types

There are two different types of GPUs in the Institutional Cluster. Each compute node has either 2 Tesla K80 or 2 Pascal P100 GPUs. Each K80 appears as 2 GPU devices, while each P100 appears as 1 GPU device.

To choose P100 nodes, use --constraint=pascal (or -C pascal).

To choose K80 nodes, use --constraint=tesla (or -C tesla).

Example job script using K80 GPUs:
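The following is a minimal sketch; the job name, time limit, and executable are placeholders, and partition/account options may be required depending on your allocation. It constrains the job to K80 (tesla) nodes and requests all 4 GPU devices (2 K80 cards, 2 devices each):

#!/bin/bash
#SBATCH -J k80-example         # Job name
#SBATCH -N 1                   # Request a single node
#SBATCH -C tesla               # Constrain the job to K80 (tesla) nodes
#SBATCH --gres=gpu:4           # Request all 4 GPU devices (2 K80 cards, 2 devices each)
#SBATCH -t 00:10:00            # Wall-clock time limit
#SBATCH -o hostname_%j.out     # File to which STDOUT will be written
#SBATCH -e hostname_%j.err     # File to which STDERR will be written

# List the GPUs visible to this job
nvidia-smi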

Using GPUs

To run jobs using GPUs, add the following option to your sbatch script or srun command line:

--gres=gpu:4

gres stands for generic resource; gpu is the name of the resource and 4 is the number of GPUs to be used (the general form is name[[:type]:count]).
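For example, an interactive one-liner (here using the pascal constraint described above; partition and account options depend on your allocation) that requests one GPU and lists it:

srun -C pascal --gres=gpu:1 nvidia-smi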

Job Output

Standard output (STDOUT) and standard error (STDERR) messages from your job are written directly to the output and error file names you specified in your batch script, or to the default output file (slurm-<jobid>.out) in your submit directory ($SLURM_SUBMIT_DIR). You can monitor them there during your run if you wish.

#SBATCH -o hostname_%j.out # File to which STDOUT will be written
#SBATCH -e hostname_%j.err # File to which STDERR will be written
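For example, assuming the default output file name and a placeholder job ID of 123456, you can follow the output of a running job from the submit directory with:

tail -f slurm-123456.out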