Instructions on how a user can use CRS
(Note: If you find mistakes or problems with these instructions, please let me know: Nigel)
Introduction
CRS is the PHOBOS central reconstruction farm. It is primarily responsible for taking
the raw data and converting it into hits. Its secondary functions include running
simulations. The primary function ALWAYS has priority over any other function of
CRS.
Only one user, "phobreco", can submit jobs to CRS. For several people to share it,
a high degree of control must therefore be exercised over the users who want
to use it. Please do as requested in these instructions; otherwise you
could end up destroying the work of your colleagues and losing your privilege to use CRS.
Description
- Get permission to run on the CRS farm (contact nigel@kiwi.chm.bnl.gov
at least a full working day in advance). You should be allocated a number of nodes and
a priority. Every time you want to run on CRS you must get permission to do so.
(A page to manage this will be set up, but it is still being worked on!)
- Set up your own area (/phobos/u/<USERNAME>) for submitting jobs; see "How to set up jobs in your own area".
- To submit one or more jobs, log on to rsshgw.rhic.bnl.gov as "phobreco"
(obsolete: log on to rcf as "phobreco")
- Log on to rcrsuser as "phobreco", i.e. at the prompt type:
> ssh rcrsuser
- Go to /phobos/u/<USERNAME>/phobreco/input/jobfiles
- Type: submit_job_linux_queue.pl -l <USERNAME> -n <NumberNodesToUse> -p <Priority> -q <Queue Number>.<ENTER>
(obsolete: submit_job_linux.pl -l <USERNAME> -n <NumberNodesToUse> -p <Priority> .<ENTER>)
(obsolete: submit_job.pl -l <USERNAME> -n <NumberNodesToUse> -p <Priority> .<ENTER>)
This submits <NumberNodesToUse> of the jobfiles in the subdirectory /submit to the
CRS farm and sends email to <USERNAME>@rcf.rhic.bnl.gov to let you know
they have been submitted. It also moves each submitted jobfile to /submitted.
(When each job is finished, it takes the next jobfile in /submit and submits it, and so on.)
Note: Remove any backup files (jobfilename~) from the submit directory; otherwise it
will try to submit these as well.
Note: The <Priority> will be assigned to you. In most cases it will be 100.
Note: The <Queue Number>: There are 3 queues, and the queue number determines which set
of machines your job will run on, and hence their speed. Queue=3 has the fastest
machines, Queue=1 the slowest.
- To monitor the submission of your scripts, type: /usr/crs/bin/tk_CRS_status_awc.pl & <ENTER>
This pops up a window that shows the status of submitted jobs etc. (Note: You must keep
hitting refresh to update it.)
- When your jobs are finished, you will receive email informing you whether each job
completed successfully or not.
The next jobfile in /submit is then submitted, and so on until the /submit directory is empty.
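The note above about backup files can be dealt with before each submission using a small shell function. This is only a sketch: the name clean_submit_dir is made up for illustration and is not one of the CRS tools, and it is written for sh/bash rather than tcsh.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CRS tools): delete editor backup
# files (names ending in "~") from a submit directory, then list the
# jobfiles that remain and would therefore be submitted.
clean_submit_dir() {
    submit_dir="$1"
    if [ ! -d "$submit_dir" ]; then
        echo "clean_submit_dir: no such directory: $submit_dir" >&2
        return 1
    fi
    for f in "$submit_dir"/*~; do
        # If nothing matches, the pattern is left as-is; the -f test skips it.
        if [ -f "$f" ]; then
            rm -f -- "$f"
        fi
    done
    ls "$submit_dir"
}
```

For example, you could run clean_submit_dir /phobos/u/<USERNAME>/phobreco/input/jobfiles/submit and check the listing before typing the submit command.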
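How to set up jobs in your own area
- In your local user account (/phobos/u/<USERNAME>), set up the following directory
structure (these are the directories referred to elsewhere in these instructions):
phobreco/input/jobfiles/submit
phobreco/input/jobfiles/submitted
phobreco/input/macros
phobreco/input/scripts
phobreco/output/log
phobreco/output/err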
- The jobfile is the file that you submit to the farm to run your job (by typing:
submit_job.pl <USERNAME>, which in turn submits the jobfiles that are located in the
submit directory). It is the control script that gets everything set up to run on a CRS
node, i.e. it stages files on HPSS, tells the node what controller script to run, etc.
(To make jobfiles, you need to use the jobmanager; see the jobmanager description.)
A typical jobfile is shown below:
executable=/phobos/u/nigel/phobreco/input/scripts/controller_script
executableargs=2105,0
inputdir[0]=/phobos/data01/temp
inputfile[0]=PhoRaw002105s000.root
inputstreamtype[0]=UNIX
inputdir[1]=/phobos/u/nigel/phobreco/input/macros
inputfile[1]=Phat_env.C
inputstreamtype[1]=UNIX
inputdir[2]=/phobos/u/nigel/phobreco/input/macros
inputfile[2]=SiRawToHitModDefaults.dat
inputstreamtype[2]=UNIX
inputnumstreams=3
mergefactor=1
notify=phobreco@rcrsuser.rcf.bnl.gov
outputdir[0]=/phobos/data01/temp
outputfile[0]=PhoHit002105s000.root
outputstreamtype[0]=UNIX
outputnumstreams=1
stdoutdir=/phobos/u/nigel/phobreco/output/log
stdout=Logfile002105.out
stderrdir=/phobos/u/nigel/phobreco/output/err
stderr=Error002105.err
(obsolete: notify=phobreco@rcf.rhic.bnl.gov)
- Interpretation of jobfile: (Details about jobfiles)
executable=The controller script that is run on the node. It sets up
environment variables on that node and then executes the macro you want to run.
Example:
#! /bin/tcsh
eval `/phobos/common/bin/phobos_setup tcsh`
eval `/phobos/common/bin/phobos_alias tcsh`
setphat /phobos/u/nigel/Phat
phat -b -q "/phobos/u/nigel/phobreco/input/macros/sirawprocess_mod_batch.C($2,$3)"
executableargs=The arguments to be passed into the macro; the first one
is $2, the second is $3, etc.
in/outputdir[], in/outputfile[], in/outputstreamtype[]=Define the
in/output directory, filename, and file type (UNIX or HPSS)
in/outputnumstreams=Number of in/output streams
mergefactor=(unknown; set it to 1)
notify=The email address to which the node sends notice of whether the
job succeeded or failed when it completed
(This goes to phobreco, from where it is forwarded to the
address you specify when you type submit_job.pl <USERNAME>)
stdoutdir, stdout=Directory and filename of the file containing standard output
(i.e. what you print to the screen)
stderrdir, stderr=Directory and filename of the file containing standard error
(i.e. what you direct to stderr, e.g. with cerr<<)
- You must ensure the directories and their files have the following
group privileges, otherwise "phobreco" cannot use them:
/phobreco/output/err & /phobreco/output/log
are group readable, writable and executable
/phobreco/input/macros & /phobreco/input/scripts
are group readable and executable
/phobreco/input/jobfiles/submit & /phobreco/input/jobfiles/submitted
are group readable, writable and executable
Check this with the command ls -l and look at the privilege field, e.g. -rwxrwx---
shows group readable, writable and executable.
- You are set up once you have one or more jobfiles in /phobreco/input/jobfiles/submit, a
controller script in /phobreco/input/scripts, and one or more macros in /phobreco/input/macros.
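The directory layout and group privileges described above could be created in one go with something like the following. Treat it as a sketch under assumptions: the function name setup_phobreco_area is hypothetical, it is written for sh/bash rather than tcsh, and it assumes the intermediate directories also need group read and execute so that "phobreco" can reach the ones listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CRS tools): create the phobreco
# directory tree under a base directory and apply the group privileges
# listed in these instructions.
setup_phobreco_area() {
    base="$1"                      # e.g. /phobos/u/<USERNAME>
    root="$base/phobreco"
    mkdir -p "$root/input/jobfiles/submit" \
             "$root/input/jobfiles/submitted" \
             "$root/input/macros" \
             "$root/input/scripts" \
             "$root/output/log" \
             "$root/output/err" || return 1
    # Assumption: intermediate directories need group r-x to be traversed.
    chmod g+rx "$root" "$root/input" "$root/input/jobfiles" "$root/output"
    # Group readable, writable and executable.
    chmod g+rwx "$root/output/log" "$root/output/err" \
                "$root/input/jobfiles/submit" \
                "$root/input/jobfiles/submitted"
    # Group readable and executable.
    chmod g+rx "$root/input/macros" "$root/input/scripts"
}
```

After running it, ls -ld on each directory shows the privilege field to compare against the list above. Files copied in later (jobfiles, macros, controller scripts) still need their own group privileges set.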
Troubleshooting
This is difficult to do, but start simple, and work your way up to the complete macro
you want to run.
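One simple first check, in the spirit of starting simple, is to confirm that a jobfile is internally consistent before submitting it. The sketch below compares the inputnumstreams and outputnumstreams values with the number of inputfile[] and outputfile[] entries. The function name check_jobfile_streams is hypothetical, it is written for sh/bash rather than tcsh, and whether CRS itself performs such a check is not known here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CRS tools): check that the declared
# number of input/output streams in a jobfile matches the number of
# inputfile[N]= / outputfile[N]= lines it contains.
check_jobfile_streams() {
    jobfile="$1"
    rc=0
    for kind in input output; do
        # Value after "inputnumstreams=" or "outputnumstreams=".
        declared=$(sed -n "s/^${kind}numstreams=//p" "$jobfile")
        # Count lines such as "inputfile[0]=..." (grep -c prints 0 if none).
        actual=$(grep -c "^${kind}file\[[0-9][0-9]*]=" "$jobfile" || true)
        if [ "$declared" != "$actual" ]; then
            echo "$kind: numstreams=$declared but $actual ${kind}file entries"
            rc=1
        fi
    done
    return $rc
}
```

It prints nothing and returns 0 when the counts agree, so it can be run over every jobfile in the submit directory before submitting.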