Reconstruction

The quick step-by-step procedure

The quick procedure to run the reconstruction on "all" the simulation files is:
  1. Log in as user bramreco on rcf.rhic.bnl.gov.
  2. Change to the script directory mdc1/scripts.
  3. If necessary, edit the EMAIL address in the auto_submit.pl script.
  4. Do an ls ../jsf to see how many times the files have been processed. There should be directory names like 4, 5, ... .
  5. Run the reco_all.pl script with one or more numbers larger than the numbers found in the previous step as command-line arguments (e.g. ./reco_all.pl 10).
The output files are stored in the directory mdc1/rdo/number on the HPSS file server.
The log files (stdout) are written to the directory mdc1/log/number.
The error files (stderr) are written to the directory mdc1/errlog/number.
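
For illustration, a complete session for reconstruction pass 10 might look like the following (the number 10 is only an example; use a number larger than any directory already listed in ../jsf). After logging in as bramreco on rcf.rhic.bnl.gov:
   cd mdc1/scripts
   ls ../jsf           # e.g. 4  5  ...  - pick a larger number
   ./reco_all.pl 10    # output to mdc1/rdo/10, logs to mdc1/log/10, errors to mdc1/errlog/10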

Job monitoring

The command crs_status.pl lists the states of the submitted jobs, while crs_node_status.pl lists the CRS nodes and the jobs assigned to them.
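
For example, a quick check of the farm (both commands are shown here without arguments, following the description above):
   crs_status.pl         # states of all submitted jobs
   crs_node_status.pl    # CRS nodes and the jobs assigned to them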

To see the size of the output files in one directory, use the command ftpeek.csh and give the name of the output directory or file as a command-line argument (e.g. mdc1/rdo/number).

To create a summary of the reconstruction jobs (showing the Geant job number, input file size, number of events processed, real time and CPU time used, and output file size), use the script summary.pl and give a list of log files as command-line arguments (e.g. log/number/*). The script needs a bit of time to process all the data.
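
For example, to inspect the results of reconstruction pass 10 (the pass number is illustrative):
   ftpeek.csh mdc1/rdo/10    # sizes of the output files
   summary.pl log/10/*       # per-job summary; this takes a while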

The mail sent from the CRS system to the bramreco account when a job ends (successfully or with a crash) is forwarded into the file mbox and can be read with the command mailx -f.

Killing a CRS job

A running CRS job can be killed using the script crs_kill_job.pl in /u0b/throwe/mdc1/crs/bin. The argument to the script is the job id (job_nnnnnnnnn_n) shown by crs_node_status.pl.
It may take a while before the job disappears, because it has to clean up the CRS node before it finishes. Do not try to kill a job twice, as this will kill the clean-up process; in that case, ask Tom Throwe for help.
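
For example (the job id below is purely illustrative; use the id reported by crs_node_status.pl):
   /u0b/throwe/mdc1/crs/bin/crs_kill_job.pl job_000123456_1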

The real step-by-step procedure

The step-by-step procedure for running reconstruction jobs on the Central Reconstruction Server (CRS) Farm is:
  1. Log in as user bramreco on rcf.rhic.bnl.gov.
  2. Change to the script directory mdc1/scripts.
  3. If necessary, edit the EMAIL address in the auto_submit.pl script.
  4. Create (or edit) job specification file(s) that describe(s) the input and output files you want.
  5. Run the make_crsjf.pl script to create one or more CRS job file(s) and optionally submit the job(s).
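
A typical session might look like this (the file name myjob.jsf and its location are illustrative; the job specification format and the make_crsjf.pl switches are described in the following sections):
   cd mdc1/scripts
   vi myjob.jsf                    # write the job specification file (any editor will do)
   ./make_crsjf.pl -s 1 myjob.jsf  # generate the CRS job file(s) and submit the first one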

The job specification file

The job specification file is a plain text file that contains lines of the form:
   key=value
  
Everything after a # on a line is considered a comment. The valid keys, valid values and the default values are given in the table below.
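
For example, lines with trailing comments (the values shown are the defaults from the table below):
   inputnumstreams=1         # one input stream
   inputstreamtype[0]=HPSS   # read the input from HPSS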

Valid keys           Valid values            Default value
-------------------  ----------------------  ------------------------------------
inputnumstreams      1                       1
inputstreamtype[0]   HPSS,UNIX               HPSS
inputdir[0]          (a)
inputfile[0]         (a,b)
outputnumstreams     1                       1
outputstreamtype[0]  HPSS,UNIX,OBJECTIVITY   HPSS
outputdir[0]         (a)
outputfile[0]        (a,b)                   (c)
outputsuffix[0]      (d)
stdoutdir            (a)                     /brahms/u/bramreco/mdc1/log
stdout               (a,b)                   (c)
stderrdir            (a)                     /brahms/u/bramreco/mdc1/errlog
stderr               (a,b)                   (c)
mergefactor          1                       1
notify                                       bramreco@rcf.rhic.bnl.gov
executable                                   /brahms/u/bramreco/mdc1/bin/bramreco
executableargs                               (e)

Notes:
  (a) File names and relative directory names are individually restricted to 255 characters. Valid characters are the lower-case letters (a-z), the digits (0-9), the underscore (_) and the period (.).
  (b) A comma-separated list of file names.
  (c) If no list is given, a list is generated from the list of input files. For each input file the corresponding output file is given the name inputfilename.suffix, where suffix is outputsuffix[0] for outputfile[0], .stdout for stdout, and .stderr for stderr.
  (d) A suffix consists of an optional leading period (.) followed by at least 1 and at most 253 characters from the set a-z, 0-9 and underscore (_).
  (e) A list of GEANT run numbers extracted from the names of the input files.
An example of a minimal job specification file used for one GBRAHMS file:
   inputdir[0]=mdc1/gbrahms_output
   inputfile[0]=sim_138.cdat
   outputdir[0]=mdc1/dst
   outputfile[0]=dst_138.root
  
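To illustrate the default output naming of note (c), suppose the output file name had been left out and only a suffix given (a hypothetical variation of the example above):
   inputdir[0]=mdc1/gbrahms_output
   inputfile[0]=sim_138.cdat
   outputdir[0]=mdc1/dst
   outputsuffix[0]=root
In that case the output file would presumably be named sim_138.cdat.root, and the stdout and stderr files sim_138.cdat.stdout and sim_138.cdat.stderr in the default log and errlog directories.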

The CRS Job File generator

The script make_crsjf.pl generates and optionally submits CRS Job File(s). The script is invoked like this:
   make_crsjf.pl [switches] [--]  file...
  
where the switches and arguments are:
    -d       detailed output for debugging
    -s num   number of jobs to submit
    -u       use GMT for timestamp
    file...  name(s) of job specification file(s)
  
The generated CRS Job files have names crsjf_nnn, where nnn is a three-digit number (001, 002, ...), and are put in the directory mdc1/crsjf/todo.
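
For example, to generate job files from two specification files and submit the first five of them right away (the file names are illustrative):
   ./make_crsjf.pl -s 5 dst_138.jsf dst_139.jsf
The remaining job files stay in mdc1/crsjf/todo until auto_submit.pl picks them up, as described below.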

A summary of each generated job file is written to crsjf_database.txt in mdc1/crsjf. make_crsjf.pl reads this file at startup to find the number of the last job generated previously, so that job numbering continues from there.

If the submit option (-s) is used, up to num CRS job files are submitted immediately; the auto_submit.pl script submits the remaining jobs. When the bramreco account receives e-mail, the mail is forwarded to the auto_submit.pl script, which picks the (alphabetically) first job file in the todo directory, submits the job, and moves the job file to the archive directory.

To submit a job manually, use auto_submit.pl. (One can also submit a job with crs_submit.pl, but then the CRS job file has to be moved out of the todo directory to prevent auto_submit.pl from submitting the job a second time.)
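
A manual submission might then look like this (whether auto_submit.pl needs any arguments when run by hand is not documented here, so this is only an assumption):
   ./auto_submit.pl    # assumed invocation: submits the (alphabetically) first job file in mdc1/crsjf/todo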

The reconstruction code

The source code and Makefile for the reconstruction code are available in the directory mdc1/src.
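
To rebuild the executable, something like the following should work (assuming the Makefile's default target builds the bramreco executable, which is not documented here):
   cd mdc1/src
   make
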
Alv Kjetil Holme
Modified 30 September 1998