Hi Steve,

On Fri, 29 Mar 2002 10:34:08 -0600
"Stephen J. Sanders" <ssanders@ku.edu> wrote
concerning "calibrations on rcas":
> Hi,
> I'm trying to get si/tile calibration done on rcas0019. Unfortunately,
> rcf has apparently implemented a time-out program that drops
> my ssh connection before I can complete a replay. I don't know if
> this is new, but I never encountered the problem before this week.
> So... I am trying to run bratmain in batch. My problem is
> that in the batch environment, bratmain can't find the shared
> library that contains my replay modules:
>
> Error in <TUnixSystem::DynamicPathName>: libmultReplay.so does not exist
> in .:/afs/rhic/opt/brahms/new/lib
>
> What needs to be done to point to a local library?

Make sure that you have the path to your libraries in the variable
Unix.*.Root.DynamicPath in the .rootrc file in your home directory or
in the directory where you execute bratmain.

What do you mean by batch? CRS, LSF, or `nice ... &'?

For CRS (and CRASH), the documentation outlines what you need to do:
you need to put your library in the same directory as your
configuration script. For LSF and `nice ... &', you need the entry in
the .rootrc file.

I highly recommend that you (all of you) use LSF for long second,
third, ladida analysis passes. Make a file like

#!/bin/sh
#BSUB -q brahms_cas
#BSUB -o <standard out output file>
#BSUB -e <standard error output file>
#BSUB -J <name of your job>

unset DISPLAY
bratmain <configuration script> [<options>]

Then you can submit this to the LSF queue (brahms_cas) with

bsub < <script name>

See also the man(1) pages bsub(1), bpeek(1), bjobs(1), bhosts(1), and
so on. There's some documentation at [1] - see in particular the
`Quick Start Guide' and the reference card at [2].
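Regarding the .rootrc entry mentioned above, here is a minimal sketch of how such a line could be produced. The helper name rootrc_entry and the example directory ${HOME}/lib are made up for illustration, and the default path is simply the one shown in the error message. If your .rootrc already has a Unix.*.Root.DynamicPath line, it is probably safer to edit that line by hand than to append a second one.

```shell
#!/bin/sh
# Hypothetical helper (not part of BRAT or ROOT): print a .rootrc line
# that adds a local library directory to the dynamic library search path.
# The trailing default path is copied from the error message in the question.
rootrc_entry() {
    libdir=$1
    printf 'Unix.*.Root.DynamicPath: .:%s:/afs/rhic/opt/brahms/new/lib\n' "$libdir"
}

# Print the line for an assumed library directory ${HOME}/lib
rootrc_entry "${HOME}/lib"
```

To use it, replace ${HOME}/lib by the directory that holds libmultReplay.so and redirect the output into the .rootrc file in your home directory or in the directory where bratmain is run.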

If you need to submit many jobs, I suggest you make a script like

#!/bin/sh

logdir=${HOME}/log
outdir=${HOME}/out
indir=${HOME}/in
config=${HOME}/config.C
user=`whoami`
runs="$*"

for run in $runs ; do
    # Write a temporary script
    cat > tmp.sh <<EOF
#!/bin/sh
#BSUB -q brahms_cas
#BSUB -J ${user}_$run
#BSUB -o ${logdir}/${run}.out
#BSUB -e ${logdir}/${run}.err

set -e
unset DISPLAY
bratmain ${config} \
    -r $run \
    -o ${outdir}/out${run}.root \
    -H ${outdir}/hist${run}.root \
    -i ${indir}/in${run}.root \
    -v 5
EOF
    # Submit the temporary script
    bsub < tmp.sh
    # Remove the temporary script
    rm -f tmp.sh
done

Then you can submit a number of runs to the queue by doing

./mylsfsubmit <runs>

The jobs will be queued and executed as soon as a processor on the CAS
machines is available. Note that the jobs run with a high nice value
(low priority), and will be preempted (pushed off the processor) if a
normal program is started by a user on the same CAS node - hence, the
use of LSF is `behaving nicely'. Each node in the CAS farm can run at
most two LSF jobs. LSF will automatically choose the fastest available
CPU for the job execution.

Note that /home/... is different on each machine. If you have loads of
disk I/O, then you may want to make a directory in /home/`whoami` and
write the output there, and then copy the resulting file to ${HOME}
when done:

#!/bin/sh
#BSUB -q brahms_cas
#BSUB -o <standard out output file>
#BSUB -e <standard error output file>
#BSUB -J <name of your job>

unset DISPLAY
if test ! -d /home/`whoami`/lsfwork ; then
    mkdir -p /home/`whoami`/lsfwork
fi
cd /home/`whoami`/lsfwork
bratmain <configuration script> [<options>]
cp <output> ${HOME}/out

I cannot stress enough how much I recommend this kind of batch
processing. LSF is a very clever piece of software that will most
probably do the job you need faster than anything else.

Yours,

Christian Holm Christensen -------------------------------------------
Address: Sankt Hansgade 23, 1. th.    Phone:  (+45) 35 35 96 91
         DK-2200 Copenhagen N         Cell:   (+45) 28 82 16 23
         Denmark                      Office: (+45) 353 25 305
Email:   cholm@nbi.dk                 Web:    www.nbi.dk/~cholm

[1] http://www.rhic.bnl.gov/RCF/UserInfo/Software/LSF/
[2] http://www.platform.com/services/support/docs/lsfdoc42/pdf/manuals/lsf_4.2_qrefcard.pdf

This archive was generated by hypermail 2b30 : Fri Mar 29 2002 - 12:12:50 EST