From: Djamel Ouerdane (ouerdane@nbi.dk)
Date: Wed Mar 26 2003 - 10:30:14 EST
> As far as I can tell, Ionut submitted the jobs with the option
> `-R ncpus' which will tell LSF to start the particular job on many
> CPUs. However, Ionut did not specify how many CPUs, so it defaults to
> some number, which I guess is 1-20. You can tell that from the output
> of `bsub -l <job#>'.
>
> Ionut, I'm not so sure that starting the job on several CPUs makes
> sense. I looked into your files `/brahms/u/aic/work/work3/run.sh' and
> `/brahms/u/aic/work/work3/calibrate.cpp' and it doesn't seem to be a
> parallel program. Hence, what will happen is that the same (exact
> same) job will be executed on 1-20 hosts, each overwriting the same
> output file `calibData_run00<xxxx>_seq<x>.root'. For the use of the
> `ncpus' resource to make sense, I think you have to use some sort of
> parallel execution environment, like PROOF or MLP, and your code must
> be written to use that environment. Your `calibrate.cpp' does not
> look like that.
>
> Perhaps a few words on what you're trying to do would help us out.

Meanwhile, some important jobs are pending. How can we work around
this situation?

Djam
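
One possible way around the overwriting described in the quoted message is to submit one serial job per run/sequence pair rather than asking for several CPUs for a single non-parallel job. The following is only a minimal Python sketch under assumptions: it supposes that `run.sh` accepts a run number and a sequence number as arguments (the quoted message does not confirm this), and the run numbers, job names and log file names are hypothetical. It only prints the `bsub` command lines and does not submit anything.

```python
import shlex


def build_bsub_command(run, seq, script="/brahms/u/aic/work/work3/run.sh"):
    """Return the argv list for one serial LSF job handling one (run, seq) pair."""
    job_name = f"calib_{run}_{seq}"  # hypothetical naming scheme
    return [
        "bsub",
        "-n", "1",                    # request exactly one job slot
        "-J", job_name,               # distinct job name per (run, seq)
        "-o", f"{job_name}.%J.log",   # %J is replaced by the LSF job ID
        script, str(run), str(seq),   # ASSUMPTION: run.sh takes run and seq as arguments
    ]


def build_all(pairs):
    """Build one bsub command per (run, seq) pair."""
    return [build_bsub_command(run, seq) for run, seq in pairs]


if __name__ == "__main__":
    # Hypothetical run/sequence numbers, for illustration only.
    for cmd in build_all([(1234, 0), (1234, 1)]):
        print(shlex.join(cmd))
```

Since each job gets its own run and sequence arguments, no two jobs should end up writing the same output file, provided the script derives the file name from those arguments.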