Hi Selemon,

The problem is with the files and/or the code used. When I look at /var/log/ROOT.log I see the errors quoted below -- i.e. this has nothing to do with condor. Try to run your session with just a few thousand events and get that to work before submitting. I also recommend you go to each node and kill the proofserv processes (three on rcas0041, two on the subsequent nodes).

Flemming

FL.fSi1bMult
May 4 22:09:58 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1cMult
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1dMult
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1eMult
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1fMult
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1gMult
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1aEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1bEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1cEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1dEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1eEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1fEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FL.fSi1gEta
May 4 22:09:59 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FS.fVertexFlag
May 4 22:10:00 rcas0041 proofslave[9165]: tigist:slave 0.0:Error:<TTree::SetBranchAddress>:unknown branch -> FFS.fVertexFlag
May 4 22:10:02 rcas0041 proofslave[9165]: !!!cleanup!!!
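A minimal sketch of the check these errors call for, assuming a small ROOT macro run over one of the input DST files: verify that a branch exists with GetBranch() before calling SetBranchAddress(), so a mismatched tree shows up in a quick local test instead of only in the proofslave syslog. The file name, tree name and leaf types below are assumptions; only the branch names are taken from the log above.

// checkBranches.C -- hedged sketch; file name, tree name and leaf types are
// assumptions, the branch names come from the proofslave errors above.
#include "TFile.h"
#include "TTree.h"
#include <iostream>

void checkBranches(const char* fname = "dst_run1234.root",  // hypothetical DST file
                   const char* tname = "BRAT")              // hypothetical tree name
{
   TFile* f = TFile::Open(fname);
   if (!f || f->IsZombie()) { std::cout << "cannot open " << fname << std::endl; return; }

   TTree* t = (TTree*)f->Get(tname);
   if (!t) { std::cout << "no tree '" << tname << "' in " << fname << std::endl; return; }

   Float_t si1bMult = 0;   // assumed type
   Int_t   vtxFlag  = 0;   // assumed type

   // Only hook up an address when the branch really exists in this tree;
   // otherwise report it here instead of letting every PROOF slave complain.
   if (t->GetBranch("FL.fSi1bMult"))
      t->SetBranchAddress("FL.fSi1bMult", &si1bMult);
   else
      std::cout << "branch FL.fSi1bMult not in this tree" << std::endl;

   if (t->GetBranch("FS.fVertexFlag"))
      t->SetBranchAddress("FS.fVertexFlag", &vtxFlag);
   else
      std::cout << "branch FS.fVertexFlag not in this tree -- t->Print() lists what is there" << std::endl;
}

If the branches are already missing in this local check, the selector (or the DST version it was written for) is what needs fixing, not condor.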
--------------------------------------------
Flemming Videbaek
Physics Department Bldg 510-D
Brookhaven National Laboratory
Upton, NY 11973
phone: 631-344-4106
cell: 631-681-1596
fax: 631-344-1334
e-mail: videbaek @ bnl gov

----- Original Message -----
From: "Bekele, Selemon" <bekeleku_at_ku.edu>
To: <brahms-dev-l_at_lists.bnl.gov>
Sent: Saturday, May 05, 2007 12:52 PM
Subject: [Brahms-dev-l] Proof sessions

> Hi,
>
> I have been trying to run proof sessions
> (6 centrality bins X 6 field settings = 30 sessions)
> with the master node on rcas0041. I run the sessions
> sequentially for each centrality from a shell script.
> Only the very first session has finished since 9:00 PM
> Friday night, and the second session is suspended, which
> means the subsequent runs could not be done.
>
> Doing
>
> rcas0041:> condor_status -claimed
>
> I see:
>
> vm1_at_rcas0041. LINUX INTEL 0.820 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0041. LINUX INTEL 0.870 claudius_at_bnl.gov rcas2065.rcf.bn
> vm1_at_rcas0042. LINUX INTEL 0.000 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0042. LINUX INTEL 0.000 claudius_at_bnl.gov rcas2065.rcf.bn
> vm1_at_rcas0043. LINUX INTEL 0.800 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0043. LINUX INTEL 0.820 claudius_at_bnl.gov rcas2065.rcf.bn
> vm1_at_rcas0044. LINUX INTEL 0.830 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0044. LINUX INTEL 0.860 claudius_at_bnl.gov rcas2065.rcf.bn
> vm1_at_rcas0045. LINUX INTEL 0.000 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0045. LINUX INTEL 0.000 claudius_at_bnl.gov rcas2065.rcf.bn
> vm1_at_rcas0046. LINUX INTEL 0.000 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0046. LINUX INTEL 0.000 claudius_at_bnl.gov rcas2065.rcf.bn
> vm1_at_rcas0047. LINUX INTEL 0.750 claudius_at_bnl.gov rcas2065.rcf.bn
> vm2_at_rcas0047. LINUX INTEL 0.710 claudius_at_bnl.gov rcas2065.rcf.bn
>
> It seems the proof sessions are suspended because, I think,
> someone is running CPU-intensive jobs on the BRAHMS rcas
> machines. I do not think changing to a different master node
> would help, as all the BRAHMS machines seem to be taken.
>
> Has anyone faced the same problem and found a quick solution,
> or do I just need to wait until the machines become free?
>
> Selemon,
>
> _______________________________________________
> Brahms-dev-l mailing list
> Brahms-dev-l_at_lists.bnl.gov
> https://lists.bnl.gov/mailman/listinfo/brahms-dev-l
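Flemming's advice to get a few thousand events working before launching all the sessions can be tried straight against the master mentioned above. The sketch below is only an illustration under assumed names: the tree name, DST path and selector are hypothetical, and the PROOF calls use the ROOT 5-era TProof::Open / TChain::SetProof interface, which may differ from the setup actually in use.

// smallTest.C -- hedged sketch: run the selector over a short chain first,
// locally or through the rcas0041 PROOF master, before submitting the
// full set of sessions.  Tree name, file path and selector are assumptions.
#include "TChain.h"
#include "TProof.h"

void smallTest(Bool_t useProof = kFALSE)
{
   TChain chain("BRAT");                          // assumed tree name
   chain.Add("/brahms/data/dst_run1234.root");    // hypothetical DST file

   if (useProof) {
      // ROOT 5-era PROOF interface: open a session on the master from the
      // thread and route TChain::Process through it.
      TProof* proof = TProof::Open("rcas0041");
      if (!proof) return;
      chain.SetProof();
   }

   // Only the first 5000 entries: enough to expose the unknown-branch
   // errors quickly without tying up the whole farm.
   chain.Process("CentralitySelector.C+", "", 5000);
}

If this short pass already produces the SetBranchAddress errors, the full sessions will fail the same way no matter how free the condor slots are.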
_______________________________________________
Brahms-dev-l mailing list
Brahms-dev-l_at_lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/brahms-dev-l

Received on Sat May 05 2007 - 13:04:21 EDT

This archive was generated by hypermail 2.2.0 : Sat May 05 2007 - 13:04:43 EDT