use or misuse of LSF queues ??

From: Flemming Videbaek (videbaek@sgs1.hirg.bnl.gov)
Date: Mon Apr 01 2002 - 14:37:45 EST

    It has been described recently how to use the LSF system to submit jobs. This does not mean it
    is necessarily reasonable to fill the queues with apparently CPU-intensive jobs.
    
    Let me explain why:
    
    The brahms data+user disk is served by a single SUN that is capable of delivering at most ~35 Mb/sec.
    If the CAS nodes are all loaded with jobs that attempt to read from the data disks at a high rate (> 0.9 Mb/sec),
    one will get a very sluggish response (as is the case presently).
    It is conceivable that we can get a separate server machine for e.g. /brahms/u + a subset of the disks, but this
    will take a while (4-6 months).
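
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It uses only the two figures quoted in this message (about 35 Mb/sec for the server, about 0.9 Mb/sec per reading job) and keeps the units as written. The job count of 100 is a made-up value for illustration, not a measured load, and the model assumes the bandwidth is simply shared evenly, which is surely too optimistic once the disks start seeking.

```python
# Rough estimate of when the single disk server saturates.
# The two rates are the figures quoted in the message; units as written there.
SERVER_MAX_RATE = 35.0  # approximate ceiling of the single SUN server
PER_JOB_RATE = 0.9      # read rate of one I/O-heavy job


def jobs_to_saturate(server_rate, per_job_rate):
    """Number of concurrent readers at which the server ceiling is reached."""
    return server_rate / per_job_rate


def per_job_share(server_rate, n_jobs):
    """Bandwidth each job gets if the server is shared evenly (an idealization)."""
    return server_rate / n_jobs


if __name__ == "__main__":
    limit = jobs_to_saturate(SERVER_MAX_RATE, PER_JOB_RATE)
    print(f"server saturates at about {limit:.1f} concurrent reading jobs")

    # Hypothetical load, chosen only to illustrate the effect.
    n_jobs = 100
    share = per_job_share(SERVER_MAX_RATE, n_jobs)
    print(f"with {n_jobs} jobs each gets at most {share:.2f}")
```

So roughly 39 such jobs are enough to use up the whole server, and anything beyond that, including interactive compiles and links, competes for the same bandwidth.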
    
    Although the jobs have been niced, this does not help, and the time to compile/link is about a factor of 5-100 worse
    than when no reading load is present. I started a linkage of brag about 20 minutes ago and it is still not complete.
    
    I do not immediately know how to address this, other than to look into:
    a) dividing the pool of rcas machines into an LSF pool and an interactive one
    b) making additional queues with different characteristics, like
       io (fast): small cpu time, high bandwidth, max one per machine (except for 0-4)
       cpu intensive (e.g. simulations)
    c) rcas005 will definitely go out of the LSF queues; it is the database machine.
    
    Until then I will appeal to people's common sense not to load the system completely - the impact on interactive use
    is too much.
    
    
    
    
    
    ------------------------------------------------------
    Flemming Videbaek
    Physics Department
    Brookhaven National Laboratory
    
    tlf: 631-344-4106
    fax 631-344-1334
    e-mail: videbaek@bnl.gov
    


