A recent message described how to use the LSF system to submit jobs. That does not mean it is necessarily reasonable to fill the queues with apparently CPU-intensive jobs. Let me explain why.

The brahms data+user disk is served by a single SUN that is capable of delivering at most ~35 Mb/sec. If the CAS nodes are all loaded with jobs that attempt to read from the data disks at a high rate (> .9 Mb/sec), one gets a very sluggish response, as is the case at present. It is conceivable that we can get a separate server machine for e.g. /brahms/u + a subset of the disks, but this will take a while (4-6 months).

Although the jobs have been niced, this does not help, and the time to compile/link is about a factor 5-100 worse than when no read load is present. I started a link of brag about 20 minutes ago and it is still not complete.

I do not immediately know how to address this other than to look into:

a) dividing the pool of rcas machines into an LSF pool and an interactive one
b) making additional queues with different characteristics, like
     io (fast): small cpu time, high bandwidth, max one per machine (except for 0-4)
     cpu intensive (e.g. simulations)
c) rcas005 will definitely go out of the LSF queues; it is the database machine.

Until then I appeal to people's common sense not to load the system completely - the impact on interactive use is too much.

------------------------------------------------------
Flemming Videbaek
Physics Department
Brookhaven National Laboratory

tlf: 631-344-4106
fax: 631-344-1334
e-mail: videbaek@bnl.gov