[Brahms-dev-l] brahms network/disk IO problem.

From: Hironori Ito <hito@rcf.rhic.bnl.gov>
Date: Wed Oct 26 2005 - 12:17:27 EDT
Hello.  I recently found that our network traffic is really slow.  
(Well, I have thought for a long time that our IO is really slow, but I 
notice it more now.)   After looking into this problem more carefully 
with RCF staff, we found the cause.  It is caused by a traffic jam in 
the RCF network.  Here is the current network layout.

1.  Our machines are on 2 subnets: 180 (newer machines with 1Gb/s 
Ethernet) and 200 (older machines with 100Mb/s Ethernet). 
2.  Our data (rmine and panasis) is on subnet 200.
3.  Between the two subnets, there are several switches, which also 
carry the traffic of other experiments.
4.  The switches are connected to each other by 5Gb/s links.
5.  Now, each group has started to use one of the following methods to 
access the vast amount of data on local disks.
    a.  dCache by PHENIX.
    b.  xrootd by STAR.
    c.  rootd by PHOBOS and BRAHMS.

As you can see, when someone in BRAHMS tries to access data from a 
newer machine, the traffic goes through several network links, any of 
which can be saturated by another experiment.  Now, with the use of 
local disks by one of the above methods, the probability of this problem 
happening has gone up sharply.  For example, for the last week, one 
person in PHENIX using dCache was saturating one (or more) of the 5Gb/s 
links between two switches.  This resulted in very bad IO wait on BRAHMS 
machines (and possibly those of other experiments).   This problem will 
get even worse when the generic queues on RCF machines become available 
(soon).  My preferred solution is to isolate our machines and disks from 
the other experiments, but this does not seem to be that easy (for 
RCF/ITD).   Therefore, in the short term, it seems that we will keep 
suffering from this problem.  (NOTE:  It is supposedly possible to 
increase the bandwidth between switches from 5Gb/s to 8Gb/s, but that is 
not going to solve the fundamental problem of bad organization.  This 
sounds like the traffic problem in New York City.  I guess we need an 
express lane. :) )
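
The bottleneck argument above can be sketched in a few lines of Python.  
This is only an illustration, not a measurement: the function name, the 
hop lists and all of the competing-load figures are made up.  Only the 
1Gb/s, 5Gb/s and (possible) 8Gb/s capacities come from the description 
above.

```python
def available_bandwidth(path_links):
    """Bandwidth (Gb/s) left for a new transfer on a multi-hop path.

    path_links is a list of (capacity_gbps, competing_load_gbps) pairs,
    one per hop.  The transfer can go no faster than the hop with the
    least spare capacity, and never below zero.
    """
    return min(max(capacity - load, 0.0) for capacity, load in path_links)


# Hypothetical path from a newer (1Gb/s) machine to the data:
# the machine's own link, then two 5Gb/s inter-switch links.
# All load values are assumptions chosen for illustration.
quiet = [(1.0, 0.0), (5.0, 1.0), (5.0, 2.0)]
saturated = [(1.0, 0.0), (5.0, 1.0), (5.0, 5.0)]  # one link filled by another experiment

print(available_bandwidth(quiet))      # 1.0, limited only by the 1Gb/s machine link
print(available_bandwidth(saturated))  # 0.0, the shared link leaves nothing for us

# Raising that link to 8Gb/s helps only until the competing load grows to match.
upgraded_same_load = [(1.0, 0.0), (5.0, 1.0), (8.0, 5.0)]
upgraded_more_load = [(1.0, 0.0), (5.0, 1.0), (8.0, 8.0)]
print(available_bandwidth(upgraded_same_load))  # 1.0
print(available_bandwidth(upgraded_more_load))  # 0.0
```

In this toy model a single saturated shared hop is enough to starve the 
whole path, which is why isolating the path matters more than raising 
the capacity of one link.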

Hiro