Hello.

I recently found that our network traffic is really slow. (Well, I have thought our IO was really slow for a long time, but I notice it more now.) After looking into this problem more carefully with the RCF staff, we found the cause. It is caused by a traffic jam in the RCF network.

Here is the current network layout.

1. Our machines are on 2 subnets: 180 (newer machines with 1Gb/s Ethernet) and 200 (older machines with 100Mb/s Ethernet).
2. Our data (rmine and panasis) is on 200.
3. Between the two subnets there are several switches, which are shared with other experiments.
4. The switches are connected to each other by 5Gb/s links.
5. Each group has now started to use one of the following methods to access the vast amount of local data disks:
   a. dCache by PHENIX
   b. xrootd by STAR
   c. rootd by PHOBOS and BRAHMS

As you can see, when someone in BRAHMS tries to access data from a newer machine, the traffic goes through several network links, any of which can be saturated by another experiment. Now that local disks are being used through one of the above methods, this problem has become far more likely. For example, for the last week, one person in PHENIX using dCache was saturating one (or more) of the 5Gb/s links between two switches. This resulted in very bad IO wait on BRAHMS machines (and possibly on those of other experiments). The problem will get even worse when the generic queues on RCF machines become available (soon).

My preferred solution is to isolate our machines and disks from the other experiments, but this does not seem to be that easy (for RCF/ITD). Therefore, in the short term, it seems that we will keep suffering from this problem. (NOTE: It is supposedly possible to increase the bandwidth between switches from 5Gb/s to 8Gb/s, but that is not going to solve the fundamental problem of bad organization. This sounds like the traffic problem in New York City. I guess we need an express lane. :) )

Hiro
_______________________________________________
Brahms-dev-l mailing list
Brahms-dev-l@lists.bnl.gov
http://lists.bnl.gov/mailman/listinfo/brahms-dev-l
Received on Wed Oct 26 12:17:58 2005
This archive was generated by hypermail 2.1.8 : Wed Oct 26 2005 - 12:18:10 EDT