Please supply us with feedback, in particular if you are working actively with CRS.

------------------------------------------------------
Flemming Videbaek
Physics Department
Brookhaven National Laboratory

tlf: 631-344-4106
fax 631-344-1334
e-mail: videbaek@bnl.gov

----- Original Message -----
From: "Tony W. Chan" <tony@bnl.gov>
To: <mcbreen@bnl.gov>; <videbaek@bnl.gov>; <chujo@bnl.gov>; <momchil@bnl.gov>; <burt@bnl.gov>; <nigel@bnl.gov>; <jeromel@bnl.gov>; <messer@bnl.gov>
Cc: <throwe@bnl.gov>
Sent: Tuesday, November 13, 2001 2:10 PM
Subject: CRS batch software crashes

> Hi,                                          Nov. 13, 2001
>
> I think (not 100% sure) that the CRS batch software crashes
> are in part due to the doubling of the number of CRS nodes
> and the small amount (128 MB) of RAM on the CRS master node
> (rcrsfm). A newer system has been successfully tested over the
> last 2 weeks. The new system has a CPU that is 3x more powerful,
> 4x more memory and 2x more disk space, and we would like to
> replace rcrsfm with the new machine, such that the new machine
> preserves the name "rcrsfm", while the old machine will be used
> as a back-up and test bench.
>
> Accomplishing this change will require 1 or 2 system reboots
> of rcrsfm, which means that the users will experience 1 or 2
> disruptions of their usage of the CRS batch software.
>
> Here are the 2 choices:
> -----------------------
>
> 1) Move all 4 experiments to the new machine in one move.
> Likely downtime is about 1-2 hours, with disruptions affecting
> everyone simultaneously. Everyone has to agree on a time and
> day for this to happen.
>
> 2) Move one experiment at a time to the new machine (temporarily
> called "rcrsfm01") according to each experiment's schedule.
> Likely downtime is 30-60 minutes per experiment. After all 4 are
> on the new machine, one more downtime (30 minutes) is needed to
> make the name change to rcrsfm.
>
> While 2) is more complicated, it will allow us to check if my
> suspicions above are correct or not. Choice 1) is the simpler
> move, but if my suspicions are wrong, we will not be solving
> the crash problems completely, but we will probably improve
> software performance.
>
> Date and Time:
> --------------
>
> Wednesday/Thursday this week or Monday next week beginning
> at 9 am, for either choice 1) or 2).
>
> Please let me know your choice soon, so we can start making
> preparations. Thanks.
>
> Cheers,
>
> Tony
>
This archive was generated by hypermail 2b30 : Tue Nov 13 2001 - 14:39:45 EST