Hi Flemming et al,

On Tue, 5 Jun 2001 12:42:26 -0400
"Flemming Videbaek" <videbaek@sgs1.hirg.bnl.gov> wrote
concerning ": MainModule - What is a run?":

> Christian,
>
> It seems that you have defined a run to be the reading of one file
> in MainModule as well, as in the definition of the BrIOModule.

More precisely a "run-level", as I believe it says in the
documentation.

> When either an EOF is reached or #events reaches the limit, control
> transfers out of the event loop, calls End(), and then subsequently
> goes on to the next file.

That is correct.

> This is the correct behaviour if the next file is indeed a new run,
> but not if it is another sequence of the same run. In that case you
> really want just to continue reading and just go to End() when
> everything is read.

This would require that BrIOModule as well as BrMainModule had another
mode of operation.

> One reason here: e.g. for DB access you do not want this per seq
> file; for writing to DB you would certainly not want it on each
> sequence.

Yes, that's clear.

> One possible way of dealing with this is to add one more mode to the
> BrIOModule, e.g. kBrSeqFile, with the implication that the file is
> opened at Begin(), and on Eof() the next file in the list is opened.
> (The specific files should all be set by AddFile(), such that the
> IOModule should only deal with its list of files.)

That is indeed what I'd suggest.

> -- or
> the logic is built into the MainModule, but since it does not know
> anything about files it is not so nice.

I prefer your first alternative.

I was indeed aware of this problem when I wrote BrMainModule and
modified the standard BrModule methods (Init, Begin, Event, End,
Finish) of BrIOModule, but I didn't make an effort to solve it,
because:

  * When one is reading through raw data files, it should happen in
    some parallel environment like CRF. Here only one sequence is
    input per job, so there's no "multiple input problem".
  * Subsequently one will merge the output (one per sequence) of the
    pass over the raw files into one file, representing a full run.
  * Then, when doing additional analysis, each file will represent one
    run, and Begin/End does indeed correspond to run boundaries.

Of course there may be situations where any of this fails to work
(the reconstructed data cannot fit into one file, i.e., it hits the
2GB file limit, or one really wants to loop, in one job, over the
individual sequences), and so it would be nice to have the additional
functionality. However, back then I didn't consider it a paramount
concern.

Yours,

Christian  -----------------------------------------------------------
Holm Christensen             Phone:  (+45) 35 35 96 91
Sankt Hansgade 23, 1. th.    Office: (+45) 353 25 305
DK-2200 Copenhagen N         Web:    www.nbi.dk/~cholm
Denmark                      Email:  cholm@nbi.dk
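
For illustration only, the difference between the existing per-file
"run-level" and the proposed kBrSeqFile mode might be sketched roughly as
below. This is not BRAT code: SketchIOModule, IOMode, SeqFile, Counters and
RunJob are hypothetical stand-ins invented for this sketch, and the event
counts passed to AddFile() simply replace real file reading. The only things
taken from the discussion above are the idea of a sequence mode, the
AddFile() list, and the Begin/Event/End ordering.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins, NOT the real BRAT classes.
// kSeqFile mirrors the proposed kBrSeqFile mode.
enum class IOMode { kRunPerFile, kSeqFile };

struct SeqFile {
    std::string name;  // illustrative file name
    int nEvents;       // number of events "stored" in this file
};

class SketchIOModule {
public:
    explicit SketchIOModule(IOMode mode) : fMode(mode) {}

    // All input files are registered up front, as suggested in the thread.
    void AddFile(const std::string& name, int nEvents) {
        fFiles.push_back({name, nEvents});
    }

    // True while at least one registered file has not been opened yet.
    bool HasMoreFiles() const { return fNext < fFiles.size(); }

    // Open the next file in the list.
    void Begin() {
        assert(HasMoreFiles());
        fLeft = fFiles[fNext++].nEvents;
    }

    // Read one event; returns false when the current run level is exhausted.
    bool Event() {
        // In sequence mode an EOF silently moves on to the next file.
        while (fLeft == 0 && fMode == IOMode::kSeqFile && HasMoreFiles()) {
            fLeft = fFiles[fNext++].nEvents;
        }
        if (fLeft == 0) return false;
        --fLeft;
        return true;
    }

private:
    IOMode fMode;
    std::vector<SeqFile> fFiles;
    std::size_t fNext = 0;  // index of the next file to open
    int fLeft = 0;          // events left in the currently open file
};

struct Counters {
    int begins = 0;
    int events = 0;
    int ends = 0;
};

// Hypothetical main loop: Begin(), events until Event() fails, then End().
// The loop itself knows nothing about files beyond "is there more input?".
Counters RunJob(SketchIOModule& io) {
    Counters c;
    while (io.HasMoreFiles()) {
        io.Begin();
        ++c.begins;
        while (io.Event()) ++c.events;
        ++c.ends;  // End(): the place where e.g. DB writes would go
    }
    return c;
}
```

With three sequence files registered, RunJob() in kRunPerFile mode reaches
End() three times, while in kSeqFile mode it reaches End() once, after all
events of all three files have been read. The main loop is identical in both
cases; only the I/O stand-in knows about the file list, which is the reason
for preferring the first alternative above.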
This archive was generated by hypermail 2b29 : Tue Jun 05 2001 - 14:22:55 EDT