Re: follow up of pid...

From: Bjorn H Samset (bjornhs@rcf2.rhic.bnl.gov)
Date: Wed Nov 20 2002 - 06:34:58 EST

    <lots of good discussion>
    
    Hi everyone - here are some of my ideas on the bdst-stuff:
    
    (This mail contains some general suggestions on how  to organize the code,
    not details about the analysis. Read if you are interested.)
    
    First of all I agree with everyone else that we need a general
    analysis framework that will work for all systems and consequently
    must have a lot of settable constants and overloadable functions. This
    means, as has been pointed out, that wee need some way of keeping track of
    what was actually set for a given analysis. I propose to include a new,
    obligatory outputfile that contains all this, and that should/must be
    stored with the data files.
    I.e. all programs _must_ be run with a new option
    -a --ascii-out    Outputfile (ascii) with all constants and overloaded
                      functions
    
    This file should then contain
    * all constants like Nsigma cuts
    * all offsets etc. found in (n*pre)loop-functions
    * if possible, a message saying that a function (like a pid-cut-function)
    has been overloaded and which piece of code contains the overloader
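    For concreteness, such a constants file could look something like the
    following. All keys and values here are made-up examples, not an agreed
    format:

```text
# anaConstants_0000123.txt -- written automatically via -a/--ascii-out
User:          bjornhs
NsigmaTofCut:  3.0
DeltaBetaCut:  0.015
TofOffset:     0.42              # found in preloop
Overloaded:    pid-cut function  # overloader in myPidCut.C
```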
    
    Maybe not such a big improvement over what we have today, but at least
    it's (mostly) easy to implement. On a grander scale we could also consider
    implementing a global "analysis number" to keep track of this, e.g. in
    this way:
    * Make a small program that just keeps track of this number and who
    requested what. Interface (example):
    > bananumber -r -n bjornhs
    BAnaNumber
    ---------
    Request for ana number...
    User name: bjornhs
    Analysis number assigned: 0000123
    
    * This is then of course stored in a small database
    * Then run the analysis as above with output of a file, but with the
    option
    -A 123
    which auto-outputs the file anaConstants_0000123.txt and appends the
    ana-number in some way to the name of the outputfiles.
    (note that this allows for analysis on local networks - we don't want to
    be tied to the BNL-net for anything other than number requests)
    * Finally, return to bananumber and store the file in some central
    archive:
    > bananumber -n bjornhs -A 123 -f anaConstants_0000123.txt -c 'Analysis of
    FFS data with a wide cut in delta(beta)'
    BAnaNumber
    ---------
    Analysis number: 0000123
    Ascii file: anaConstants_0000123.txt
    Storing file and comment...
    Thank you.
    
    This of course requires a little effort and work from the analyzers, but
    in a collaboration that is as spread out as this one I believe it would
    be worthwhile.
    
    So, on to the analysis itself. I have a very general suggestion which I
    throw in for discussion - feel free to slaughter the idea if you want.
    
    I have found the ntuples good to work with, also for final analysis. The
    requirement is of course that as much data as possible be available
    through them, and this means that, to avoid an unnecessarily large number
    of steps, the program that makes the ntuples should be very general,
    modifiable and go all the way from dst to ntuple in one go. I.e. do event
    selection, track selection, pid, efficiency, maybe even acceptance, in one
    go.
    
    Could we here consider a new bratmain-like pipeline structure? Say we have
    a main program that has basic, general functions for
    * preloop (and prepreloop and preprepreloop...)
    * event selection
    * track selection
    * pid
    * efficiency
    * geometry
    
    and that is run much like we run bratmain today, with a script. This
    script sets input and output files, lets you set a lot of constants,
    options, event select methods etc., and you CAN include classes like

      BrPidOverloader* pidOverload =
          new BrPidOverloader("pid", "my pid overloader");
      ... setters ...
      mainAna->AddOverloader(pidOverload);
    
    mainAna (BrModuleContainer or so) then checks if it has a module named
    "pid", a module named "eff" (i.e. hardcoded names!), and if so overloads
    the function.
    
    I think we could also include acceptances in this scheme if we want, by
    getting an acc number from the maps for each particle based on its vertex,
    p etc.
    
    As far as I can see, if we include a preloop step or two the analysis
    should be modular enough to allow this kind of structure. It requires some
    heavy recoding, but I think it would be clearer in the end. Also, our
    entire analysis structure would then center on two main ideas:
    * bratmain + scripts, which work on event-by-event data
    * bratanamain + scripts, which work on trees/ntuples
    (plus a final ana script, which has to be written separately anyway...)
    
    Well, those were a lot of very general ideas, and now it's lunchtime. I
    may come with some more specific input on the ana details a bit later, but
    please give the above a thought (even a simple dismissive one ;-).
    
    Ping :-)
    
    --
    Bjorn H. Samset                           Phone: 22856465/92051998
    PhD student, heavy ion physics            Adr:   Schouterrassen 6
    Inst. of Physics, University of Oslo             0573 Oslo
                                  \|/
    ----------------------------> -*- <-----------------------------
                                  /|\
    


    This archive was generated by hypermail 2.1.5 : Wed Nov 20 2002 - 06:35:49 EST