Hi Kris et al,

On Fri, 30 Aug 2002 15:46:07 -0500
Kris Hagel <hagel@comp.tamu.edu> wrote
concerning "Re: What is the Bus error?":
> Me the guru the implication being on linux HA Ha.

Yeah, I didn't quite get that either :-)

> Anyway, Andrei was in my office 30 seconds before he shot off the
> mail and my response was that I had never seen that on anything
> except a Motorola VME processor at which it meant something quite
> different than I assume this message here does (assuming this
> message means anything at all).

Oh it does. See the glibc info pages:

 - Macro: int SIGBUS
     This signal is generated when an invalid pointer is dereferenced.
     Like `SIGSEGV', this signal is typically the result of
     dereferencing an uninitialized pointer. The difference between
     the two is that `SIGSEGV' indicates an invalid access to valid
     memory, while `SIGBUS' indicates an access to an invalid address.
     In particular, `SIGBUS' signals often result from dereferencing a
     misaligned pointer, such as referring to a four-word integer at an
     address not divisible by four. (Each kind of computer has its own
     requirements for address alignment.)

     The name of this signal is an abbreviation for "bus error".

See also this [1], which I found by a search on "bus+error+linux" on
Google.

> I have to admit, though, that I was not smart enough to suggest gdb.
> I do feel, however, (and I told Andrei) that it smells like an
> array overflow somewhere.

See, you were right. If you'd checked the manual you'd know you were
right :-)

> In that case gdb may or may not be of use because jobs can run well
> past the point where an array overflow occurs before it finds
> something it doesn't like and give the random errors.

Not in the case of SIGSEGV and SIGBUS. Dereferencing an invalid
address always raises the signal immediately, and you'll see that in
GDB.

The only times I got a SIGBUS were from Netscape (bloat-ware - use
Galeon), and from Objectivity.
In the latter case it was because I was working on a Red Hat 6.2
machine with glibc 2.1, and Objectivity was linked using glibc 2.0 (or
something like that anyway). It took me a week or so to figure that
one out! (I think Flemming, Kris, Konstantin and a few others remember
my desperation.) The symptom was that the program wouldn't even
start - hey, it wouldn't even get to the main function.

> Anyway, it is undoubtly worth a try.

When you do use GDB, use `backtrace' to see where the problem was, and
go `up' until you hit it. Then `list' the code at that point, and
start `print'ing the variables and addresses to see which variable had
the problem. Make sure you've initialised everything in the
constructor, and that you call all needed Init member functions, and
ladida. (Oh, and the names of the commands are not even cryptic at
all!)

> Gee do I miss VMS where one could specify /check=all + an
> understandable debugger and things could be debugged in a
> straightforward manner and the error message was related to what the
> error was and not something semi-random.

If you had bothered to read the manual, then you'd know what the
`cryptic' message meant. It's true you cannot get bounds checking
with GCC, and I very much doubt you can with almost any other C/C++
compiler. With Fortran you can, as the language gives another set of
guarantees than C/C++.

Anyway, the way to make arrays in C++ is _not_ to do

  int  a[10];
  int* b = new int[10];

but rather to use some sort of (possibly templated) container, like

  #include <cstddef>    // size_t
  #include <stdexcept>  // std::out_of_range

  class IntArray
  {
  private:
    size_t _n;
    int*   _data;
  public:
    IntArray(size_t n, int initVal=0) : _n(n), _data(new int[n])
    {
      for (size_t i = 0; i < _n; i++) _data[i] = initVal;
    }
    ~IntArray() { delete [] _data; }
    int operator[](size_t i) const
    {
      // size_t is unsigned, so only the upper bound needs checking
      if (i >= _n) throw std::out_of_range("index out of range");
      return _data[i];
    }
    int& operator[](size_t i)
    {
      if (i >= _n) throw std::out_of_range("index out of range");
      return _data[i];
    }
    ...
  };

or

  typedef std::valarray<int> IntArray;

Also, plain C strings are deprecated for the same reason - instead use
std::string objects. Or, if you really want the language to take care
of all this for you, then you should use Java. I truly don't believe
this to be an issue of the platform, but rather a language issue.

An aside: Has anyone tried to compile BRAT with Intel's C++ compiler
yet?

> Kris
>
> P. S. The last statement is for Christian as I think he might be
> getting lazy and I seldom fail to get him to write 10 pages of prose to
> respond to statements like that and I think he needs to do that to keep
> himself entertained over the weekend.

Ha ha ha. Well, it's Monday, so no 10 pages of `why VMS is probably
the second worst OS (after ... well, I don't need to say what, do I?),
and GNU/Linux is probably the second best after GNU/Hurd, and
Fortran77/90/95 are at best a pain in the behind, Java is too slow, C
is ugly, Intercal is funny, ML is cool if it could do more, C++ is
bloody great, and _both_ vi and Emacs rule!'

> Flemming Videbaek wrote:
>
> >Hi,
> >
> >Why don't you ask your local guru (Kris) to help you running the gdb on
> >your job to see where it breaks
> >this is really the only way.

It probably made a core file, which means you can do

  gdb <program name> core

and go straight to the point of the signal.

On Fri, 30 Aug 2002 18:03:28 -0400 (EDT)
Andrey Makeev <makeev_a@rcf2.rhic.bnl.gov> wrote
concerning "Re: What is the Bus error?":
> the GDB output gives:
>
> (gdb) where
...
> #9  0x40858000 in BrZdcSlewCalModule::Finish (this=0x8ae33b0) at
> BrZdcSlewCalModule.cxx:329
...
> Looks like trouble is at
>
> BrZdcSlewCalModule::Finish (this=0x8ae33b0) at BrZdcSlewCalModule.cxx:329
>
> But here is a copy from that module (with line numbers):
>
> 326:    fCalibration->SetComment ("Slewpar1", "Generated by
> BrZdcSlewCalModule: fit with a pol3 function");
>
> 327:    fCalibration->SetComment ("Slewpar2", "Generated by
> BrZdcSlewCalModule: fit with a pol3 function");
>
> 328:    fCalibration->SetComment ("Slewpar3", "Generated by
> BrZdcSlewCalModule: fit with a pol3 function");
>
> 329:    fCalibration->SetComment ("Slewpar4", "Generated by
> BrZdcSlewCalModule: fit with a pol3 function");
>
> 330:    fCalibration->SetComment ("Slewpar5", "Generated by
> BrZdcSlewCalModule: fit with a pol3 function");

First off, check that you have indeed allocated memory for the
parameter "Slewpar4" via a Use message to the fCalibration object.

Second off - and this is important, and I've raised this issue before:
Do not generate the comments automatically. The comment field is
there for some genuinely useful information and _must_ be entered by a
human, based on inspection of the quality of the calibrations.
Otherwise, that field has no meaning and should if anything be left
empty (which it can't be :-). I twisted Djamel's arm until he saw
reason and added the functionality to add comments, so take a look in
the TOF calibration code.

> and it doesn't give any clues why BE shows up, so m.b. Kris
> is right, but I couldn't figure out at the moment any array
> overflows in the code... It worked perfectly not long time
> ago, and I haven't changed nothing in there.

Did you usually link with the libNew.so library in ROOT? If so, then
you probably have an uninitialised pointer somewhere. Are you using a
newer compiler or something like that (that is, compared to when it
worked)?

Yours,

 ____ |  Christian Holm Christensen
  |_| |  -------------------------------------------------------------
    | |  Address: Sankt Hansgade 23, 1. th.  Phone:  (+45) 35 35 96 91
     _|           DK-2200 Copenhagen N       Cell:   (+45) 24 61 85 91
    _|            Denmark                    Office: (+45) 353 25 305
 ____|   Email:   cholm@nbi.dk               Web:    www.nbi.dk/~cholm
 | |

[1] http://www.uwsg.iu.edu/hypermail/linux/kernel/9902.1/0011.html
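
A minimal sketch of the bounds-checked container idea from the message
above, using the standard library instead of a hand-written class. It
assumes std::vector is acceptable in place of a plain array; the
helper names checkedRead, isOutOfRange and selfCheck are hypothetical
and are not part of BRAT or ROOT. The point is only that at() turns an
overflow into an exception at the offending access, rather than a
signal at some later time.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from BRAT or ROOT): read element i of v and
// let std::vector::at() do the bounds check.
int checkedRead(const std::vector<int>& v, std::size_t i)
{
  return v.at(i);  // throws std::out_of_range if i >= v.size()
}

// Hypothetical helper: true if reading index i is reported as out of range.
bool isOutOfRange(const std::vector<int>& v, std::size_t i)
{
  try {
    checkedRead(v, i);
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}

// Small self-check that can be called from main() or from a test.
void selfCheck()
{
  std::vector<int> a(10, 0);  // ten ints, all initialised to 0
  a.at(3) = 42;
  assert(checkedRead(a, 3) == 42);
  assert(!isOutOfRange(a, 9));  // last valid index
  assert(isOutOfRange(a, 10));  // one past the end is caught
}
```

Note that operator[] on std::vector does no such check; only at()
does, so the check is opt-in per access.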
This archive was generated by hypermail 2b30 : Mon Sep 02 2002 - 07:45:30 EDT