From: Christian Holm Christensen (cholm@hehi03.nbi.dk)
Date: Thu Nov 14 2002 - 09:13:14 EST
Hi Djam, Truls, Kris, et al,

Djamel Ouerdane <ouerdane@nbi.dk> wrote concerning
  Re: Modifications to brop/monitor/abc [Thu, 14 Nov 2002 11:08:22 +0100 (CET)]
----------------------------------------------------------------------
> Hi guys,
>
> Can someone try with gcc 3.2 ?

SIGSEGV has _nothing_ (absolutely NOTHING) to do with the compiler.
SIGSEGVs occur as a result of code trying to access non-allocated
memory blocks - i.e., badly written code!  Djam, you should really lay
off your compiler-trip, and spend your time on something more
productive, like putting tracking offsets into the DB :-)

Truls Martin Larsen <t.m.larsen@fys.uio.no> wrote concerning
  Re: Modifications to brop/monitor/abc [Thu, 14 Nov 2002 11:01:21 +0100]
----------------------------------------------------------------------
> Hi Kris,
>
> I have seen the seg fault myself, but also before root 3.03.09 and
> gcc 3.04. I have also spent a lot of time trying to figure out why
> these popups produce this behaviour, but without any luck. If I
> should come across any reason, I'll let you know.

This seems to indicate a problem in either the client code (BROP) or
in the library code (ROOT).

> Kris Hagel wrote:
>
> > Hello,
> > I completed (I think) the migration to signal/slot in the online
> > monitor software.

Cool.  The signal/slot mechanism is far superior to the old
`ProcessMessage' approach (GNOME-- and Gtk-- use the same approach for
good reasons, and Qt has something that is at least conceptually the
same, though not as flexible as the Gtk-- approach).

> > This was necessitated by the fact that some of the routines
> > still using the old message sending methods were not compiling
> > after rootcint because of new "features" (I guess) of the
> > 3.04 compiler on the piis.

The new `feature' of the GCC 3.x line of C++ compilers is that they
are 99.99999% ISO/IEC standard compliant.  Hence, if code did not
compile with GCC 3.x, it most probably means that it wasn't valid C++
in the first place.
> > Whatever the reason, the signal/slot is a cleaner way to do
> > business.

It's cleaner because the design is better.  Rather than dispatching
directly to the objects, there's a broker sitting in between making
sure that stuff that needs to be handled is handled.  Also, the code
is cleaner as you no longer have to have 2 or 3 nested `switch'
statements - instead you have small member functions for each `event'
you want to handle.  Also you can do a lot more using the signal/slot
approach.  Unfortunately, unlike libsigc++ (used by Gtk--), it's not
type safe, neither at run time nor at compile time.

> > There is one caveat. I (or the 3.04 compiler; or root 3.03.09) have
> > manufactured a seg fault when a popup canvas is used by double
> > clicking on a monitor pad. Everything in my tests appears to continue
> > working after the seg fault, but I have the general idea that all seg
> > faults are bad seg faults.

Indeed they are.  SIGSEGV indicates poorly written code.  Have you
checked that all member pointers are initialised to 0 and that you
check all memory before dereferencing it?  If you delete temporary
objects, do you make sure that all references to those objects are
deleted?

Check whether the pop-up canvas gets a copy of or a reference to the
contained objects, and if it gets a reference, make sure the objects
live as long as the canvas is there.  Simply checking the pointer when
drawing the pop-up canvas is not enough - the objects can be deleted
later on in the event loop, outside of the canvas, leaving the canvas
with an invalid pointer (SIGSEGV).  Hence, you should make damn well
sure that the objects in the pop-up are removed if they are deleted.
Also watch out for double deletes.

> > I did not manage to locate the problem and rationalized that not
> > many people in brahms besides me use that feature anyway, so I
> > went ahead and committed in the code.
Normally, the one thing any developer can be sure of is that if
there's a feature in the application, it will be used, and most likely
in a way the developer hasn't thought of.

> > But a question to experts (Christian I guess, but anyone else if
> > they know). How do I find where a seg fault happens? I tried
> > with gdb and it says it is in InnerLoop and as far as I can tell
> > it continues there after handling the signal (UNIX signal).

First off, you obviously need to compile BROP with debugging symbols
(I guess you did) - but to debug GUI code, you should also compile
ROOT with debugging symbols (pass `--with-build=debug' to the ROOT
`./configure' script), as a lot of the signal/slot handling takes
place in the library code and it's really helpful to be able to track
through all of the signal/slot handling.

Second, to avoid having some symbols stripped from your
library/application you should set the optimisation level to no higher
than 1 (which is done per default on SMP machines) - otherwise, the
optimiser may remove symbols that aren't really used from your code.

Having done all that, start your application with gdb:

  shell> gdb <application>
  (gdb) run <arguments to application>
  ...
  Program received SIGSEGV in ...
  (gdb)

When you get to this point, make a backtrace:

  (gdb) bt

This will give you a stack trace of the execution.  Here you can
figure out how you got to whatever (member) function gave the
SIGSEGV.  Now go up the stack until you hit the first non-trivial
(member) function.  `Non-trivial' means (member) functions other than
the signal handler and related functions.

  (gdb) up
  ...

Now start looking at the symbols in this piece of code.  That is,
print the addresses of pointers and ladida, and check if they are
valid.

  (gdb) print foo
  ...
  (gdb) print *foo

Now, this is the place where you're likely to run into trouble if
you're using GCC 3.x.
The problem is that the GDB shipped with Red Hat 7.x doesn't really
understand the run time ABI of GCC 3.x, and so you may not be able to
inspect the memory pointed to by pointers.  This is why I recommend
using GCC 2.96-RH until we switch to Red Hat 8.x.

Djam, here's the only thing that has anything to do with the compiler:
the ABI and debugging.  Fact is that if you want to use GCC 3.x, you
damn well better update your Binutils, GDB, Glibc, and God knows what
other pieces of software: in essence, switch to Red Hat 8.0.  No one
in their right mind would ever ship a compiler that creates SIGSEGV in
client code at runtime - in fact, I think it would be hard to produce
such a compiler.

<VMS-rant>
> > It makes me miss VMS again as I remember "back in the good old
> > days" when a program crashed and the computer told you exactly where
> > it crashed and why.
</VMS-rant>

That would imply that VMS binary code _always_ contained debugging
symbols, with the associated performance penalty - you can't seriously
believe that kind of a system to be superior to UNIX.  It's the same
`feature' you see on `modern' `operating systems' like Windoze: binary
code contains debugging symbols, and the signal handler starts a
debugger that attaches to the process - that is just a plain waste of
binary code and CPU cycles.  Debuggers do not justify poor design.

<rant>
If you want the debugger to fire up automatically when the program
receives a signal, you can write a signal handler that does that, and
install that as your signal handler.  The pseudo-code would be:

  signal_handler()
  {
    pid  = get_pid_of_process_that_got_signal();
    name = get_process_name();
    execute_debugger_and_attach(name, pid);
  }

Compile the signal handler code into a shared library, say
`mysighandler.so', always compile your code with `-g' and set the
environment variable `LD_PRELOAD' to point at `mysighandler.so'.  Now,
your signal handler will start gdb each time an application gets a
signal.
</rant>

> > Who can help me?
I hope the above will give you some ideas.

> > I spent a few hours coverting to signal/slot and 2 1/2 days chasing
> > this stupid seg fault and in the end have nothing to show for it.

Ah, the joy of error handling.  That's what you get when you write
programs that are slightly more complicated than a simple `hello
world' demo :-)

Yours,

 ___  |  Christian Holm Christensen
  |_| |  -------------------------------------------------------------
    | |  Address: Sankt Hansgade 23, 1. th.  Phone:  (+45) 35 35 96 91
     _|           DK-2200 Copenhagen N       Cell:   (+45) 24 61 85 91
    _|            Denmark                    Office: (+45) 353 25 305
 ____|   Email:   cholm@nbi.dk               Web:    www.nbi.dk/~cholm
 | |
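
For reference, the signal-handler idea from the rant above could be
fleshed out roughly as follows.  This is only a minimal sketch, not
tested against BROP or ROOT, and it rests on several assumptions: a
Linux-like system, GCC (for `__attribute__((constructor))'), and a gdb
that accepts `-p <pid>' to attach to a running process.  The names
`attach_debugger', `install_debugger_handler' and `preload_init' are
made up for illustration, and only SIGSEGV is handled.

```cpp
// Hypothetical sketch of `mysighandler.so'; every name below is illustrative.
#include <csignal>
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Pid of this process as text, formatted up front so that the handler
// does not have to call snprintf() while the process is crashing.
static char g_pid_text[32];

// Runs in the process that got the signal: start gdb attached to it.
extern "C" void attach_debugger(int /*signum*/) {
  pid_t child = fork();
  if (child == 0) {
    // Child: become gdb and attach to the parent.  execlp() searches
    // PATH and is not strictly async-signal-safe; good enough for a sketch.
    execlp("gdb", "gdb", "-p", g_pid_text, static_cast<char*>(nullptr));
    _exit(127);  // only reached if gdb could not be started
  }
  if (child > 0) {
    int status = 0;
    waitpid(child, &status, 0);  // block until the debugger session ends
  }
  _exit(1);  // do not return into the code that faulted
}

// Install the handler for SIGSEGV; returns 0 on success, -1 on failure.
extern "C" int install_debugger_handler() {
  std::snprintf(g_pid_text, sizeof g_pid_text, "%ld",
                static_cast<long>(getpid()));
  struct sigaction sa = {};
  sa.sa_handler = attach_debugger;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESETHAND;  // a second fault gets the default action
  return sigaction(SIGSEGV, &sa, nullptr);
}

// When the library is loaded through LD_PRELOAD, install the handler
// before main() of the application starts (GCC-specific attribute).
__attribute__((constructor)) static void preload_init() {
  install_debugger_handler();
}
```

Assuming the file is called `mysighandler.cc', something like

  shell> g++ -g -shared -fPIC -o mysighandler.so mysighandler.cc
  shell> LD_PRELOAD=./mysighandler.so <application>

should then drop you into gdb when the application gets a SIGSEGV.
Note that the pid is recorded when the library is loaded, so a process
that forks afterwards would need to call `install_debugger_handler'
again, and that systems which restrict ptrace attach may refuse to let
gdb attach to its parent.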
This archive was generated by hypermail 2.1.5 : Thu Nov 14 2002 - 09:14:00 EST