Crash during startup
Hi!

Checking out a fresh GiNaC-1.3 tree and configuring with --disable-shared (after having called autogen.sh) leaves one with three crashing binaries in the test suite. It seems to be independent of compiler flags. A shared library works, as does ginsh. (@Jens and Cebix: I reproduced this on wino.)

A stack backtrace indicates that GiNaC::basic::gethash() is being invoked on NULL:

(gdb) run
Starting program: /autofs/medium/home/kreckel/projects/GiNaC-1.4/check/checks
Program received signal SIGSEGV, Segmentation fault.
0x08079827 in GiNaC::basic::gethash (this=0x0) at basic.h:254
254 if (flags & status_flags::hash_calculated) {
(gdb) bt
#0 0x08079827 in GiNaC::basic::gethash (this=0x0) at basic.h:254
#1 0x08078fcf in GiNaC::basic::is_equal (this=0x81feb70, other=@0x0) at basic.cpp:888
#2 0x08056bb2 in GiNaC::ex::is_equal (this=0xffffda30, other=@0x81fa5a4) at ex.h:399
#3 0x0805b2eb in GiNaC::ex::is_zero (this=0xffffda30) at ex.h:208
#4 0x08163bb8 in GiNaC::power::eval (this=0xffffdac0, level=1) at power.cpp:359
#5 0x0807e2d5 in GiNaC::ex::construct_from_basic (other=@0xffffdac0) at ex.cpp:287
#6 0x08050136 in ex (this=0x81fa8e4, other=@0xffffdac0) at ex.h:304
#7 0x08165c69 in GiNaC::power::evalf (this=0xffffdb60, level=0) at power.cpp:525
#8 0x081a279e in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at integral.cpp:206
#9 0x081a28a6 in global constructors keyed to _ZN5GiNaC8integral8reg_infoE () at container.h:130
#10 0x081a3825 in __do_global_ctors_aux ()
#11 0x0804dc19 in _init ()
#12 0x081a375b in __libc_csu_init ()
#13 0x557a57a2 in __libc_start_main () from /lib/tls/libc.so.6
#14 0xffffdc64 in ?? ()

Are we still having initialization order problems?

Cheers
-richy.
--
Richard B. Kreckel <http://www.ginac.de/~kreckel/>
Happy new year! On Sun, 26 Dec 2004, I wrote:
Checking out a fresh GiNaC-1.3 tree and configuring with --disable-shared (after having called autogen.sh) leaves one with three crashing binaries in the test suite. It seems to be independent of compiler flags. A shared library works, as does ginsh. (@Jens and Cebix: I reproduced this on wino.)
A stack backtrace indicates that GiNaC::basic::gethash() is being invoked on NULL:
(gdb) run
Starting program: /autofs/medium/home/kreckel/projects/GiNaC-1.4/check/checks
Program received signal SIGSEGV, Segmentation fault.
0x08079827 in GiNaC::basic::gethash (this=0x0) at basic.h:254
254 if (flags & status_flags::hash_calculated) {
(gdb) bt
#0 0x08079827 in GiNaC::basic::gethash (this=0x0) at basic.h:254
#1 0x08078fcf in GiNaC::basic::is_equal (this=0x81feb70, other=@0x0) at basic.cpp:888
#2 0x08056bb2 in GiNaC::ex::is_equal (this=0xffffda30, other=@0x81fa5a4) at ex.h:399
#3 0x0805b2eb in GiNaC::ex::is_zero (this=0xffffda30) at ex.h:208
#4 0x08163bb8 in GiNaC::power::eval (this=0xffffdac0, level=1) at power.cpp:359
#5 0x0807e2d5 in GiNaC::ex::construct_from_basic (other=@0xffffdac0) at ex.cpp:287
#6 0x08050136 in ex (this=0x81fa8e4, other=@0xffffdac0) at ex.h:304
#7 0x08165c69 in GiNaC::power::evalf (this=0xffffdb60, level=0) at power.cpp:525
#8 0x081a279e in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at integral.cpp:206
#9 0x081a28a6 in global constructors keyed to _ZN5GiNaC8integral8reg_infoE () at container.h:130
#10 0x081a3825 in __do_global_ctors_aux ()
#11 0x0804dc19 in _init ()
#12 0x081a375b in __libc_csu_init ()
#13 0x557a57a2 in __libc_start_main () from /lib/tls/libc.so.6
#14 0xffffdc64 in ?? ()
Are we still having initialization order problems?
Chris, I'm afraid you introduced a new static initialization order problem when you sent us your integral.cpp file. You cannot initialize static ex integral::relative_integration_error the way you do in integral.cpp:206. That does not take care of the initialization order. The ctor of class GiNaC::library_init is there to solve such problems. Please add a numeric and an ex object representing the number 1e-8 in utils.h and utils.cpp and use that one instead.

Would you please be so kind as to send a patch to this list for my review?

Best wishes
-richy.
--
Richard B. Kreckel <http://www.ginac.de/~kreckel/>
On Wed, 5 Jan 2005, Richard B. Kreckel wrote: [...]
That does not take care of the initialization order. The ctor of class GiNaC::library_init is there to solve such problems. Please add a numeric and ex object representing the number 1e-8 in utils.h and utils.cpp and use that one instead.
Alternatives, if you absolutely want to keep utils.cpp/utils.h decoupled from integral.cpp/integral.h: store that constant on the heap, if you prefer, and keep the pointer as POD. Or using a constant or numeric object without that ex wrapper might even work, too. After all, it's the attempt to use the complex power evaluator that leads to the crash in this case.

Good night.
-richy.
--
Richard B. Kreckel <http://www.ginac.de/~kreckel/>
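The heap alternative mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions: Value, relative_error() and doubled_error are hypothetical names for illustration, not GiNaC code, and the accessor is not thread-safe.

```cpp
#include <cassert>

// Hypothetical stand-in for a class with a non-trivial constructor
// (think of GiNaC::ex); this is not the real GiNaC type.
struct Value {
    double v;
    explicit Value(double d) : v(d) {}
};

// A plain pointer is constant-initialized to null before the constructor
// of any global object runs, so testing it is always safe.
static Value* relative_error_ptr = nullptr;

// Allocate the constant on the heap the first time somebody asks for it.
// The object is deliberately never deleted.
const Value& relative_error() {
    if (relative_error_ptr == nullptr)
        relative_error_ptr = new Value(1e-8);
    return *relative_error_ptr;
}

// Another global may now use the constant in its own initializer,
// regardless of the order in which object files are initialized.
static const double doubled_error = 2.0 * relative_error().v;

double get_doubled_error() {
    assert(relative_error_ptr != nullptr);  // set by the initializer above
    return doubled_error;
}
```

The point of the design is that the only object with static storage duration is a pointer, which needs no constructor call at startup.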
Dear Richy, Happy new year! On Wed, 5 Jan 2005, Richard B. Kreckel wrote:
Chris, I'm afraid you introduced a new static initialization order problem when you sent us your integral.cpp file. You cannot initialize static ex integral::relative_integration_error like you do in integral.cpp:206.
Hmmm... Wait a minute... If I understand correctly, this means that initialization of integral::relative_integration_error occurs before the initialization of the library_init of the same file integral.o as well as before the initialization of all other library_init objects in all of GiNaC's other object files. Isn't this a bit strange??? I mean, can't the runtime environment (or whatever it is called) just initialize static objects in the same order as they are defined in the preprocessed C++ code??? If the order of static initialization were the same as the order of definition, there would not be a problem, right?
Would you please be so kind as to send a patch to this list for my review?
I would suggest the change in the attached patch, since the functions that are used in this patch do not seem to involve any static variables. Unfortunately I do not know how to test this, because at my place no crash occurred in the first place. Do you think this is sufficient to avoid a crash in all cases?

Best wishes,
Chris
Hi! On Thu, 6 Jan 2005, Chris Dams wrote:
Chris, I'm afraid you introduced a new static initialization order problem when you sent us your integral.cpp file. You cannot initialize static ex integral::relative_integration_error like you do in integral.cpp:206.
Hmmm... Wait a minute... If I understand correctly, this means that initialization of integral::relative_integration_error occurs before the initialization of the library_init of the same file integral.o as well as before the initialization of all other library_init objects in all of GiNaC's other object files. Isn't this a bit strange???
Now that you mention it...
I mean, can't the runtime environment (or whatever it is called) just initialize static objects in the same order as they are defined in the preprocessed C++ code???
Sure. The language guarantees exactly that.
If the order of static initialization were the same as the order of definition, there would not be a problem, right?
Right. But wait a minute! The problem comes from ex::is_zero() const in ex.h:208. There we have the inline member function

bool is_zero() const { extern const ex _ex0; return is_equal(_ex0); }

But have a look at utils.cpp. Initialization jumps from all the modules that include ex.h into the ctor of library_init and there all the numeric objects are initialized. But _ex0 is declared above that ctor and it is not initialized until the module utils.o is initialized itself! Just the jumps into that ctor do not as a by-product initialize all the static objects.

If that analysis is correct there appears to be a loophole in the initialization order scheme. I wonder how that can be fixed without creating a big mess in utils.cpp...
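The loophole described above can be reproduced in miniature. The following is a sketch under assumptions, not GiNaC code: init_helper plays the role of library_init, zero_ptr plays the role of _ex0, and everything sits in one translation unit so that the order is fixed by definition order instead of by the link line. A plain pointer is used for the late object because reading a zero-initialized pointer before its dynamic initializer has run is well defined.

```cpp
#include <cassert>

int constructions = 0;       // constant-initialized, always ready

// Has a side effect, so it can only run during dynamic initialization.
int* make_zero() {
    ++constructions;
    return new int(0);
}

extern int* zero_ptr;        // plays the role of `extern const ex _ex0`

// Plays the role of library_init: its constructor runs early, but that
// alone does not construct objects that are defined elsewhere.
struct init_helper {
    bool zero_was_ready;
    init_helper() : zero_was_ready(zero_ptr != nullptr) {}
};

// Defined first, like the helper of a module initialized before utils.o.
init_helper early;

// Defined later, like _ex0 in utils.cpp.
int* zero_ptr = make_zero();

// Defined last: by now the "constant" exists.
init_helper late;

void check_loophole() {
    assert(!early.zero_was_ready);  // still a null pointer at that time
    assert(late.zero_was_ready);
    assert(constructions == 1);
}
```

The null pointer seen by the early helper corresponds to the this=0x0 in frame #0 of the backtrace.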
Would you please be so kind as to send a patch to this list for my review?
I would suggest the change in the attached patch, since the functions that are used in this patch do not seem to involve any static variables. Unfortunately I do not know how to test this, because at my place no crash occurred in the first place. Do you think this is sufficient to avoid a crash in all cases?
Your patch seems to work, thanks a lot! The patch below seems to fix the problem just as well. By virtue of ex::construct_from_double(double) it should be equivalent to your patch. If you have no objections, I'll commit it.

diff -a -u -r1.2 integral.cpp
--- integral.cpp 14 Oct 2004 15:36:45 -0000 1.2
+++ integral.cpp 6 Jan 2005 21:31:23 -0000
@@ -203,7 +203,7 @@
 }
 
 int integral::max_integration_level = 15;
-ex integral::relative_integration_error = power(10,-8).evalf();
+ex integral::relative_integration_error = 1e-8;
 
 ex subsvalue(const ex & var, const ex & value, const ex & fun)
 {

Best wishes
-richy.
--
Richard B. Kreckel <http://www.ginac.de/~kreckel/>
Dear Richy, On Thu, 6 Jan 2005, Richard B. Kreckel wrote:
But wait a minute! The problem comes from ex::is_zero() const in ex.h:208: There we have the inline member function
bool is_zero() const { extern const ex _ex0; return is_equal(_ex0); }
But have a look at utils.cpp. Initialization jumps from all the modules that include ex.h into the ctor of library_init and there all the numeric objects are initialized. But _ex0 is declared above that ctor and it is not initialized until the module utils.o is initialized itself! Just the jumps into that ctor do not as a by-product initialize all the static objects.
Ah! Now, after some thinking, I think I understand it. _ex0 is initialized at the point when utils.o is initialized, which may well be after integral::relative_integration_error is initialized.
If that analysis is correct there appears to be a loophole in the initialization order scheme. I wonder how that can be fixed without creating a big mess in utils.cpp...
Well, it makes me wonder why there is such a thing as a library_init. After all, it appears that just writing ex integral::relative_integration_error = 1e-8; is also allowed. Why is that not done for all static objects, such as _ex0? The only thing is that other static objects should not be using _ex0, as I did with my .evalf(). As far as I understand, library_init was introduced to solve this "problem"; however, is the problem solvable at all? After all, _ex0 exists because it is initialized in some *.o file and, I think, it should not be initialized in multiple *.o files, because that would cause errors for multiple definitions. Since nobody appears to be guaranteeing anything about the order in which the different *.o files are initialized, _ex0 should not be used in a static object anywhere, no matter what kind of initialization gymnastics you are going to add to this.
Your patch seems to work, thanks a lot! The patch below seems to fix the problem just as well. By virtue of ex::construct_from_double(double) it should be equivalent to your patch. If you have no objections, I'll commit it.
+ex integral::relative_integration_error = 1e-8;
Fine with me! Best wishes, Chris
Hi! On Fri, 7 Jan 2005, Chris Dams wrote:
Ah! Now, after some thinking, I think I understand it. _ex0 is initialized at the point when utils.o is initialized, which may well be after integral::relative_integration_error is initialized.
This appears to be the problem.
If that analysis is correct there appears to be a loophole in the initialization order scheme. I wonder how that can be fixed without creating a big mess in utils.cpp...
Well, it makes me wonder why there is such a thing as a library_init. After all, it appears that just writing
ex integral::relative_integration_error = 1e-8;
is also allowed. Why is that not done for all static objects, such as _ex0?
Because of object sharing when more than one ex represents the same number. That's an efficiency argument.
The only thing is that other static objects should not be using _ex0, as I did with my .evalf(). As far as I understand, library_init was introduced to solve this "problem"; however, is the problem solvable at all?
Theoretically, by making sure the library_init ctor constructs it as well. Maybe using placement new or some such trick.
After all, _ex0 exists because it is initialized in some *.o file and, I think, it should not be initialized in multiple *.o files, because that would cause errors for multiple definitions.
It is initialized in utils.o since it is defined in utils.cpp. Note that it is correctly declared extern in all other translation units.
Since nobody appears to be guaranteeing anything about the order in which the different *.o files are initialized, _ex0 should not be used in a static object anywhere, no matter what kind of initialization gymnastics you are going to add to this.
Huh? Right now the gymnastics is not enough! That doesn't mean that there is no proper gymnastics.
Your patch seems to work, thanks a lot! The patch below seems to fix the problem just as well. By virtue of ex::construct_from_double(double) it should be equivalent to your patch. If you have no objections, I'll commit it.
+ex integral::relative_integration_error = 1e-8;
Fine with me!
Committed. Regards -richy. -- Richard B. Kreckel <http://www.ginac.de/~kreckel/>
Dear Richy, On Fri, 7 Jan 2005, Richard B. Kreckel wrote:
Theoretically, by making sure the library_init ctor constructs it as well. Maybe using placement new or some such trick.
Yes, that might be made to work. I hadn't thought about placement new. Another thing: I think it is wrong that count is a static int of library_init. This is given a value in utils.cpp, which may be after the point that library_init::library_init() does its work. Shouldn't it be a static of the constructor library_init::library_init()? The destructor isn't doing much anyway. Here's a patch. Best wishes, Chris
On Sat, 8 Jan 2005, Chris Dams wrote:
On Fri, 7 Jan 2005, Richard B. Kreckel wrote:
Theoretically, by making sure the library_init ctor constructs it as well. Maybe using placement new or some such trick.
Yes, that might be made to work. I hadn't thought about placement new.
It's the same idea as std::ios_base::Init in the C++ standard [27.4.2.1.6].
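A rough sketch of how such a scheme can work, in the spirit of std::ios_base::Init: the shared object lives in raw static storage and is constructed with placement new by whichever helper object happens to run first. All names here (Value, init_helper, shared_zero, the helper instances) are hypothetical and chosen for illustration; this is not the GiNaC implementation.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for GiNaC::ex; not the real class.
struct Value {
    int n;
    explicit Value(int i) : n(i) {}
};

namespace {
// Raw, suitably aligned storage and a plain pointer. Neither needs a
// constructor call, so both are ready before dynamic initialization.
alignas(Value) unsigned char zero_storage[sizeof(Value)];
Value* zero_ptr = nullptr;
}

struct init_helper {
    static int count;
    init_helper() {
        // The first helper to run constructs the shared object in place.
        if (count++ == 0)
            zero_ptr = new (zero_storage) Value(0);
    }
    ~init_helper() {
        // The last helper to go away destroys it again.
        if (--count == 0)
            zero_ptr->~Value();
    }
};
int init_helper::count = 0;

Value& shared_zero() {
    assert(zero_ptr != nullptr);
    return *zero_ptr;
}

// In a real library, a header would define one static helper per
// translation unit. Two instances mimic two such units here.
static init_helper helper_of_module_a;
static const int early_use = shared_zero().n + 1;  // safe: helper ran first
static init_helper helper_of_module_b;

int get_early_use() { return early_use; }
```

Because every translation unit that includes the header gets its helper defined before its own globals, the shared object exists no matter which unit the linker initializes first.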
Another thing: I think it is wrong that count is a static int of library_init. This is given a value in utils.cpp, which may be after the point that library_init::library_init() does its work. Shouldn't it be a static of the constructor library_init::library_init()?
No, it's a POD integral type which comes pre-initialized by the linker. So making it a static member is okay.
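This can be checked with a small experiment. In the sketch below (hypothetical names, not GiNaC code) the static member is defined at the very end of the file, after the objects whose constructors use it, and is given the value 10 instead of 0 so that the effect is visible.

```cpp
#include <cassert>

struct counter_user {
    static int count;  // declared here, defined at the end of the file
    int seen;          // value of count observed by this constructor
    counter_user() : seen(count++) {}
};

// Dynamic initialization: these constructors run at startup, in order.
counter_user first;
counter_user second;

// Constant initialization: the value is in place before any constructor
// above runs, even though this definition comes last.
int counter_user::count = 10;

void check_counter() {
    assert(first.seen == 10);
    assert(second.seen == 11);
    assert(counter_user::count == 12);
}
```

An integer initialized with a constant expression never takes part in the dynamic initialization order, which is why a static counter member is safe in this kind of scheme.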
The destructor isn't doing much anyway.
That's true. (And different from the standard library which has to flush cout, clog, etc.)
Here's a patch.
Your patch looks very interesting but something is wrong with it, although I am unable to see what. It does not fix the original problem if I revert the patch in integral.cpp. All the tests still segfault. Also, what's the point in self-assigning those pointers where they should really just be declarations? And then, the static member variable initialization is actually okay, as explained above.

I wonder why you are unable to reproduce the bug. For me, it works as follows:

1) Revert the patch in integral.cpp
2) export CXXFLAGS="-ggdb" (Well, several others work too, possibly any and all)
3) ./configure --disable-shared
4) make
5) make check

Since with GNU ld, initialization depends on the order given on the link line, please check your link line against mine. Here is how ginac/.libs/libginac.a is created:

ar cru .libs/libginac.a add.o archive.o basic.o clifford.o color.o constant.o ex.o expair.o expairseq.o exprseq.o fail.o fderivative.o function.o idx.o indexed.o inifcns.o inifcns_trans.o inifcns_gamma.o inifcns_nstdsums.o integral.o lst.o matrix.o mul.o ncmul.o normal.o numeric.o operators.o power.o registrar.o relational.o remember.o pseries.o print.o structure.o symbol.o symmetry.o tensor.o utils.o wildcard.o input_parser.o input_lexer.o

And here is how the segfaulting checks/checks executable is created:

g++ -O -o checks check_numeric.o check_inifcns.o check_matrices.o check_lsolve.o genex.o checks.o ../ginac/.libs/libginac.a -L/usr/lib /usr/lib/libcln.so /usr/lib/libgmp.so

Best wishes
-richy.
--
Richard B. Kreckel <http://www.ginac.de/~kreckel/>
participants (2): Chris Dams, Richard B. Kreckel