On Mon, Aug 11, 2008 at 05:09:30PM +0200, Burcin Erocal wrote:
Are you denying that GiNaC is the best "symbol manipulation engine" I can find?
That depends on a definition of a "symbol manipulation engine", i.e. what kind of features do you expect from it (i.e. GiNaC does not implement polynomial factorization, which is necessary for symbolic integration). Also to use GiNaC (or Maxima) in an efficient manner you need to forget your Mathematica (Maple, whatever) experience and learn from scratch.
Your statements also seem to suggest that there is something wrong with interactive use.
GiNaC is not designed for this kind of use. Interactive manipulation is kind of pointless if the expressions in question consists of 10^6 -- 10^8 of terms, isn't it?
OK, so I chose a bad example for a benchmark, and this was pointed out before.
I think the benchmark is OK. My point is a bit different: the method you've used to construct the sum is very inefficient, at least for GiNaC and Maxima. Apparently it's also inefficient for Mathematica (I guess it's also inefficient for almost any CAS too, but I can't tell for sure): $ cat mma_bench1.m A = 0; For[i = 0, i <= 10000, i = i + 1, A = A + (x+y)^i; ]; $ time math < mma_bench.m >/dev/null real 0m15.535s user 0m15.509s sys 0m0.016s Hence Mathematica provides a more efficient method, it's called Sum: $ cat mma_bench2.m A = Sum[(x+y)^i, {i, 0, 10000}]; $ time math < mma_bench2.m >/dev/null real 0m0.035s user 0m0.028s sys 0m0.004s The point is: although both methods are correct, the non-obvious one performs _3 orders of magnitude_ better. Likewise, GiNaC (and Maxima) also provide efficient methods for doing some operations. You might want to learn them before doing some changes to data structures.
When I said modifying data structures, I meant that you can keep the arguments of an operator in a different structure, instead of a list.
I understand (BTW GiNaC uses arrays for that, not lists).
This would probably only make sense on very large expressions, and at this point, I don't know what the penalty for smaller expressions would be.
I don't agree. I think the large expression should be kept as flat as possible for (at least) two reasons: 1. Otherwise eval() and other inherently recursive functions/methods would run out of stack. 2. Any other structure requires more memory to store the same terms.
As for different ways of storing the arguments of an operator, we can look at methods for storing and calculating with polynomials. One method is to use a hash table to keep the monomials
The same method was used in ancient versions of GiNaC. That was a bad idea. It makes code much more complicated and does not really improve performance (in some situation it's even worse). N.B. You might want to grep GiNaC source for EXPAIRSEQ_USE_HASHTAB. To reiterate, a flat N-ary tree (which is what GiNaC::expairseq essentially is) seems to be the best way to store big expressions.
I got the above timing on this machine:
$ cat /proc/cpuinfo | grep name model name : Intel(R) Pentium(R) D CPU 3.40GHz
Although the build time depends on optimization options and a number of other factors, 25 minutes is way too long for such a box (even if you build both static and shared libraries). Unless you use some insane optimization options, this looks like a compiler and/or linker bug (see e.g. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=376066). What are exactly the compiler and binutils versions (as in g++ --version; ld --version)? Anyway, there are a number of ways to speed up the build. Jens already mentioned --disable-static. Another useful option is to run build in a parallel manner, i.e. invoking make as make -j N where N \approx number of CPU cores + 1, and appending '-pipe' option to CXXFLAGS. Note: this is useful even if your CPU is single core, because CLN consists of lots of small source files. Yet another method is to use a fast(er) shell, such as ash or pdksh, i.e. SHELL=/bin/ash CONFIG_SHELL=/bin/ash /bin/ash configure --disable-static This works because GNU libtool (which is responsible for building GiNaC and CLN, as well as a number of other Free and even not so free libraries) is a shell script. Of course, ash is not suitable as an interactive shell, but it's very good at parsing/running big scripts (such as GNU libtool). @developers: libtool version 2.2 is reportedly much faster than 1.5 (which is what we use), so we might want to upgrade to that version when releasing next version of GiNaC. Best regards, Alexei -- All science is either physics or stamp collecting.