On Mon, 11 Aug 2008 18:27:54 +0400 Alexei Sheplyakov <varg@theor.jinr.ru> wrote:
Hello!
On Mon, Aug 11, 2008 at 10:30:59AM +0200, Burcin Erocal wrote:
While GiNaC seems to be the best option for Sage,
GiNaC is a special purpose symbolic manipulation library. The purpose of GiNaC is to do "simple" manipulations with "large" expressions using a common programming language (as opposed to ill designed, unstable ones as offered by most of CASes). So, I don't think GiNaC is a good choice for Sage (which attempts to be a general purpose _interactive_ CAS).
I suggested using it in Sage for the symbolic manipulation engine only. I n your first two sentences, you seem to point out that GiNaC is good for this purpose. Sage relies on many other libraries to be "general purpose." It uses libSingular for multivariate polynomials, flint for univariate polynomials over integers, and so on. Are you denying that GiNaC is the best "symbol manipulation engine" I can find? Your statements also seem to suggest that there is something wrong with interactive use. GiNaC also provides support for this via ginsh, pyginac, etc.
I think all these problems can be resolved with some effort
Don't take me wrong, but I think it's better to put this effort elsewhere. I.e. most of "issues" you've mentioned are design decisions.
I am disappointed.
and even the speed of basic arithmetic in GiNaC can be improved with modifying it's data structures.
No matter what data representation you choose, there are always some expensive operations. Sure, one can optimize data structures so that some particular operation is fast (at the expence of slowing down other operations). So, to write a good code one need to be aware of the complexity of operations and use expensive operations consciously.
OK, so I chose a bad example for a benchmark, and this was pointed out before. When I said modifying data structures, I meant that you can keep the arguments of an operator in a different structure, instead of a list. This would probably only make sense on very large expressions, and at this point, I don't know what the penalty for smaller expressions would be. As for different ways of storing the arguments of an operator, we can look at methods for storing and calculating with polynomials. One method is to use a hash table to keep the monomials ,Magma, a commercial system that has the fastest polynomial arithmetic, uses this method. Another is to use GeoBuckets which is explained here: http://portal.acm.org/citation.cfm?id=275394 Singular, which happens to be the fastest free software library for multivariate polynomial arithmetic uses this. You can also use binary heaps as suggested here: http://portal.acm.org/citation.cfm?id=1086847 Michael Monagan and Roman Pierce use this method in the new blazingly fast multivariate polynomial libray they wrote for maple: http://www.cecm.sfu.ca/~mmonagan/papers/sdmp19.pdf <snip>
- cln duplicates functionality which is provided by different libraries already distributed with Sage (mpfr, mpir, flint)
The libraries duplicating CLN functionality [1] typically have several serious drawbacks: a. The syntax is really awkward (to put it very mildly). For example, instead of a simple z = x+a*y; one need to write several lines of unreadable code. b. The user is responsible for memory management. This is absolutely inacceptable.
None of these libraries gives any considerable performance gain, so why bother?
[1] As a matter of fact, CLN existed *long* before the libraries you've mentioned were developed.
at a considerable speed penalty.
Could you please give any benchmark backing up this claim?
- Creating symbolic functions at runtime is not supported.
That's a design decision.
- IIRC, GiNaC documentation suggests the printing order for the variables is stable between different sessions, regardless of creation order.
Could you please point out which document (and where) says that?
I might be remembering this wrong. I just put it down as an item from my notes while writing the wrapper. I don't think writing a pretty printer which sorts the terms would be hard.
The order of expressions (in particular, symbols) is not stable. This is a price for a fast comparison of expressions.
Anyway, since GiNaC is non-interactive, the order of terms doesn't actually matter: typically the output of a program is a small expression (quite often it is a single number), and the big intermediate expressions are not going to be examined by human (except for debugging).
- It takes ages to build The packages above took ~25 minutes to build on my desktop machine
Just for the record: building GiNaC and CLN takes ~30 minutes on my old Duron 800 box with 640 Mb of RAM. I wonder what kind of machine do you use as a desktop. That said, patches reducing the build time are wellcome (if they don't reduce the performance and don't make the code more complicated). <snip>
I got the above timing on this machine: $ cat /proc/cpuinfo | grep name model name : Intel(R) Pentium(R) D CPU 3.40GHz Cheers, Burcin