Richard B. Kreckel wrote:
Hi,
On Sat, 23 Mar 2002, Pearu Peterson wrote:
I have tried to run Richard Fateman's multiplication benchmark also with GiNaC (1.0.2,1.0.6)
<<<snip>>>
The culprit is mul::expand(), where two expanded polynomials (class add), one with n1 terms and another with n2 terms are first written out by brute force in an n1*n2 vector. This is then returned through an add constructor which sorts it (calling expairseq::canonize()) and then compacitifies it (calling expairseq::combine_same_terms_sorted_seq()).
This seems to me to be a design that should be revisited. For the univariate case it is particularly bad since multiplying polynomials of degree u and v gives you only degree u+v, and you will have allocated (u+1)*(v+1) cells.
For two large input polynomials this is clearly bogus, because it does not allow for early cancellations. It was done this way because the compexity of sorting is asymptotically ideal (n*log(n) comparisons). This seems to have been a short-sighted decision.
The sorting cost is going to be dominated by the coefficient operations most likely, even if it is n^2. Note that Maple doesn't bother to sort at all. I think that if you test the others against sorting of Maple's results, Maple will look far worse.
I can think of a gazillion ways how to repair this and will try to provide a fix really soon now. It needs some careful balancing, however. (Incidentally, I already had some trial-code in my private tree which performs much better.)
To me it appears that this test per se does not tell us very much about representation. Besides questions of representation, the actual implementation of expand() might differ between the systems compared by Richard Fateman.
There are not too many ways of multiplying sparse polynomials without using u*v multiplies and about u*v -size(answer) adds. For dense polynomials in a finite field, FFT can be used. My guess is that an FFT approach would not be faster on such a small ( :) ) problem as this one. But a careful implementation might win. In order to represent the inputs and the outputs, the FFT would probably have to be done quite a few times since the bignum coefficients would need to be built up by Chinese Remainder. In particular, I am amazed to see the difference between
Maple, Mathematica and MuPAD being so tiny. This contradicts my own experience with handling polynomials in these three systems. The pattern t(MuPAD) > t(Maple) > t(Mma) looks reasonable, but the differnece is usually *much* bigger.
Geez, and we thought polynomial expansion was trivial...
Incidentally, you should be able to do w^2 much faster than w*w. RJF
Regards -richy.
PS: Richard: 1234567890 < (2^31)-1, on page 1, third paragraph.
Thanks. I changed it. That number is large enough so that in my Lisp it is not a single-word "fixnum" .. some bits are used for type tags. I don't know anything about GiNaC internals, but such design decisions involving a unified treatment of arbitrary size integers can affect time, programmer convenience, data size etc. in a very complicated way. When this problem works on GiNaC I'd like to know its speed! RJF