Re: Possible memory leak? (was: update on fast mult benchmark (fwd))

24 Mar 2002

      Richard B. Kreckel wrote:
...
Hi,
On Sat, 23 Mar 2002, Pearu Peterson wrote:
...
I have tried to run Richard Fateman's multiplication benchmark also with
GiNaC (1.0.2,1.0.6)
<<<snip>>>
...
The culprit is mul::expand(), where two expanded polynomials (class
add), one with n1 terms and another with n2 terms are first written out by
brute force in an n1*n2 vector.  This is then returned through an add
constructor which sorts it (calling expairseq::canonize()) and then
compacitifies it (calling expairseq::combine_same_terms_sorted_seq()).
This seems to me to be a design that should be revisited.  For
the univariate case it is particularly bad since multiplying polynomials
of degree u and v gives you only degree u+v,  and you will have
allocated (u+1)*(v+1) cells.
...
For two large input polynomials this is clearly bogus, because it does not
allow for early cancellations.  It was done this way because the compexity
of sorting is asymptotically ideal (n*log(n) comparisons).  This seems
to have been a short-sighted decision.
The sorting cost is going to be dominated by the coefficient operations
  most likely, even if it is n^2.  Note that Maple doesn't bother to
sort at all.  I think that if you test the others against sorting of
Maple's results, Maple will look far worse.
...
I can think of a gazillion ways how to repair this and will try to provide
a fix really soon now.  It needs some careful balancing, however.  
(Incidentally, I already had some trial-code in my private tree which
performs much better.)
To me it appears that this test per se does not tell us very much about
representation.  Besides questions of representation, the actual
implementation of expand() might differ between the systems compared by
Richard Fateman.
There are not too many ways of multiplying sparse polynomials without
using u*v multiplies and  about  u*v -size(answer) adds.   For dense
polynomials in a finite field, FFT can be used.  My guess is that
an FFT approach would not be faster on such a small ( :) ) problem
as this one.  But a careful implementation might win.  In order
to represent the inputs and the outputs, the FFT would probably
have to be done quite a few times since the bignum coefficients
would need to be built up by Chinese Remainder.

 In particular, I am amazed to see the difference between
...
Maple, Mathematica and MuPAD being so tiny.  This contradicts my own
experience with handling polynomials in these three systems.  The pattern
t(MuPAD) > t(Maple) > t(Mma) looks reasonable, but the differnece is
usually *much* bigger.
Geez, and we thought polynomial expansion was trivial...
Incidentally, you should be able to do  w^2 much faster than w*w.

RJF
...
Regards
     -richy.
PS: Richard: 1234567890 < (2^31)-1, on page 1, third paragraph.
Thanks. I changed it.   That number is large enough so that in my
Lisp it is not a single-word "fixnum" .. some bits are used for
type tags. I don't know anything about GiNaC internals, but
  such design decisions involving a unified treatment of arbitrary
size integers can affect time, programmer convenience, data size
etc. in a very complicated way.

When this problem works on GiNaC I'd like to know its speed!
RJF