On Wed, Mar 21, 2007 at 11:25:21AM +0100, Stefan Weinzierl wrote:
I think you miss the point. A typical situation were the generation of C code is useful, is the following: You use GiNaC to compute symbolically a rather large expression which depends on a handful symbols. You then want to evaluate this expression numerically a large number of times for different values of the variables/symbols ( a typical example is Monte Carlo integration).
There are two large quantities in the game: The size of the expression and the number of evaluations.
Yes, if one about to calculate the expression ~ 10^N times, compiling it might give considerable speedup.
I believe an efficient way to do this is to write the expression as C code into a file, compile and reload the compiled function into memory.
There is certainly an overhead involved for the time needed to generate the C code and the compilation, but this occurs only once and is usually outweighted by the speed improvement of a single evaluation.
That depends.
To give an example I took an expression,
If it is not top secret, could you please post it here? Otherwise it is difficult to draw any conclusions. It is easy to construct an expression such that optimizer 1) won't changed them at all; 2) will die due to exhausted memory (or my patience, or both); 3) will make them even worse.
which I want to evaluate 10^6 times. A single evaluation with evalf() takes 0.55 s.
Well, things are not that simple. One can do partial evaluation with GiNaC (which is difficult in plain C/C++). For example, simple-minded code would do double x[N]; double y[N]; ex e = // some huge expression depending on x and y. double ed[N*N]; for (size_t i=0; i<=N-1; i++) for(size_t j=0; j<=N-1; j++) ed[i+j*(N-1)] = e.subs(lst(x==x[i], y==y[j])).evalf().to_double(); Smarter code[r] would do for (size_t i=0; i<=N-1; i++) { ex tmp = e.subs(x==x[i]).evalf(); for (size_t j=0; j<=N-1; j++) ed[i+j*(N-1)] = tmp.subs(y==y[j]).evalf().to_double(); } Such a technique is a viable alternative to dumping expression as a C code and compiling it. I admit it is almost useless when evaluation points are random (as in Monte Carlo).
If I use the plain print_csrc routine build into GiNaC, the code is generated in less than a second, compilation takes 74s, but the time for a single evaluation improves to 3.8 10^-3 s.
If I take the routine for the generation of optimized code, code generation takes 49s (yes, code optimization has a price, but you pay only once),
And the price grows faster-then-exponentially with the size of the expression...
compilation takes 7 s, and the time for a single evaluation is now 3.2 10^-4 s.
So the total cost for one million evaluations is evalf() : 550000 s standard csrc: 3875 s optimized: 376 s
Therefore I don't understand why you propose evalf(), the difference in cost are orders of magnitude.
Because the problem was (is?) not stated accurately enough. Anyway, I do agree that "optimizer" improves _some_ expressions.
A few more comments: I agree that GiNaC is not a compiler, but a nice feature is that it is based on C++ and should therefore allow the combination of symbolic and numerical code.
Yes, it allows that. One can use any C/C++ (and with some labour F*****N) libraries _without_ dumping expressions into C/C++/whatever code.
Generating optimized C code is a feature I miss in GiNaC.
To some extent GiNaC was desinged to _avoid_ code generation: "This border line between analytical and numerical methods, however, has quite often remained a painful obstacle in the successful application of computer algebra systems (CASs). Usually, some results are obtained with the help of a CAS and these results are integrated later into some other program. This is not only restricted to numerical results, where one frequently translates the output of some CAS to C or even FORTRAN. It is questionable whether the distinction between one language for writing down symbolical algebra, one for obtaining numerical results and maybe a third one for integrating everything in a graphical user interface has any reason other than an historical one." [cs.sc/0004015] However, GiNaC does support code generation, and even got that evil GiNaC::excompiler thing...
Finally, one should make the attempt to generate "the" optimal code, anything which performs better than the standard csrc output is already an improvment.
It performs better for _particular kind of expressions you happen to work with_. I doubt it will improve something in general case. That was point of my previous post (mostly, I don't like the particular implementation you've suggested, but that's another story). That said, I've got a similar "optimizer" too. :) Basically, it collect()s certain type of (sub)expressions and runs normal() on coefficients. (BTW your code would look simpler if you use collect()). It works very good for some kinds of expressions, but in general case it just sucks horribly. Best regards, Alexei -- All science is either physics or stamp collecting.