Re: C++ code generation: any way of deflating large expressions?
Hello,

I have also worked on such a program: I have to convert huge expressions into optimized C++ code (to automate the design of finite element code). I tried to make something that never computes the same thing twice. There is an algorithm that uses as few temporary variables as possible (in order to help the compiler optimize). It also looks for factorizations with * and +, for example

    A = x*y; B = sin(x*y*z);   ->   B = sin(A*z);

and it takes the cost of operations into account, for example

    A = x/y; B = z/y;

gives

    double tmp0 = 1/y; A = x*tmp0; B = z*tmp0;

It can generate code for huge expressions in a few seconds. It is a bit crappy because it does not use the internal representation given by GiNaC. But if someone wants to clean it up... For now it is usable enough for me.

A simple example:

    Optimized_ex_output mm("float");
    symbol x("x"), y("y"), z("z"), u("u"), v("v");
    mm.add("A[1]", x/u*v, true);
    mm.add("B", (x+y)/u*v + cos((x+y+z)*y), false);
    mm.write(std::cout, 2); // 2 spaces before each line
    mm.display_tree();      // graph output

produces:

    float reg0 = v;
    float reg1 = z;
    float reg2 = u;
    float reg3 = x;
    float reg4 = y;
    float reg5 = reg3;
    reg3 += reg4;
    reg2 = 1 / reg2;
    float reg6 = reg2;
    reg2 *= reg3;
    reg2 *= reg0;
    reg0 *= reg6;
    reg3 += reg1;
    reg4 *= reg3;
    reg0 *= reg5;
    A[1] += reg0;
    reg4 = cos( reg4 );
    reg4 += reg2;
    float B = reg4;

(and a nice dot graph)

In the near future, I plan to allow generation of vector code that performs the same calculation 4 times on different data. I also plan to generate code for GPUs, but not in the near future.

Hugo LECLERC.
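To make the two ideas above concrete (never compute the same thing twice, and turn repeated divisions by the same value into one reciprocal plus multiplications), here is a minimal sketch. It is not Hugo's Optimized_ex_output and it does not use GiNaC: the Dag type and all of its member names are hypothetical, invented for this illustration. It only shares subexpressions that are built identically, so x*y is reused inside x*y*z only when the product is built as (x*y)*z; it does not search for factorizations or try to minimize the number of temporaries.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical mini expression DAG with hash-consing (structural sharing).
struct Dag {
    // op: 'v' = variable leaf, '+', '*', 'r' = reciprocal of lhs.
    struct Node { char op; int lhs; int rhs; std::string name; };
    std::vector<Node> nodes;
    std::map<std::tuple<char, int, int, std::string>, int> seen;

    // Return the id of an existing identical node, or create a new one.
    int intern(char op, int lhs, int rhs, const std::string& name = "") {
        // Commutative operators get a canonical operand order, so that
        // x*y and y*x map to the same node.
        if ((op == '+' || op == '*') && lhs > rhs) std::swap(lhs, rhs);
        auto key = std::make_tuple(op, lhs, rhs, name);
        auto it = seen.find(key);
        if (it != seen.end()) return it->second;  // already known: reuse it
        nodes.push_back({op, lhs, rhs, name});
        int id = static_cast<int>(nodes.size()) - 1;
        seen[key] = id;
        return id;
    }

    int var(const std::string& n) { return intern('v', -1, -1, n); }
    int add(int a, int b) { return intern('+', a, b); }
    int mul(int a, int b) { return intern('*', a, b); }
    // x / y is rewritten as x * (1 / y) so that the reciprocal is shared.
    int div(int a, int b) { return mul(a, intern('r', b, -1)); }

    // Name used to refer to a node in the generated code.
    std::string ref(int id) const {
        assert(id >= 0 && id < static_cast<int>(nodes.size()));
        return nodes[id].op == 'v' ? nodes[id].name
                                   : "tmp" + std::to_string(id);
    }

    // Emit one statement per non-leaf node. Creation order is a valid
    // evaluation order because operands are always created first.
    std::string emit(const std::string& type) const {
        std::ostringstream out;
        for (int i = 0; i < static_cast<int>(nodes.size()); ++i) {
            const Node& n = nodes[i];
            if (n.op == 'v') continue;
            out << type << " tmp" << i << " = ";
            if (n.op == 'r') out << "1 / " << ref(n.lhs);
            else out << ref(n.lhs) << ' ' << n.op << ' ' << ref(n.rhs);
            out << ";\n";
        }
        return out.str();
    }
};
```

Building x/y and z/y with d.div(x, y) and d.div(z, y) and then calling d.emit("double") yields a single "1 / y" statement followed by two multiplications, which is the shape of the tmp0 example above. Note that x*(1/y) is not always bit-identical to x/y in floating point, so this rewrite trades a small rounding difference for speed.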
Oops... There was a mistake in the files I attached (in the scheduling part, in order to test the effectiveness...). I am sending a corrected version.

Hugo LECLERC
Dear Hugo,

On Thu, 10 Jul 2003 hugo_lec@club-internet.fr wrote:
> [...]
> There's an algorithm to use as few temporary variables as possible
> (in order to help the compiler optimize).
> [...]

Could you please elaborate a little bit on that statement? As it stands I find it slightly irritating. Is this assessment based on real experience? Given a compiler that transforms its parsed tree into SSA form at some stage (as GCC 3.5 will), tons of temporaries are generated in the IR anyway, besides those phi-expressions. More or less by definition, later optimizations must cope with and reduce them. So: is there a point in trying to keep the number of temporaries low, or is that just fiction? Maybe an example of the effect you are alluding to would be helpful.

Cheers
-richy.
--
Richard B. Kreckel
<Richard.Kreckel@GiNaC.DE>
<http://www.ginac.de/~kreckel/>
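One way to put the question to a test is to write the same computation in both styles and compare what the compiler makes of them. The sketch below does this for B from the example in the first message. The function names are hypothetical, and the code makes no claim about which variant is faster: it only shows that the two forms perform the same operations on the same operands, which is what an SSA-based optimizer would be expected to recognize.

```cpp
#include <cassert>
#include <cmath>

// Single-assignment style: every intermediate value gets its own name,
// roughly what an SSA-based intermediate representation looks like.
float b_many_temporaries(float x, float y, float z, float u, float v) {
    assert(u != 0.0f);
    const float inv_u = 1 / u;
    const float s = x + y;
    const float t0 = inv_u * s;
    const float t1 = t0 * v;
    const float w = s + z;
    const float arg = y * w;
    const float c = std::cos(arg);
    return c + t1;
}

// Register-reuse style: the statements that contribute to B in the
// generated code of the first message, in the same order.
float b_few_temporaries(float x, float y, float z, float u, float v) {
    assert(u != 0.0f);
    float reg0 = v, reg1 = z, reg2 = u, reg3 = x, reg4 = y;
    reg3 += reg4;            // x + y
    reg2 = 1 / reg2;         // 1 / u
    reg2 *= reg3;            // (x + y) / u
    reg2 *= reg0;            // (x + y) / u * v
    reg3 += reg1;            // x + y + z
    reg4 *= reg3;            // y * (x + y + z)
    reg4 = std::cos(reg4);
    reg4 += reg2;
    return reg4;
}
```

Compiling both functions with optimization and reading the assembly (for instance with g++ -O2 -S) shows whether a given compiler treats them differently. The outcome depends on the compiler version and flags and is not measured here.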
participants (3): Hugo Leclerc, hugo_lec@club-internet.fr, Richard B. Kreckel