Term ordering and compiling C++ code

older
Re: [GiNaC-list] Term ordering and...

Doug

13 May 2010 13 May '10

4:11 p.m.

Hi, I'm running into a term ordering problem like the one mentioned in this post. http://permalink.gmane.org/gmane.comp.mathematics.ginac.general/1107 Relevant details about my application are below, for the curious. The short of it, though, is that the C++ code I am generating with ginac is so large (50MB) that the compiler can't handle it, and moreover, the values being represented by ginac symbols are array accesses that the compiler can't optimize away. So I need to create temporary values and do sub-expression elimination on the C++ code just so the compiler can have a chance. I have successfully done this using search-and-replace on the C++ code for other applications, but I guess I was just lucky that ginac printed terms in a consistent order. Is this a problem other ginac users have confronted? From the answer to the above post, it sounds like I need to write my own print function that can ensure expressions are printed in a predictable order. It looks like I can also inspect the expression tree and possibly do a proper job of common sub-expression elimination. But if there is an easier approach, I'd be very grateful for a pointer or two. Thanks, Doug Baker ------------------- Background Details ------------------- My application is related to visual odometry. I am computing the gradient and hessian for a function of 10 variables that involves converting quaternions to rotation matrices, applying those matrices to 3D points and projecting those points onto a 2D plane. I am generating code with ginac that is initially 50MB of C++ code. (And thank you for ginac being able to handle that monstrosity with such grace.) The input symbols are all array elements (e.g. symbol x("state[t+X]")) or function calls (e.g. symbol rTi11("rTi.getValue<1,1>()")). If I take the printed output from ginac as is, the C++ compiler can't properly optimize it because it doesn't know that state[t+X] is a constant or that getValue is constant. Moreover, the expressions have huge repeated expressions, and these bog the compiler down, too. I've been able to use a simple search-and-replace approach in other problems, and on this one that same approach has gotten me from 50MB to 6MB or so, but the term ordering issue is biting me now. For example, I have one sum (a+b+c+d) that occurs 232608 times. But there are 4! = 24 permutations of that sum that I'd have to replace. It's worse than that, though, because that's not the only expression like it. Some are bigger, like this: (-2.0*sK/sKSIJ+4.0*sK*sS2/pKSIJ2+-4.0*sI*sJ*sS/pKSIJ2). This has something like 72 permutations, not even counting the permutations of pKSIJ and pKSIJ2, which are themselves expressions like (a+b+c+d). And not only are there variations on this due to term ordering, similar expressions occur with different variables (e.g. with sJ instead of sK, etc.) So when all these expressions can be output in various orders, it's no longer feasible to generate all the permutations for search-and-replace. Support NPR 20 seconds at a time. www.twentysecondsatatime.org

Attachments:

attachment.html (text/html — 3.3 KB)

Show replies by date

jros

13 May 13 May

4:31 p.m.

Hello, Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC. If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory. So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion. Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results. I wonder what do the developers think about this. Cheers, Javier On Thu, 2010-05-13 at 07:11 -0700, Doug wrote:

...

Hi,

I'm running into a term ordering problem like the one mentioned in this post. http://permalink.gmane.org/gmane.comp.mathematics.ginac.general/1107

Relevant details about my application are below, for the curious. The short of it, though, is that the C++ code I am generating with ginac is so large (50MB) that the compiler can't handle it, and moreover, the values being represented by ginac symbols are array accesses that the compiler can't optimize away. So I need to create temporary values and do sub-expression elimination on the C++ code just so the compiler can have a chance. I have successfully done this using search-and-replace on the C++ code for other applications, but I guess I was just lucky that ginac printed terms in a consistent order.

Is this a problem other ginac users have confronted? From the answer to the above post, it sounds like I need to write my own print function that can ensure expressions are printed in a predictable order. It looks like I can also inspect the expression tree and possibly do a proper job of common sub-expression elimination. But if there is an easier approach, I'd be very grateful for a pointer or two.

Thanks, Doug Baker

------------------- Background Details -------------------

My application is related to visual odometry. I am computing the gradient and hessian for a function of 10 variables that involves converting quaternions to rotation matrices, applying those matrices to 3D points and projecting those points onto a 2D plane.

I am generating code with ginac that is initially 50MB of C++ code. (And thank you for ginac being able to handle that monstrosity with such grace.) The input symbols are all array elements (e.g. symbol x("state[t+X]")) or function calls (e.g. symbol rTi11("rTi.getValue<1,1>()")).

If I take the printed output from ginac as is, the C++ compiler can't properly optimize it because it doesn't know that state[t+X] is a constant or that getValue is constant. Moreover, the expressions have huge repeated expressions, and these bog the compiler down, too.

I've been able to use a simple search-and-replace approach in other problems, and on this one that same approach has gotten me from 50MB to 6MB or so, but the term ordering issue is biting me now. For example, I have one sum (a+b+c+d) that occurs 232608 times. But there are 4! = 24 permutations of that sum that I'd have to replace. It's worse than that, though, because that's not the only expression like it. Some are bigger, like this: (-2.0*sK/sKSIJ+4.0*sK*sS2/pKSIJ2 +-4.0*sI*sJ*sS/pKSIJ2). This has something like 72 permutations, not even counting the permutations of pKSIJ and pKSIJ2, which are themselves expressions like (a+b+c+d). And not only are there variations on this due to term ordering, similar expressions occur with different variables (e.g. with sJ instead of sK, etc.) So when all these expressions can be output in various orders, it's no longer feasible to generate all the permutations for search-and-replace.

Support NPR 20 seconds at a time. www.twentysecondsatatime.org

_______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

Doug

8:47 p.m.

Javier, All, I think what we need can be done with two print functions, as described below. It seems straightforward enough that I can implement this myself, but I wonder if it already exists somewhere? Is this something the developers would consider adding to Ginac? Thanks, Doug --------------------- Perhaps this is a clean solution. One could write a print function that outputs C++ code that explicitly names and declares sub-expressions. You also have a companion print function that just emits the name of the top-level expression. Call them print_declarations() and print_variable_name(). For example, suppose you have an expression like this: ex myex = a + b*pow(c+d+e,2.0) + f/(g*h + i*j); For simplicity's sake, let each symbol just be a variable (e.g. symbol a("my_a")), but of course they can be anything. print_declarations() can emit these declarations below during a post-order traversal of the expression tree. // Symbols double a = my_a; double b = my_b; [...] double j = my_j; // Expressions double e1 = a; double e2 = b; double e3 = c; double e4 = d; double e5 = e; double e6 = e3 + e4 + e5; double e7 = pow(e6, 2.0); double e8 = e2 * e7; double e9 = f; double e10 = g; double e11 = h; double e12 = e10*e11; double e13 = i; double e14 = j; double e15 = e13*e14; double e16 = e12 + e15; double e17 = e9 / e16; double e18 = e1 + e8 + e17; print_variable_name() can then just emit e18 for the example expression, to be used however you would normally use C++ code that you print. Then, assuming ginac is smart about not having separate expression tree nodes for duplicated sub-expressions, you will declare each sub-expression just once. The order ginac uses for expressions does not matter at all. Support NPR 20 seconds at a time. www.twentysecondsatatime.org --- On Thu, 5/13/10, jros <jros@unavarra.es> wrote: From: jros <jros@unavarra.es> Subject: Re: [GiNaC-list] Term ordering and compiling C++ code To: "GiNaC discussion list" <ginac-list@ginac.de> Date: Thursday, May 13, 2010, 10:31 AM Hello, Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC. If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory. So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion. Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results. I wonder what do the developers think about this. Cheers, Javier On Thu, 2010-05-13 at 07:11 -0700, Doug wrote: Hi, I'm running into a term ordering problem like the one mentioned in this post. http://permalink.gmane.org/gmane.comp.mathematics.ginac.general/1107 Relevant details about my application are below, for the curious. The short of it, though, is that the C++ code I am generating with ginac is so large (50MB) that the compiler can't handle it, and moreover, the values being represented by ginac symbols are array accesses that the compiler can't optimize away. So I need to create temporary values and do sub-expression elimination on the C++ code just so the compiler can have a chance. I have successfully done this using search-and-replace on the C++ code for other applications, but I guess I was just lucky that ginac printed terms in a consistent order. Is this a problem other ginac users have confronted? From the answer to the above post, it sounds like I need to write my own print function that can ensure expressions are printed in a predictable order. It looks like I can also inspect the expression tree and possibly do a proper job of common sub-expression elimination. But if there is an easier approach, I'd be very grateful for a pointer or two. Thanks, Doug Baker ------------------- Background Details ------------------- My application is related to visual odometry. I am computing the gradient and hessian for a function of 10 variables that involves converting quaternions to rotation matrices, applying those matrices to 3D points and projecting those points onto a 2D plane. I am generating code with ginac that is initially 50MB of C++ code. (And thank you for ginac being able to handle that monstrosity with such grace.) The input symbols are all array elements (e.g. symbol x("state[t+X]")) or function calls (e.g. symbol rTi11("rTi.getValue<1,1>()")). If I take the printed output from ginac as is, the C++ compiler can't properly optimize it because it doesn't know that state[t+X] is a constant or that getValue is constant. Moreover, the expressions have huge repeated expressions, and these bog the compiler down, too. I've been able to use a simple search-and-replace approach in other problems, and on this one that same approach has gotten me from 50MB to 6MB or so, but the term ordering issue is biting me now. For example, I have one sum (a+b+c+d) that occurs 232608 times. But there are 4! = 24 permutations of that sum that I'd have to replace. It's worse than that, though, because that's not the only expression like it. Some are bigger, like this: (-2.0*sK/sKSIJ+4.0*sK*sS2/pKSIJ2+-4.0*sI*sJ*sS/pKSIJ2). This has something like 72 permutations, not even counting the permutations of pKSIJ and pKSIJ2, which are themselves expressions like (a+b+c+d). And not only are there variations on this due to term ordering, similar expressions occur with different variables (e.g. with sJ instead of sK, etc.) So when all these expressions can be output in various orders, it's no longer feasible to generate all the permutations for search-and-replace. Support NPR 20 seconds at a time. www.twentysecondsatatime.org _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list -----Inline Attachment Follows----- _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

jros

14 May 14 May

1:13 a.m.

Yes, thats the idea. I suppose that can be implemented. An obvious simplification, is to avoid declarations like double e1 = a; for symbols, as declaration double a = my_a; makes the symbol already available. I would consider such a functionality really important for GiNaC. Lots of people is using GiNaC, to output code (thats my case, matlab/octave/ and c), and this would produce highly optimal code in terms of space and speed as lots of computations can be avoided. It would be nice if some power developer could say something about the proper way to go, and impressions about possibilities to make it a part of GiNaC. Thank you, Javier On Thu, 2010-05-13 at 11:47 -0700, Doug wrote:

...

Javier, All,

I think what we need can be done with two print functions, as described below. It seems straightforward enough that I can implement this myself, but I wonder if it already exists somewhere? Is this something the developers would consider adding to Ginac?

Thanks, Doug

---------------------

Perhaps this is a clean solution. One could write a print function that outputs C++ code that explicitly names and declares sub-expressions. You also have a companion print function that just emits the name of the top-level expression. Call them print_declarations() and print_variable_name().

For example, suppose you have an expression like this: ex myex = a + b*pow(c+d+e,2.0) + f/(g*h + i*j); For simplicity's sake, let each symbol just be a variable (e.g. symbol a("my_a")), but of course they can be anything. print_declarations() can emit these declarations below during a post-order traversal of the expression tree.

// Symbols double a = my_a; double b = my_b; [...] double j = my_j; // Expressions double e1 = a; double e2 = b; double e3 = c; double e4 = d; double e5 = e; double e6 = e3 + e4 + e5; double e7 = pow(e6, 2.0); double e8 = e2 * e7; double e9 = f; double e10 = g; double e11 = h; double e12 = e10*e11; double e13 = i; double e14 = j; double e15 = e13*e14; double e16 = e12 + e15; double e17 = e9 / e16; double e18 = e1 + e8 + e17;

print_variable_name() can then just emit e18 for the example expression, to be used however you would normally use C++ code that you print.

Then, assuming ginac is smart about not having separate expression tree nodes for duplicated sub-expressions, you will declare each sub-expression just once. The order ginac uses for expressions does not matter at all.

Support NPR 20 seconds at a time. www.twentysecondsatatime.org

--- On Thu, 5/13/10, jros <jros@unavarra.es> wrote:

From: jros <jros@unavarra.es> Subject: Re: [GiNaC-list] Term ordering and compiling C++ code To: "GiNaC discussion list" <ginac-list@ginac.de> Date: Thursday, May 13, 2010, 10:31 AM

Hello,

Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC.

If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory.

So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion.

Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

I wonder what do the developers think about this.

Cheers,

Javier

On Thu, 2010-05-13 at 07:11 -0700, Doug wrote:

> Hi, > > I'm running into a term ordering problem like the one > mentioned in this post. > http://permalink.gmane.org/gmane.comp.mathematics.ginac.general/1107 > > Relevant details about my application are below, for the > curious. The short of it, though, is that the C++ code I am > generating with ginac is so large (50MB) that the compiler > can't handle it, and moreover, the values being represented > by ginac symbols are array accesses that the compiler can't > optimize away. So I need to create temporary values and do > sub-expression elimination on the C++ code just so the > compiler can have a chance. I have successfully done this > using search-and-replace on the C++ code for other > applications, but I guess I was just lucky that ginac > printed terms in a consistent order. > > Is this a problem other ginac users have confronted? From > the answer to the above post, it sounds like I need to write > my own print function that can ensure expressions are > printed in a predictable order. It looks like I can also > inspect the expression tree and possibly do a proper job of > common sub-expression elimination. But if there is an > easier approach, I'd be very grateful for a pointer or two. > > Thanks, > Doug Baker > > ------------------- > Background Details > ------------------- > > My application is related to visual odometry. I am > computing the gradient and hessian for a function of 10 > variables that involves converting quaternions to rotation > matrices, applying those matrices to 3D points and > projecting those points onto a 2D plane. > > I am generating code with ginac that is initially 50MB of C > ++ code. (And thank you for ginac being able to handle that > monstrosity with such grace.) The input symbols are all > array elements (e.g. symbol x("state[t+X]")) or function > calls (e.g. symbol rTi11("rTi.getValue<1,1>()")). > > If I take the printed output from ginac as is, the C++ > compiler can't properly optimize it because it doesn't know > that state[t+X] is a constant or that getValue is constant. > Moreover, the expressions have huge repeated expressions, > and these bog the compiler down, too. > > I've been able to use a simple search-and-replace approach > in other problems, and on this one that same approach has > gotten me from 50MB to 6MB or so, but the term ordering > issue is biting me now. For example, I have one sum (a+b+c > +d) that occurs 232608 times. But there are 4! = 24 > permutations of that sum that I'd have to replace. It's > worse than that, though, because that's not the only > expression like it. Some are bigger, like this: > (-2.0*sK/sKSIJ+4.0*sK*sS2/pKSIJ2+-4.0*sI*sJ*sS/pKSIJ2). This > has something like 72 permutations, not even counting the > permutations of pKSIJ and pKSIJ2, which are themselves > expressions like (a+b+c+d). And not only are there > variations on this due to term ordering, similar expressions > occur with different variables (e.g. with sJ instead of sK, > etc.) So when all these expressions can be output in > various orders, it's no longer feasible to generate all the > permutations for search-and-replace. > > > Support NPR 20 seconds at a time. > www.twentysecondsatatime.org > > > _______________________________________________ > GiNaC-list mailing list > GiNaC-list@ginac.de > https://www.cebix.net/mailman/listinfo/ginac-list

-----Inline Attachment Follows-----

_______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

Doug

2:25 a.m.

Javier, The output code might be a tad simpler, but the code that generates the code might be less simple, because now you need a special case for the leaves/symbols. Compilers are really good at detecting a = my_a; and e1 = a;, so that wouldn't worry me at all. In my case, the output code is already so complex that it doesn't much matter. I agree that a feature like this would make Ginac more attractive for other people in my company who use similar symbolic code for computing jacobians and hessians. For the record, the code Ginac prints gets evaluated at several hertz, like 10 or 20 Hz, so speed is critical. Support NPR 20 seconds at a time. www.twentysecondsatatime.org --- On Thu, 5/13/10, jros <jros@unavarra.es> wrote: From: jros <jros@unavarra.es> Subject: Re: [GiNaC-list] Term ordering and compiling C++ code To: "GiNaC discussion list" <ginac-list@ginac.de> Date: Thursday, May 13, 2010, 7:13 PM Yes, thats the idea. I suppose that can be implemented. An obvious simplification, is to avoid declarations like double e1 = a; for symbols, as declaration double a = my_a; makes the symbol already available. I would consider such a functionality really important for GiNaC. Lots of people is using GiNaC, to output code (thats my case, matlab/octave/ and c), and this would produce highly optimal code in terms of space and speed as lots of computations can be avoided. It would be nice if some power developer could say something about the proper way to go, and impressions about possibilities to make it a part of GiNaC. Thank you, Javier On Thu, 2010-05-13 at 11:47 -0700, Doug wrote: Javier, All, I think what we need can be done with two print functions, as described below. It seems straightforward enough that I can implement this myself, but I wonder if it already exists somewhere? Is this something the developers would consider adding to Ginac? Thanks, Doug --------------------- Perhaps this is a clean solution. One could write a print function that outputs C++ code that explicitly names and declares sub-expressions. You also have a companion print function that just emits the name of the top-level expression. Call them print_declarations() and print_variable_name(). For example, suppose you have an expression like this: ex myex = a + b*pow(c+d+e,2.0) + f/(g*h + i*j); For simplicity's sake, let each symbol just be a variable (e.g. symbol a("my_a")), but of course they can be anything. print_declarations() can emit these declarations below during a post-order traversal of the expression tree. // Symbols double a = my_a; double b = my_b; [...] double j = my_j; // Expressions double e1 = a; double e2 = b; double e3 = c; double e4 = d; double e5 = e; double e6 = e3 + e4 + e5; double e7 = pow(e6, 2.0); double e8 = e2 * e7; double e9 = f; double e10 = g; double e11 = h; double e12 = e10*e11; double e13 = i; double e14 = j; double e15 = e13*e14; double e16 = e12 + e15; double e17 = e9 / e16; double e18 = e1 + e8 + e17; print_variable_name() can then just emit e18 for the example expression, to be used however you would normally use C++ code that you print. Then, assuming ginac is smart about not having separate expression tree nodes for duplicated sub-expressions, you will declare each sub-expression just once. The order ginac uses for expressions does not matter at all. Support NPR 20 seconds at a time. www.twentysecondsatatime.org --- On Thu, 5/13/10, jros <jros@unavarra.es> wrote: From: jros <jros@unavarra.es> Subject: Re: [GiNaC-list] Term ordering and compiling C++ code To: "GiNaC discussion list" <ginac-list@ginac.de> Date: Thursday, May 13, 2010, 10:31 AM Hello, Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC. If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory. So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion. Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results. I wonder what do the developers think about this. Cheers, Javier On Thu, 2010-05-13 at 07:11 -0700, Doug wrote: Hi, I'm running into a term ordering problem like the one mentioned in this post. http://permalink.gmane.org/gmane.comp.mathematics.ginac.general/1107 Relevant details about my application are below, for the curious. The short of it, though, is that the C++ code I am generating with ginac is so large (50MB) that the compiler can't handle it, and moreover, the values being represented by ginac symbols are array accesses that the compiler can't optimize away. So I need to create temporary values and do sub-expression elimination on the C++ code just so the compiler can have a chance. I have successfully done this using search-and-replace on the C++ code for other applications, but I guess I was just lucky that ginac printed terms in a consistent order. Is this a problem other ginac users have confronted? From the answer to the above post, it sounds like I need to write my own print function that can ensure expressions are printed in a predictable order. It looks like I can also inspect the expression tree and possibly do a proper job of common sub-expression elimination. But if there is an easier approach, I'd be very grateful for a pointer or two. Thanks, Doug Baker ------------------- Background Details ------------------- My application is related to visual odometry. I am computing the gradient and hessian for a function of 10 variables that involves converting quaternions to rotation matrices, applying those matrices to 3D points and projecting those points onto a 2D plane. I am generating code with ginac that is initially 50MB of C++ code. (And thank you for ginac being able to handle that monstrosity with such grace.) The input symbols are all array elements (e.g. symbol x("state[t+X]")) or function calls (e.g. symbol rTi11("rTi.getValue<1,1>()")). If I take the printed output from ginac as is, the C++ compiler can't properly optimize it because it doesn't know that state[t+X] is a constant or that getValue is constant. Moreover, the expressions have huge repeated expressions, and these bog the compiler down, too. I've been able to use a simple search-and-replace approach in other problems, and on this one that same approach has gotten me from 50MB to 6MB or so, but the term ordering issue is biting me now. For example, I have one sum (a+b+c+d) that occurs 232608 times. But there are 4! = 24 permutations of that sum that I'd have to replace. It's worse than that, though, because that's not the only expression like it. Some are bigger, like this: (-2.0*sK/sKSIJ+4.0*sK*sS2/pKSIJ2+-4.0*sI*sJ*sS/pKSIJ2). This has something like 72 permutations, not even counting the permutations of pKSIJ and pKSIJ2, which are themselves expressions like (a+b+c+d). And not only are there variations on this due to term ordering, similar expressions occur with different variables (e.g. with sJ instead of sK, etc.) So when all these expressions can be output in various orders, it's no longer feasible to generate all the permutations for search-and-replace. Support NPR 20 seconds at a time. www.twentysecondsatatime.org _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list -----Inline Attachment Follows----- _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list -----Inline Attachment Follows----- _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

jros

3:27 a.m.

I see we have similar problems. I'm using GiNaC to do Multibody simulations, with some applications in realtime control. May you are interested on my work related to GiNaC http://www.imac.unavarra.es/3d_mec/pages/lib3d_mec-ginac.php Reference: J.Ros et al. LIB3D MEC–GINAC, A LIBRARY FOR SYMBOLIC MULTIBODY DYNAMICS. MULTIBODY DYNAMICS 2007, ECCOMAS Thematic Conference. Milano, Italy. 2007. The implementation of what I call atomization of expresions, is not optimal at all, and uses a completely different approach to the one commented here. (Beware that the version in the we is not the last one) Javier On Thu, 2010-05-13 at 17:25 -0700, Doug wrote:

...

Javier,

The output code might be a tad simpler, but the code that generates the code might be less simple, because now you need a special case for the leaves/symbols. Compilers are really good at detecting a = my_a; and e1 = a;, so that wouldn't worry me at all.

In my case, the output code is already so complex that it doesn't much matter.

I agree that a feature like this would make Ginac more attractive for other people in my company who use similar symbolic code for computing jacobians and hessians. For the record, the code Ginac prints gets evaluated at several hertz, like 10 or 20 Hz, so speed is critical.

Support NPR 20 seconds at a time. www.twentysecondsatatime.org

--- On Thu, 5/13/10, jros <jros@unavarra.es> wrote:

From: jros <jros@unavarra.es> Subject: Re: [GiNaC-list] Term ordering and compiling C++ code To: "GiNaC discussion list" <ginac-list@ginac.de> Date: Thursday, May 13, 2010, 7:13 PM

Yes, thats the idea. I suppose that can be implemented. An obvious simplification, is to avoid declarations like

double e1 = a;

for symbols, as declaration

double a = my_a;

makes the symbol already available.

I would consider such a functionality really important for GiNaC. Lots of people is using GiNaC, to output code (thats my case, matlab/octave/ and c), and this would produce highly optimal code in terms of space and speed as lots of computations can be avoided.

It would be nice if some power developer could say something about the proper way to go, and impressions about possibilities to make it a part of GiNaC.

Thank you,

Javier

On Thu, 2010-05-13 at 11:47 -0700, Doug wrote:

> Javier, All, > > I think what we need can be done with two print functions, > as described below. It seems straightforward enough that I > can implement this myself, but I wonder if it already exists > somewhere? Is this something the developers would consider > adding to Ginac? > > Thanks, > Doug > > --------------------- > > Perhaps this is a clean solution. One could write a print > function that outputs C++ code that explicitly names and > declares sub-expressions. You also have a companion print > function that just emits the name of the top-level > expression. Call them print_declarations() and > print_variable_name(). > > For example, suppose you have an expression like this: > ex myex = a + b*pow(c+d+e,2.0) + f/(g*h + i*j); > For simplicity's sake, let each symbol just be a variable > (e.g. symbol a("my_a")), but of course they can be anything. > print_declarations() can emit these declarations below > during a post-order traversal of the expression tree. > > // Symbols > double a = my_a; > double b = my_b; > [...] > double j = my_j; > // Expressions > double e1 = a; > double e2 = b; > double e3 = c; > double e4 = d; > double e5 = e; > double e6 = e3 + e4 + e5; > double e7 = pow(e6, 2.0); > double e8 = e2 * e7; > double e9 = f; > double e10 = g; > double e11 = h; > double e12 = e10*e11; > double e13 = i; > double e14 = j; > double e15 = e13*e14; > double e16 = e12 + e15; > double e17 = e9 / e16; > double e18 = e1 + e8 + e17; > > print_variable_name() can then just emit e18 for the example > expression, to be used however you would normally use C++ > code that you print. > > Then, assuming ginac is smart about not having separate > expression tree nodes for duplicated sub-expressions, you > will declare each sub-expression just once. The order ginac > uses for expressions does not matter at all. > > Support NPR 20 seconds at a time. > www.twentysecondsatatime.org > > --- On Thu, 5/13/10, jros <jros@unavarra.es> wrote: > > > From: jros <jros@unavarra.es> > Subject: Re: [GiNaC-list] Term ordering and > compiling C++ code > To: "GiNaC discussion list" <ginac-list@ginac.de> > Date: Thursday, May 13, 2010, 10:31 AM > > Hello, > > Although I don't have a solution for your problem, > as I'm myself addressing similar problems > matching common subexpressions to variables, in down > top manner, I think that such a functionality > is implicitly implemented in GiNaC. > > If I understand GiNaC internal structure correctly, > subexpressions common to two expresions, > are frequently shared internally, to save memory. > > So it must be possible to write a print algorithm > that goes trough an/some expression/s tree, and that > replaces > every shared subexpression (let say sum product) > with a variable, that again is assigned a expression > that would be printed > in the same way using recursion. > > Probably allowing/disallowing some kind automatic > simplifications (so that subexpression sharing > expected value increases) can probably help to > obtain improved results. > > I wonder what do the developers think about this. > > Cheers, > > Javier > > > On Thu, 2010-05-13 at 07:11 -0700, Doug wrote: > > > Hi, > > > > I'm running into a term ordering problem like the > > one mentioned in this post. > > http://permalink.gmane.org/gmane.comp.mathematics.ginac.general/1107 > > > > Relevant details about my application are below, > > for the curious. The short of it, though, is that > > the C++ code I am generating with ginac is so > > large (50MB) that the compiler can't handle it, > > and moreover, the values being represented by > > ginac symbols are array accesses that the compiler > > can't optimize away. So I need to create > > temporary values and do sub-expression elimination > > on the C++ code just so the compiler can have a > > chance. I have successfully done this using > > search-and-replace on the C++ code for other > > applications, but I guess I was just lucky that > > ginac printed terms in a consistent order. > > > > Is this a problem other ginac users have > > confronted? From the answer to the above post, it > > sounds like I need to write my own print function > > that can ensure expressions are printed in a > > predictable order. It looks like I can also > > inspect the expression tree and possibly do a > > proper job of common sub-expression elimination. > > But if there is an easier approach, I'd be very > > grateful for a pointer or two. > > > > Thanks, > > Doug Baker > > > > ------------------- > > Background Details > > ------------------- > > > > My application is related to visual odometry. I > > am computing the gradient and hessian for a > > function of 10 variables that involves converting > > quaternions to rotation matrices, applying those > > matrices to 3D points and projecting those points > > onto a 2D plane. > > > > I am generating code with ginac that is initially > > 50MB of C++ code. (And thank you for ginac being > > able to handle that monstrosity with such grace.) > > The input symbols are all array elements (e.g. > > symbol x("state[t+X]")) or function calls (e.g. > > symbol rTi11("rTi.getValue<1,1>()")). > > > > If I take the printed output from ginac as is, the > > C++ compiler can't properly optimize it because it > > doesn't know that state[t+X] is a constant or that > > getValue is constant. Moreover, the expressions > > have huge repeated expressions, and these bog the > > compiler down, too. > > > > I've been able to use a simple search-and-replace > > approach in other problems, and on this one that > > same approach has gotten me from 50MB to 6MB or > > so, but the term ordering issue is biting me now. > > For example, I have one sum (a+b+c+d) that occurs > > 232608 times. But there are 4! = 24 permutations > > of that sum that I'd have to replace. It's worse > > than that, though, because that's not the only > > expression like it. Some are bigger, like this: > > (-2.0*sK/sKSIJ+4.0*sK*sS2/pKSIJ2 > > +-4.0*sI*sJ*sS/pKSIJ2). This has something like 72 > > permutations, not even counting the permutations > > of pKSIJ and pKSIJ2, which are themselves > > expressions like (a+b+c+d). And not only are > > there variations on this due to term ordering, > > similar expressions occur with different variables > > (e.g. with sJ instead of sK, etc.) So when all > > these expressions can be output in various orders, > > it's no longer feasible to generate all the > > permutations for search-and-replace. > > > > > > Support NPR 20 seconds at a time. > > www.twentysecondsatatime.org > > > > > > _______________________________________________ > > GiNaC-list mailing list > > GiNaC-list@ginac.de > > https://www.cebix.net/mailman/listinfo/ginac-list > > > -----Inline Attachment Follows----- > > _______________________________________________ > GiNaC-list mailing list > GiNaC-list@ginac.de > https://www.cebix.net/mailman/listinfo/ginac-list > >

-----Inline Attachment Follows-----

_______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

_______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

Martin Ettl

19 May 19 May

5:37 p.m.

New subject: Ginac and OpenMP

Hello, i try to do some symbolic computation in parallel using openMP. This is offered from gcc since version 4.2 on Linux. Here a reduced version of what i have tried: #include <iostream> #include <stdexcept> #include <ginac/ginac.h> #include <omp.h> using namespace std; using namespace GiNaC; int main() { symbol omega("om"),u("u"),du("du"); ex F = -(pow(omega,12)*u) - (pow(u,13)); unsigned int ui; #pragma omp parallel for private(F) shared(ui,u,omega,du) for(ui = 0; ui < 10 ; ui++) std::cout << F.diff(u,ui) << std::endl; } Compile it with : g++ -fopenmp -o test test.cpp -lginac For every iteration of the for loop, the result is still zero. Has anybody else have had the same (or similiar) problem. Have i made misst something? Many thanks in advance Martin -- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! Jetzt freischalten unter http://portal.gmx.net/de/go/maxdome01

Martin Ettl

5:46 p.m.

New subject: Ginac and OpenMP

-------- Original-Nachricht --------

...

Datum: Wed, 19 May 2010 17:37:44 +0200 Von: "Martin Ettl" <ettl.martin@gmx.de> An: jros@unavarra.es, GiNaC discussion list <ginac-list@ginac.de> Betreff: [GiNaC-list] Ginac and OpenMP

...

Hello,

i try to do some symbolic computation in parallel using openMP. This is offered from gcc since version 4.2 on Linux.

Here a reduced version of what i have tried:

#include <iostream> #include <stdexcept> #include <ginac/ginac.h> #include <omp.h> using namespace std; using namespace GiNaC;

int main() {

symbol omega("om"),u("u"),du("du"); ex F = -(pow(omega,12)*u) - (pow(u,13)); unsigned int ui; #pragma omp parallel for private(F) shared(ui,u,omega,du) for(ui = 0; ui < 10 ; ui++) std::cout << F.diff(u,ui) << std::endl;

}

Compile it with : g++ -fopenmp -o test test.cpp -lginac

For every iteration of the for loop, the result is still zero. Has anybody else have had the same (or similiar) problem. Have i made misst something?

Many thanks in advance

Martin BTW,

switching off OpenMP prints the following correct result: -om^12*u-u^13 -13*u^12-om^12 -156*u^11 -1716*u^10 -17160*u^9 -154440*u^8 -1235520*u^7 -8648640*u^6 -51891840*u^5 -259459200*u^4 Maybe Ginac is not usable with OpenMP? Cheers, Martin -- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! Jetzt freischalten unter http://portal.gmx.net/de/go/maxdome01

Richard B. Kreckel

22 May 22 May

11:19 p.m.

Hi! jros wrote:

...

Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC.

If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory.

This is entirely correct, but...

...

So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion.

...first, this sharing is entirely transparent for the user...

...

Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

...

I wonder what do the developers think about this.

jros

24 May 24 May

8:17 a.m.

On Sat, 2010-05-22 at 23:19 +0200, Richard B. Kreckel wrote:

...

Hi!

jros wrote:

...
Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC.

If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory.

This is entirely correct, but...

...
So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion.

...first, this sharing is entirely transparent for the user...

You mean that we can not look to the smart ptr of a expression, or some of its subexpressions like add and mul to see if there are referenced by more than one expression. The idea would be: if an element of a subexpression is referenced more than once we can call it atom, and printout (for C code) the expression using the atom (avoiding expansion of the subexpression), and also print atom definition (for C code). It would be nice to be able to print expresions in GiNaC to see th level of sharing that it is using. Is sharing also enforced when solving linear equations?. I think this is a really important place for optimization if done.

...

...
Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

...and second, sharing is currently not pursued aggressively! Rather, it is exploited whenever it is trivial to do so. (The product rule of differentiation is an example where exactly the same terms pop up again and again so exploiting sharing comes at no cost.)

It would be nice if sharing aggressiveness could be changed at runtime. without affecting performance for at least for the level of aggressiveness of the actual implementation. In this example e1=a+b e2=a+b+c I suppose that no sharing is implied like e2=e1+c, that would be computationally expensive. But, if instead e2=e1+c then e1 is referenced by e2?? (I suppose yes). If this is the case I consider that the level of sharing is respectably good :) . I mean, just defining the expresions with care would give good results. The sharing of expressions when diff is really nice. If parenthesization is kept to a maximum (no expansions made if they are not needed), as I think is the case, the sharing structure is kept as long as possible, and that is very good also. So, in my opinion using, what I understand is, the current level of sharing, I mean no changes at all, would allow to print C output code in a very optimized way.

...

...
I wonder what do the developers think about this.

Well, I think that if the size of generated code is so prohibitively large and compiler CSE doesn't help you may be better off writing your own algorithm collecting similar terms in an associative array. You could then artificially introduce temporaries, in order to help the compiler. This would boil down to a more aggressive variant of GiNaC's subexpression sharing. What do you think?

I suppose you refer to http://en.wikipedia.org/wiki/Associative_array (I'm not familiar with this). What would be the "key" and the "value"??. I think that you propose to go beyond the level of sharing of GiNaC, trying to find common expresions that are not shared by GiNaC. Do you? So you start traversing the structure and you push in the Associative array whatever subexpression you find, the "key" will be the subexpression and the "value" a new defined symbol for the atom (and may be the number of references to this atom), then you substitute the subexpression in the expression with the new atom. If the subexpression was already in the array, then no new insertion is made and the subexpression is substituted for the matching atom (the value). I suppose that to push a subexpression, you first (recursively) apply recursively the previous recipe to it (so it gets atomized before pushing). I suppose that a important issue is to decide when a expression should be consider atomized, and I think that this is dependent of the internal structure of GiNaC, for example: If a - b *( c +d +e) *f is a expression. atom1= c + d + e atom2=-b*atom1*f atom3=a + atom2 I suppose that after this, if the expression b *( c +d +e) *f is atomized, a new atom will appear atom4=b*atom1*f Instead of avoiding the creation of an atom having into account b *( c +d +e) *f=-atom2 I suppose that this would need things like, inserting the subexpression, only if it or its negative are not in the array. Nevertheless it seems to me that sharing in GiNaC automatically deals with this (or almost). Fine grained atomization like, atom1=c+d atom2=atom1+e seems more difficult and inefficient to implement, due to the internal expression representation. In my particular implementation, each time a operation is done the result is atomized, and all the expresions are kept atomized. To that end I define the special type atom (descent of symbol), that have a pointer to my equivalent (although less optimal) implementation of the associative array. This special type allows things like implementing making diff print to work flawlessly with atomized expresions (as if they were not). As I dare not overloading all the GiNaC operators, I only overloading the operators of a matrix class in which all the operations that I need are made. This implementation (certainly improvable), is optimal in some senses (single representation, maximum sharing an minimum memory, low cost of atomization), nevertheless fine tunning needs a deep knowledge of the internals of GiNaC. Thank you very much, Javier

...

Bye -richy.

Doug

2:23 p.m.

All, I have written a class to solve the problem. I expect to post it here in the next day or two. Because I need to parse each expression anew every time it occurs, it is slow to generate printable C++ code, but it does work, and the code compiles and executes very fast. (Slow is relative, of course. On my problem, for which GiNaC generates 50MB of code, this new class takes 5 minutes. The result is about 700K of code.) The parsing would be sped up significantly if there was a unique identifier for nodes in a GiNaC expression tree. Just to see how fast, I pretended ex::gethash() was unique. Run time was reduced from 5 minutes to well under 1 second. Is there a method other than this list where I can request that a serial number be added to expression tree nodes? Thanks, Doug Support NPR 20 seconds at a time. www.twentysecondsatatime.org --- On Mon, 5/24/10, jros <jros@unavarra.es> wrote: From: jros <jros@unavarra.es> Subject: Re: [GiNaC-list] Term ordering and compiling C++ code To: "GiNaC discussion list" <ginac-list@ginac.de> Date: Monday, May 24, 2010, 2:17 AM On Sat, 2010-05-22 at 23:19 +0200, Richard B. Kreckel wrote: Hi! jros wrote:

...

Although I don't have a solution for your problem, as I'm myself addressing similar problems matching common subexpressions to variables, in down top manner, I think that such a functionality is implicitly implemented in GiNaC.

If I understand GiNaC internal structure correctly, subexpressions common to two expresions, are frequently shared internally, to save memory.

This is entirely correct, but...

...

So it must be possible to write a print algorithm that goes trough an/some expression/s tree, and that replaces every shared subexpression (let say sum product) with a variable, that again is assigned a expression that would be printed in the same way using recursion.

...first, this sharing is entirely transparent for the user... You mean that we can not look to the smart ptr of a expression, or some of its subexpressions like add and mul to see if there are referenced by more than one expression. The idea would be: if an element of a subexpression is referenced more than once we can call it atom, and printout (for C code) the expression using the atom (avoiding expansion of the subexpression), and also print atom definition (for C code). It would be nice to be able to print expresions in GiNaC to see th level of sharing that it is using. Is sharing also enforced when solving linear equations?. I think this is a really important place for optimization if done.

...

Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

...and second, sharing is currently not pursued aggressively! Rather, it is exploited whenever it is trivial to do so. (The product rule of differentiation is an example where exactly the same terms pop up again and again so exploiting sharing comes at no cost.) It would be nice if sharing aggressiveness could be changed at runtime. without affecting performance for at least for the level of aggressiveness of the actual implementation. In this example e1=a+b e2=a+b+c I suppose that no sharing is implied like e2=e1+c, that would be computationally expensive. But, if instead e2=e1+c then e1 is referenced by e2?? (I suppose yes). If this is the case I consider that the level of sharing is respectably good :) . I mean, just defining the expresions with care would give good results. The sharing of expressions when diff is really nice. If parenthesization is kept to a maximum (no expansions made if they are not needed), as I think is the case, the sharing structure is kept as long as possible, and that is very good also. So, in my opinion using, what I understand is, the current level of sharing, I mean no changes at all, would allow to print C output code in a very optimized way.

...

I wonder what do the developers think about this.

Well, I think that if the size of generated code is so prohibitively large and compiler CSE doesn't help you may be better off writing your own algorithm collecting similar terms in an associative array. You could then artificially introduce temporaries, in order to help the compiler. This would boil down to a more aggressive variant of GiNaC's subexpression sharing. What do you think? I suppose you refer to http://en.wikipedia.org/wiki/Associative_array (I'm not familiar with this). What would be the "key" and the "value"??. I think that you propose to go beyond the level of sharing of GiNaC, trying to find common expresions that are not shared by GiNaC. Do you? So you start traversing the structure and you push in the Associative array whatever subexpression you find, the "key" will be the subexpression and the "value" a new defined symbol for the atom (and may be the number of references to this atom), then you substitute the subexpression in the expression with the new atom. If the subexpression was already in the array, then no new insertion is made and the subexpression is substituted for the matching atom (the value). I suppose that to push a subexpression, you first (recursively) apply recursively the previous recipe to it (so it gets atomized before pushing). I suppose that a important issue is to decide when a expression should be consider atomized, and I think that this is dependent of the internal structure of GiNaC, for example: If a - b *( c +d +e) *f is a expression. atom1= c + d + e atom2=-b*atom1*f atom3=a + atom2 I suppose that after this, if the expression b *( c +d +e) *f is atomized, a new atom will appear atom4=b*atom1*f Instead of avoiding the creation of an atom having into account b *( c +d +e) *f=-atom2 I suppose that this would need things like, inserting the subexpression, only if it or its negative are not in the array. Nevertheless it seems to me that sharing in GiNaC automatically deals with this (or almost). Fine grained atomization like, atom1=c+d atom2=atom1+e seems more difficult and inefficient to implement, due to the internal expression representation. In my particular implementation, each time a operation is done the result is atomized, and all the expresions are kept atomized. To that end I define the special type atom (descent of symbol), that have a pointer to my equivalent (although less optimal) implementation of the associative array. This special type allows things like implementing making diff print to work flawlessly with atomized expresions (as if they were not). As I dare not overloading all the GiNaC operators, I only overloading the operators of a matrix class in which all the operations that I need are made. This implementation (certainly improvable), is optimal in some senses (single representation, maximum sharing an minimum memory, low cost of atomization), nevertheless fine tunning needs a deep knowledge of the internals of GiNaC. Thank you very much, Javier Bye -richy. -----Inline Attachment Follows----- _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

Alexei Sheplyakov

8:51 p.m.

Hello, On Mon, May 24, 2010 at 02:17:41AM -0400, jros wrote:

...

In this example

e1=a+b e2=a+b+c

I suppose that no sharing is implied like e2=e1+c, that would be computationally expensive. But, if instead

e2=e1+c

then e1 is referenced by e2?? (I suppose yes).

No. GiNaC tries to keep trees (which what GiNaC::add essentially is) as flat as possible (see expairseq::construct_from_epvector). This save *a lot* of memory and makes collecting similar terms more efficient. So +( +(a, b), c) gets transformed into +(a, b, c), and nothing is shared between e1 and e2. Best regards, Alexei

Alexei Sheplyakov

9:27 p.m.

New subject: GiNaC is not a compiler (Was: Term ordering and compiling C++ code)

Hello, On Thu, May 13, 2010 at 10:31:48AM -0400, jros wrote:

...

Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

I wonder what do the developers think about this.

GiNaC is not a compiler. It's data structures and algorithms are not suitable for finding common subexpressions (and other optimization algorithms). For example, GiNaC tries to make expression tree as flat as possible (to save memory and make collecting similar terms more efficient). Thus, (a + b) + c is automatically transformed into a + b + c (which is probably not what you want for finding common subexpressions). However, it saves memory and makes collecting similar terms more efficient. The bottom line is: use the right tool. If you need a compiler, use a compiler, not a symbolic computation engine. Best regards, Alexei P.S. Attached is the code which I use for converting a GiNaC expression into LLVM (http://llvm.org) intermediate representation. It reads the expression, converts it into LLVM IR, applies some common optimizations, and saves the output as a LLVM (pseudo) asm. Later one can apply further optimizations (using the `opt' utility), and compile into a native code (using llc). To compile any `intersting' expression (> 10^5 terms), pass the -regalloc=local argument to llc (the default register allocator is way too memory hungry).

Martin Ettl

28 May 28 May

11:05 p.m.

New subject: GiNaC is not a compiler (Was: Term ordering and compiling C++ code)

Hello Alexei, i have some trouble to compile the llvm sample, you provided on my Ubuntu system. With what llvm and ginac version you have tested it? I am using llvm-2.7, ginac-1.5.7 and g++-4.4 on Ubuntu 10.04 (32-Bit) Best regards and thanks in advance Martin -------- Original-Nachricht --------

...

Datum: Mon, 24 May 2010 22:27:52 +0300 Von: Alexei Sheplyakov <alexei.sheplyakov@gmail.com> An: jros@unavarra.es, GiNaC discussion list <ginac-list@ginac.de> Betreff: [GiNaC-list] GiNaC is not a compiler (Was: Term ordering and compiling C++ code)

...

Hello,

On Thu, May 13, 2010 at 10:31:48AM -0400, jros wrote:

...
Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

I wonder what do the developers think about this.

GiNaC is not a compiler. It's data structures and algorithms are not suitable for finding common subexpressions (and other optimization algorithms). For example, GiNaC tries to make expression tree as flat as possible (to save memory and make collecting similar terms more efficient). Thus, (a + b) + c is automatically transformed into a + b + c (which is probably not what you want for finding common subexpressions). However, it saves memory and makes collecting similar terms more efficient.

The bottom line is: use the right tool. If you need a compiler, use a compiler, not a symbolic computation engine.

Best regards, Alexei

P.S.

Attached is the code which I use for converting a GiNaC expression into LLVM (http://llvm.org) intermediate representation. It reads the expression, converts it into LLVM IR, applies some common optimizations, and saves the output as a LLVM (pseudo) asm. Later one can apply further optimizations (using the `opt' utility), and compile into a native code (using llc). To compile any `intersting' expression (> 10^5 terms), pass the

-regalloc=local

argument to llc (the default register allocator is way too memory hungry).

-- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! Jetzt freischalten unter http://portal.gmx.net/de/go/maxdome01

Alexei Sheplyakov

29 May 29 May

10:33 a.m.

New subject: Compiling GiNaC -> LLVM program

Hello, On Fri, May 28, 2010 at 11:05:11PM +0200, Martin Ettl wrote:

...

i have some trouble to compile the llvm sample, you provided on my Ubuntu system.

Could you please run make V= and post the error message(s)?

...

With what llvm and ginac version you have tested it?

LLVM 2.7, GiNaC 1.5 from git, GCC 4.{1,3,4} on Debian squeeze (x86_64) Best regards, Alexei

Martin Ettl

11:59 a.m.

New subject: Compiling GiNaC -> LLVM program

Hello, i allready managed it to compile it on my system. It was necessary to change the makefile. I changed llvm-confing-2.7 to llvm-config in the whole makefile. But running the compiled programm fails: $ ./ginac_jit bigex.txt bigex.ll ginac_jit: Instructions.cpp:1604: void llvm::BinaryOperator::init(llvm::Instruction::BinaryOps): Assertion `getType()->isIntOrIntVectorTy() && "Tried to create an integer operation on a non-integer type!"' failed. Aborted Maybe i made something wrong? Best regards and many thanks for you help! Martin -------- Original-Nachricht --------

...

Datum: Sat, 29 May 2010 11:33:30 +0300 Von: Alexei Sheplyakov <alexei.sheplyakov@gmail.com> An: GiNaC discussion list <ginac-list@ginac.de> Betreff: Re: [GiNaC-list] Compiling GiNaC -> LLVM program

...

Hello,

On Fri, May 28, 2010 at 11:05:11PM +0200, Martin Ettl wrote:

...
i have some trouble to compile the llvm sample, you provided on my Ubuntu system.

Could you please run

make V=

and post the error message(s)?

...
With what llvm and ginac version you have tested it?

LLVM 2.7, GiNaC 1.5 from git, GCC 4.{1,3,4} on Debian squeeze (x86_64)

Best regards, Alexei

_______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

-- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! Jetzt freischalten unter http://portal.gmx.net/de/go/maxdome01

Martin Ettl

31 May 31 May

6:24 p.m.

New subject: Compiling GiNaC -> LLVM program

Any ideas why it does not work? Many thanks in advance! Cheers, Martin  -------- Original-Nachricht --------

...

Datum: Sat, 29 May 2010 11:59:40 +0200 Von: "Martin Ettl" <ettl.martin@gmx.de> An: GiNaC discussion list <ginac-list@ginac.de> Betreff: Re: [GiNaC-list] Compiling GiNaC -> LLVM program

...

Hello,

i allready managed it to compile it on my system. It was necessary to change the makefile. I changed llvm-confing-2.7 to llvm-config in the whole makefile.

But running the compiled programm fails:

$ ./ginac_jit bigex.txt bigex.ll ginac_jit: Instructions.cpp:1604: void llvm::BinaryOperator::init(llvm::Instruction::BinaryOps): Assertion `getType()->isIntOrIntVectorTy() && "Tried to create an integer operation on a non-integer type!"' failed. Aborted

Maybe i made something wrong?

Best regards and many thanks for you help!

Martin

-------- Original-Nachricht --------

...
Datum: Sat, 29 May 2010 11:33:30 +0300 Von: Alexei Sheplyakov <alexei.sheplyakov@gmail.com> An: GiNaC discussion list <ginac-list@ginac.de> Betreff: Re: [GiNaC-list] Compiling GiNaC -> LLVM program

...
Hello,

On Fri, May 28, 2010 at 11:05:11PM +0200, Martin Ettl wrote:

...
i have some trouble to compile the llvm sample, you provided on my Ubuntu system.

Could you please run

make V=

and post the error message(s)?

...
With what llvm and ginac version you have tested it?

LLVM 2.7, GiNaC 1.5 from git, GCC 4.{1,3,4} on Debian squeeze (x86_64)

Best regards, Alexei

_______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

-- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! Jetzt freischalten unter http://portal.gmx.net/de/go/maxdome01 _______________________________________________ GiNaC-list mailing list GiNaC-list@ginac.de https://www.cebix.net/mailman/listinfo/ginac-list

-- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! Jetzt freischalten unter http://portal.gmx.net/de/go/maxdome01

Alexei Sheplyakov

2 Jun 2 Jun

12:04 a.m.

New subject: Compiling GiNaC -> LLVM program

Hello, Martin, On Mon, May 31, 2010 at 06:24:12PM +0200, Martin Ettl wrote:

...

But running the compiled programm fails:

$ ./ginac_jit bigex.txt bigex.ll ginac_jit: Instructions.cpp:1604: void llvm::BinaryOperator::init(llvm::Instruction::BinaryOps): Assertion `getType()->isIntOrIntVectorTy() && "Tried to create an integer operation on a non-integer type!"' failed.

...

Any ideas why it does not work?

I'm unable to reproduce this error. I've tried compiling 32-bit version, and... it works just fine. Perhaps the bug can be triggered only by some specific (sub)expression. Or you use a different versions of LLVM. Could you please post - the `bigex.txt' file - the output of the following command: dpkg -l llvm Best regards, Alexei

Alexei Sheplyakov

24 May 24 May

9:38 p.m.

Hello, On Thu, May 13, 2010 at 10:31:48AM -0400, jros wrote:

...

Probably allowing/disallowing some kind automatic simplifications (so that subexpression sharing expected value increases) can probably help to obtain improved results.

I wonder what do the developers think about this.

GiNaC is not a compiler. It's data structures and algorithms are not suitable for finding common subexpressions (and other optimization tasks). For example, GiNaC tries to make expression tree as flat as possible. Thus, (a + b) + c is automatically transformed into a + b + c (which is probably not what you want for finding common subexpressions). However, it saves memory and makes collecting similar terms more efficient. The bottom line is: use the right tool. If you need a compiler, use a compiler, not a symbolic computation engine. Best regards, Alexei P.S. Attached is the code which I use for converting a GiNaC expression into LLVM (http://llvm.org) intermediate representation. It reads the expression, converts it into LLVM IR, applies some common optimizations, and saves the output as a LLVM (pseudo) asm. Later one can apply further optimizations (using the `opt' utility), and compile into a native code (using llc). To compile any `intersting' expression (> 10^5 terms), pass the -regalloc=local argument to llc (the default register allocator is way too memory hungry).

5507

Age (days ago)

5526

Last active (days ago)

List overview

Download

18 comments

5 participants

participants (5)

Alexei Sheplyakov
Doug
jros
Martin Ettl
Richard B. Kreckel