Slowdown in e.g. expansion with more variables defined


Vitaly Magerya

30 Mar 2026

12:47 p.m.

Hi, all.

I've noticed a worrying performance issue, and I was wondering if you could shed some light onto it.

In ginsh, the time to do a simple expand() seems to get worse the more variables you've defined. Here's a test script:

    FIRST_RUN;
    time(e1=expand((2*x+3*y+5*z)^100));
    time(e2=expand((2*x+3*y+5*z)^100));
    time(e3=expand((2*x+3*y+5*z)^100));
    time(e4=expand((2*x+3*y+5*z)^100));
    time(e5=expand((2*x+3*y+5*z)^100));
    time(e6=expand((2*x+3*y+5*z)^100));
    time(e7=expand((2*x+3*y+5*z)^100));
    time(e8=expand((2*x+3*y+5*z)^100));
    time(e9=expand((2*x+3*y+5*z)^100));
    UNASSIGN;;
    unassign('e1'):
    unassign('e2'):
    unassign('e3'):
    unassign('e4'):
    unassign('e5'):
    unassign('e6'):
    unassign('e7'):
    unassign('e8'):
    unassign('e9'):
    SECOND_RUN;
    time(f1=expand((2*x+3*y+5*z)^100));
    time(f2=expand((2*x+3*y+5*z)^100));
    time(f3=expand((2*x+3*y+5*z)^100));
    time(f4=expand((2*x+3*y+5*z)^100));
    time(f5=expand((2*x+3*y+5*z)^100));
    time(f6=expand((2*x+3*y+5*z)^100));
    time(f7=expand((2*x+3*y+5*z)^100));
    time(f8=expand((2*x+3*y+5*z)^100));
    time(f9=expand((2*x+3*y+5*z)^100));

The output is:

    FIRST_RUN
    0.00923s
    0.008251s
    0.007379s
    0.007837s
    0.015366s
    0.015251s
    0.019319s
    0.020222s
    0.021082s
    UNASSIGN
    SECOND_RUN
    0.00717s
    0.006453s
    0.006901s
    0.007438s
    0.015191s
    0.012014s
    0.015484s
    0.018273s
    0.019675s

As you can see, the expansion time for e9 is 2x that of e1, and if I continue, it will keep rising, to e.g. 20x at e100 and beyond.

However, if I unassign the variables, the expansion time goes back to the original value.

My guess would be that some kind of a caching system is at fault for this. Do you have any ideas as to which one exactly, and how to solve this?

Thanks in advance,
Vitaly


Richard B. Kreckel

30 Mar 2026

11:34 p.m.

Hi Vitaly,

On 3/30/26 12:47 PM, Vitaly Magerya wrote:

> I've noticed a worrying performance issue, and I was wondering if you could shed some light onto it.
>
> In ginsh, the time to do a simple expand() seems to get worse the more variables you've defined. Here's a test script: [...]
>
> As you can see, the expansion time for e9 is 2x that of e1, and if I continue, it will keep rising, to e.g. 20x at e100 and beyond.
>
> However, if I unassign the variables, the expansion time goes back to the original value.
>
> My guess would be that some kind of a caching system is at fault for this. Do you have any ideas as to which one exactly, and how to solve this?

This is obviously due to the fact that, in ginsh, assigning a symbol triggers a substitution of that symbol in all other assignments. See commit 22716a2cc7.

Off the top of my head, I don't know how this could be fixed.

-richy.
--
Richard B. Kreckel
<https://in.terlu.de/~kreckel/>
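To make the cost described above concrete, here is a minimal toy model in Python. It is not ginsh source code: the expression representation (ints, symbol names as strings, and tuples such as ("+", lhs, rhs)) and the names subs, assign and work_per_assignment are all made up for illustration. It only assumes what the reply states, namely that each assignment substitutes the new symbol into every existing assignment.

```python
# Toy model, NOT ginsh internals. An expression is an int, a symbol name
# (str), or a tuple ("op", arg1, arg2, ...).

def subs(expr, name, value, counter):
    """Replace symbol `name` by `value`, counting every node visited."""
    counter[0] += 1
    if expr == name:
        return value
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subs(a, name, value, counter) for a in expr[1:])
    return expr

def assign(table, name, value, counter):
    """Retroactive assignment: patch every stored right-hand side first."""
    for other in table:
        table[other] = subs(table[other], name, value, counter)
    table[name] = value

def work_per_assignment(n):
    """Assign e1..en and record how many nodes each assignment had to visit."""
    table, costs = {}, []
    rhs = ("+", ("*", 2, "x"), ("*", 3, "y"))  # stand-in for a large expansion
    for i in range(1, n + 1):
        counter = [0]
        assign(table, f"e{i}", rhs, counter)
        costs.append(counter[0])
    return costs

print(work_per_assignment(5))  # [0, 7, 14, 21, 28]
```

In this model the k-th assignment walks all k-1 earlier right-hand sides even though none of them contains the new symbol, so the work per assignment grows linearly with the number of assigned variables.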

Vitaly Magerya

31 Mar 2026

12:41 a.m.

On Mon, 30 Mar 2026 at 23:34, Richard B. Kreckel <kreckel@in.terlu.de> wrote:

> This is obviously due to the fact that, in ginsh, assigning a symbol triggers a substitution of that symbol in all other assignments.
>
> See commit 22716a2cc7.

Oh, wow. Thanks a lot. Yes, that seems to be it.

In my use case there are many variables, all assigned only once, and none are used before assignment (as it would be in e.g. C++), so the subs() calls from that commit are guaranteed to do nothing... I think I'll need to revert it; the performance penalty is too great.

I do wish there was a way to opt out. I'm not happy about carrying patches. At least for my use case, the "new" behavior from 1.4.4 [1] is the way to go.

I guess there are two types of assignment: retroactive ("old", aka current, aka Mathematica-style), and scope-inducing ("new", aka reverted in 22716a2cc7). Maybe a compromise would be a separate operator for each? I.e. "=" for one and ":=" for the other? Then two hash tables could be maintained: one for substitutions in newly parsed expressions, and one for patching evaluation results? "=" would add the mapping to both, ":=" would only add to the first? This way I could switch to ":=" and not suffer the performance penalty.

Or maybe it would be possible to introduce the two hash tables, and only add to the first if the symbol being assigned has already been used (which is never the case in my usage)? Then we get the same effect, but no need for new syntax.

In any case, thanks again.

[1] https://www.ginac.de/pipermail/ginac-devel/2024-December/002655.html

Best regards,
Vitaly
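The second proposal in the message above (no new syntax) could look roughly like the following Python sketch. It rests on one reading of the proposal, which is an assumption: the retroactive patch pass is skipped unless the symbol being assigned has already appeared in some earlier input. The Session class, the `used` set and the toy expression format (ints, strings, tuples) are hypothetical and do not correspond to ginsh internals; right-hand sides are assumed to be already resolved against earlier assignments.

```python
# Toy model, NOT ginsh internals. An expression is an int, a symbol name
# (str), or a tuple ("op", arg1, arg2, ...).

def symbols_of(expr):
    """Collect the symbol names occurring in a toy expression."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        found = set()
        for arg in expr[1:]:
            found |= symbols_of(arg)
        return found
    return set()

def subs(expr, name, value):
    """Replace symbol `name` by `value` throughout `expr`."""
    if expr == name:
        return value
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subs(a, name, value) for a in expr[1:])
    return expr

class Session:
    def __init__(self):
        self.table = {}        # assigned symbol -> right-hand side
        self.used = set()      # every symbol seen in earlier input
        self.patch_passes = 0  # how often all assignments were walked

    def assign(self, name, value):
        # Only a symbol that was seen before can occur in a stored rhs.
        if name in self.used:
            self.patch_passes += 1
            for other in self.table:
                self.table[other] = subs(self.table[other], name, value)
        self.table[name] = value
        self.used |= symbols_of(value) | {name}

s = Session()
for i in range(1, 10):
    s.assign(f"e{i}", ("+", "x", "y"))   # e1..e9 were never used before
passes_for_fresh_names = s.patch_passes  # 0: no patch pass was needed
s.assign("x", 5)                         # x was used, so patching runs once
print(passes_for_fresh_names, s.patch_passes, s.table["e1"])
```

With this check, a script that assigns each variable once and never mentions it beforehand pays nothing for retroactive substitution, while assigning a previously used symbol still patches the stored results.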

Richard B. Kreckel

10:51 p.m.

Hi Vitaly,

We should now use our nifty new issue tracker! ;-)
<https://codeberg.org/ginac/ginac/issues/4>

On 3/31/26 12:41 AM, Vitaly Magerya wrote:

> In my use case there are many variables, all assigned only once, and none are used before assignment (as it would be in e.g. C++), so the subs() calls from that commit are guaranteed to do nothing... I think I'll need to revert it; the performance penalty is too great.

Surprising to see that many people use ginsh for heavy-lifting!

> I do wish there was a way to opt out. I'm not happy about carrying patches. At least for my use case, the "new" behavior from 1.4.4 [1] is the way to go.

Okay, so we agree that this is "just" a performance issue. So, this problem is only in ginsh, which suggests that it should also be fixed in ginsh.

> I guess there are two types of assignment: retroactive ("old", aka current, aka Mathematica-style), and scope-inducing ("new", aka reverted in 22716a2cc7). Maybe a compromise would be a separate operator for each? I.e. "=" for one and ":=" for the other? Then two hash tables could be maintained: one for substitutions in newly parsed expressions, and one for patching evaluation results? "=" would add the mapping to both, ":=" would only add to the first? This way I could switch to ":=" and not suffer the performance penalty.
>
> Or maybe it would be possible to introduce the two hash tables, and only add to the first if the symbol being assigned has already been used (which is never the case in my usage)? Then we get the same effect, but no need for new syntax.

Yes, you are going in the right direction. This idea won't cure the behavior in general, but it will speed up your case.

Another idea worth pursuing might be to keep a set of the symbols contained in the rhs of each assigned symbol together with the assignment itself. When assigning a new symbol, and before doing the substitution in all existing assignments, ginsh could check whether the substitution makes sense prior to actually calling .subs(). This won't cure the O(N) behavior, but it will make the slope much flatter.

Or even a combination of both ideas, since they address different cases.

I don't think this is hard. Feel free to have a stab at it.

-richy.
--
Richard B. Kreckel
<https://in.terlu.de/~kreckel/>
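The per-assignment symbol set suggested in this reply can be sketched as follows, again as a toy Python model rather than ginsh code. The Session class and the expression format (ints, strings, tuples) are assumptions for illustration. The loop over all assignments remains, so the cost is still O(N) per assignment, but each step is a set lookup instead of a walk through the stored expression.

```python
# Toy model, NOT ginsh internals. An expression is an int, a symbol name
# (str), or a tuple ("op", arg1, arg2, ...).

def symbols_of(expr):
    """Collect the symbol names occurring in a toy expression."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        found = set()
        for arg in expr[1:]:
            found |= symbols_of(arg)
        return found
    return set()

def subs(expr, name, value):
    """Replace symbol `name` by `value` throughout `expr`."""
    if expr == name:
        return value
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subs(a, name, value) for a in expr[1:])
    return expr

class Session:
    def __init__(self):
        self.table = {}      # assigned symbol -> (rhs, symbols occurring in rhs)
        self.subs_calls = 0  # number of real substitutions performed

    def assign(self, name, value):
        for other, (rhs, syms) in self.table.items():
            if name in syms:                  # cheap test before calling subs()
                self.subs_calls += 1
                new_rhs = subs(rhs, name, value)
                self.table[other] = (new_rhs, symbols_of(new_rhs))
        self.table[name] = (value, symbols_of(value))

s = Session()
for i in range(1, 10):
    s.assign(f"e{i}", ("+", "x", "y"))
calls_for_fresh_names = s.subs_calls  # 0: no stored rhs mentions e1..e9
s.assign("y", ("*", 42, "z"))         # y occurs in all nine stored results
print(calls_for_fresh_names, s.subs_calls)  # 0 9
```

This version recomputes the symbol set after each real substitution, which is the simplest way to keep it exact; it only pays that cost for assignments that were actually changed.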

Richard B. Kreckel

9 Apr 2026

11:02 a.m.

Hi Vitaly,

Walking this lane a little bit more (assuming the current behavior should be kept).

On 3/31/26 10:51 PM, Richard B. Kreckel wrote:

> Another idea worth pursuing might be to keep a set of the symbols contained in the rhs of each assigned symbol together with the assignment itself. When assigning a new symbol, and before doing the substitution in all existing assignments, ginsh could check whether the substitution makes sense prior to actually calling .subs(). This won't cure the O(N) behavior, but it will make the slope much flatter.

This boils down to keeping track of unassigned (free) symbols in ginsh in addition to the assigned symbols which we already track, and to also tracking in which assigned symbols every free symbol occurs.

An assigned symbol can always be freed by unassign('x').

And a free symbol can be assigned by e.g. y=42*x. At that moment we subs(y==42*x) in all assigned symbols in which y occurs, and tell the free symbol x that it now occurs in all assigned symbols that previously contained y. Effectively, all free symbols in y (x in our example) become free symbols in the assigned symbols where we just substituted.

That is a little bit tedious but straightforward and should fix your performance problems, I guess.

A slightly more complete implementation would verify, after the substitution, whether the free symbols just introduced in the assigned symbol *really* occur in the assigned symbol's expression, using .has(), constructing a symbols_map, or similar. After all, the symbol might have cancelled in the substitution! Doing this would descend into the expression and be a rather expensive operation. Is that worth it? Not sure.

It might be better to lazily omit this extra verification. What would be the cost? At most a possible extra substitution later, because ginsh doesn't know that the symbol it is substituting has already cancelled from the assigned symbol. (The result expression would be the same.)

Bye,
-richard.

PS: Just in case, I added a little test suite for ginsh.

--
Richard B. Kreckel
<https://in.terlu.de/~kreckel/>
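The bookkeeping described in this message can be sketched as a reverse index from each free symbol to the assigned symbols that may contain it. The following Python toy model is only an illustration under stated assumptions: the Session class, the occurs_in map and the expression format (ints, strings, tuples) are hypothetical and are not taken from ginsh or from any branch of it. It uses the lazy variant, so the index may over-approximate (after a cancellation or an unassign), which costs at most a harmless extra substitution.

```python
from collections import defaultdict

# Toy model, NOT ginsh internals. An expression is an int, a symbol name
# (str), or a tuple ("op", arg1, arg2, ...).

def symbols_of(expr):
    """Collect the symbol names occurring in a toy expression."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        found = set()
        for arg in expr[1:]:
            found |= symbols_of(arg)
        return found
    return set()

def subs(expr, name, value):
    """Replace symbol `name` by `value` throughout `expr`."""
    if expr == name:
        return value
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subs(a, name, value) for a in expr[1:])
    return expr

class Session:
    def __init__(self):
        self.table = {}                    # assigned symbol -> rhs
        self.occurs_in = defaultdict(set)  # free symbol -> assigned symbols that may contain it
        self.subs_calls = 0

    def assign(self, name, value):
        # `name` stops being free: substitute only where it may occur.
        targets = self.occurs_in.pop(name, set())
        for target in targets:
            if target in self.table:       # entry may be stale after unassign
                self.subs_calls += 1
                self.table[target] = subs(self.table[target], name, value)
        self.table[name] = value
        # Free symbols of the new rhs now (possibly) occur in `name` and in
        # every assignment that was just patched. No verification: lazy variant.
        for sym in symbols_of(value):
            self.occurs_in[sym] |= targets | {name}

    def unassign(self, name):
        self.table.pop(name, None)         # stale index entries are tolerated

s = Session()
s.assign("e1", ("+", "y", 1))
s.assign("e2", ("*", 2, "w"))
s.assign("y", ("*", 42, "x"))  # touches e1 only; e2 is never visited
print(s.subs_calls, s.table["e1"], sorted(s.occurs_in["x"]))
```

Here the work per assignment depends on how many stored results mention the symbol, not on the total number of assigned variables, so assigning a fresh name costs no substitutions at all.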

Richard B. Kreckel

10 Apr 2026

10:36 a.m.

Hi Vitaly,

On 4/9/26 11:02 AM, Richard B. Kreckel wrote: [...]

> This boils down to keeping track of unassigned (free) symbols in ginsh in addition to the assigned symbols which we already track, and to also tracking in which assigned symbols every free symbol occurs.
>
> An assigned symbol can always be freed by unassign('x').
>
> And a free symbol can be assigned by e.g. y=42*x. At that moment we subs(y==42*x) in all assigned symbols in which y occurs, and tell the free symbol x that it now occurs in all assigned symbols that previously contained y. Effectively, all free symbols in y (x in our example) become free symbols in the assigned symbols where we just substituted.
>
> That is a little bit tedious but straightforward and should fix your performance problems, I guess.
>
> A slightly more complete implementation would verify, after the substitution, whether the free symbols just introduced in the assigned symbol *really* occur in the assigned symbol's expression, using .has(), constructing a symbols_map, or similar. After all, the symbol might have cancelled in the substitution! Doing this would descend into the expression and be a rather expensive operation. Is that worth it? Not sure.
>
> It might be better to lazily omit this extra verification. What would be the cost? At most a possible extra substitution later, because ginsh doesn't know that the symbol it is substituting has already cancelled from the assigned symbol. (The result expression would be the same.)

I tried implementing this on a new branch named ginsh-performance. It passes the new ginsh test suite. \o/

Please do try with your applications! (Not sure if it is free of bugs and worth the effort.)

-richy.
--
Richard B. Kreckel
<https://in.terlu.de/~kreckel/>
