Performance of simplify_indexed on long chains of Dirac matrices
Dear all,

I'm wondering about the performance of simplify_indexed applied to a long chain of Dirac matrices in D dimensions. For example, simplifying

  g^i1 g^i2 g^i3 g^i4 g^i5 g^i6 g^i7 g^i8 g^i9 g_i1 g_i2 g_i3 g_i4 g_i5 g_i6 g_i7 g_i8 g_i9

requires about 15 seconds on my machine. Is this something one would normally expect with GiNaC, or am I missing some essential optimizations?

I'm using Fedora 20 64-bit with the newest GiNaC from the Git repository. The machine is an i5-3570 with 16 GB RAM.

The source code

---------------------------------------------------------------------
#include <iostream>
#include <ginac/ginac.h>

using namespace std;
using namespace GiNaC;

int main()
{
	symbol D("D");
	varidx i1(symbol("i1"), D), i2(symbol("i2"), D), i3(symbol("i3"), D),
	       i4(symbol("i4"), D), i5(symbol("i5"), D), i6(symbol("i6"), D),
	       i7(symbol("i7"), D), i8(symbol("i8"), D), i9(symbol("i9"), D);

	cout << (simplify_indexed(dirac_gamma(i1)*
	                          dirac_gamma(i2)*
	                          dirac_gamma(i3)*
	                          dirac_gamma(i4)*
	                          dirac_gamma(i5)*
	                          dirac_gamma(i6)*
	                          dirac_gamma(i7)*
	                          dirac_gamma(i8)*
	                          dirac_gamma(i9)*
	                          dirac_gamma(i1.toggle_variance())*
	                          dirac_gamma(i2.toggle_variance())*
	                          dirac_gamma(i3.toggle_variance())*
	                          dirac_gamma(i4.toggle_variance())*
	                          dirac_gamma(i5.toggle_variance())*
	                          dirac_gamma(i6.toggle_variance())*
	                          dirac_gamma(i7.toggle_variance())*
	                          dirac_gamma(i8.toggle_variance())*
	                          dirac_gamma(i9.toggle_variance())
	        )).expand() << endl;

	return 0;
}
---------------------------------------------------------------------

is compiled and run via

  c++ test.cpp -o test -Ofast -lcln -lginac -Wl,-rpath,/usr/local/lib
  time ./test

Cheers,
Vladyslav
Dear Vladyslav,
On Sun, 04 Jan 2015 00:48:23 +0100, Vladyslav Shtabovenko <v.shtabovenko@tum.de> said:

  VSh> I'm wondering about the performance of simplify_indexed applied
  VSh> to a long chain of Dirac matrices in D dimensions.
You raised a valid concern; let me provide some background. Historically, GiNaC dealt only with Dirac matrices associated to the Lorentz metric. At some point the code was extended to treat an algebra with general commutation/anticommutation relations between its generators. Since this generality reduced performance, the original code optimised for Dirac matrices was preserved as much as possible.

I checked your example: it does not call the general, time-consuming cliffordunit::contract_with(), canonicalize_clifford() or get_metric(). Instead, it makes (as it is supposed to) numerous calls to clifford::eval_ncmul() and diracgamma::contract_with(). I believe the latter two functions are largely as they were in the original GiNaC code optimised for Dirac matrices.

So, a further investigation of the code is required...

Best wishes,
Vladimir
--
Vladimir V. Kisil     email: kisilv@maths.leeds.ac.uk
                      www:   http://www.maths.leeds.ac.uk/~kisilv/
Book: Geometry of Mobius Transformations
      http://www.worldscientific.com/worldscibooks/10.1142/p835
Dear Vladimir,

I haven't looked at the source code yet, but as far as I know there are essentially two ways to simplify D-dimensional Dirac matrix chains of the type g^mu ... g_mu (if there are more, I'm always eager to learn):

1) Repeatedly applying the anticommutation relations to move g_mu through all the other matrices towards g^mu. This is the easiest but also the slowest way to implement.

2) Using a recursion formula as found in Veltman's Gammatrica or in Rolf Mertig's FeynCalc paper. This improves the performance, but long chains can still take a lot of time. One can speed up the recursion even further by caching results that have already been computed; in Mathematica language that is called "memoization". This technique is used e.g. in FeynCalc.

Since simplify_indexed doesn't have many analogues in other HEP tools (people mostly care only about traces), it might be more reasonable to look at the performance of dirac_trace instead. With GiNaC this

(dirac_trace(dirac_gamma(i1)*
             dirac_gamma(i2)*
             dirac_gamma(i3)*
             dirac_gamma(i4)*
             dirac_gamma(i5)*
             dirac_gamma(i6)*
             dirac_gamma(i7)*
             dirac_gamma(i1.toggle_variance())*
             dirac_gamma(i2.toggle_variance())*
             dirac_gamma(i3.toggle_variance())*
             dirac_gamma(i4.toggle_variance())*
             dirac_gamma(i5.toggle_variance())*
             dirac_gamma(i6.toggle_variance())*
             dirac_gamma(i7.toggle_variance())
)).expand()

takes around 5 seconds on my machine, while a similar calculation with FORM

Symbol d;
Dimension d;
AutoDeclare Indices i;
Local M2 = g_(1,i1,i2,i3,i4,i5,i6,i7,i1,i2,i3,i4,i5,i6,i7);
tracen,1;
Print M2;
.sort
.end

requires less than 0.01 seconds. With FeynCalc

$LoadPhi = False;
$LoadTARCER = False;
$LoadFeynArts = True;
<< HighEnergyPhysics`FeynCalc`;
Tr[GAD[i1, i2, i3, i4, i5, i6, i7].GAD[i1, i2, i3, i4, i5, i6, i7]] // AbsoluteTiming

I get the result in 0.007 seconds.

So I think it would be very nice if one of the GiNaC developers could have a look at this issue.

Cheers,
Vladyslav
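[Editorial note: for reference, a sketch of the standard textbook D-dimensional contraction identities on which a recursion of type 2) can be built (these are the generic relations, not copied from either of the cited sources). The last line is the general recursion step; its right-hand side only involves the contraction of a shorter subchain, which is exactly what can be cached ("memoized"):

  \gamma^\mu \gamma_\mu = D\,\mathbf{1}
  \gamma^\mu \gamma^{\alpha} \gamma_\mu = (2-D)\,\gamma^{\alpha}
  \gamma^\mu \gamma^{\alpha} \gamma^{\beta} \gamma_\mu = 4\,g^{\alpha\beta}\,\mathbf{1} + (D-4)\,\gamma^{\alpha}\gamma^{\beta}
  \gamma^\mu \gamma^{\alpha} \gamma^{\beta} \gamma^{\rho} \gamma_\mu = -2\,\gamma^{\rho}\gamma^{\beta}\gamma^{\alpha} - (D-4)\,\gamma^{\alpha}\gamma^{\beta}\gamma^{\rho}
  \gamma^\mu \gamma^{\alpha_1} \cdots \gamma^{\alpha_n} \gamma_\mu
    = 2\,\gamma^{\alpha_n}\gamma^{\alpha_1}\cdots\gamma^{\alpha_{n-1}}
      - \bigl(\gamma^\mu \gamma^{\alpha_1}\cdots\gamma^{\alpha_{n-1}}\gamma_\mu\bigr)\,\gamma^{\alpha_n}
]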
On Sun, 04 Jan 2015 18:53:51 +0100, Vladyslav Shtabovenko <v.shtabovenko@tum.de> said:

  VSh> 1) By repeatedly applying the anticommutation relations to move
  VSh> g_mu through all other matrices to g^mu. This is the easiest
  VSh> but also the slowest way to implement.
This is what GiNaC is using, as far as I understand.

  VSh> (people mostly care only about traces), it might be reasonable
  VSh> to look at the performance of dirac_trace instead.

IMHO, this routine still uses clifford::eval_ncmul(), which I think is the principal bottleneck.

  VSh> takes around 5 seconds on my machine, a similar code with FORM
  VSh> requires less than 0.01 seconds. With FeynCalc

Probably GiNaC will never do a particular task as well as specialised software. GiNaC allows one to mix all types of objects in a single expression, so all simplifying algorithms need to be of a very plain, generic nature.

  VSh> So I think that it would be very nice if someone of the GiNaC
  VSh> developers could have a look at this issue.

I am not familiar enough with Dirac matrices to optimise the performance beyond the simple anticommutation. It would be good if you could come up with some optimisation suggestions for clifford::eval_ncmul().

Best wishes,
Vladimir
--
Vladimir V. Kisil     email: kisilv@maths.leeds.ac.uk
                      www:   http://www.maths.leeds.ac.uk/~kisilv/
Book: Geometry of Mobius Transformations
      http://www.worldscientific.com/worldscibooks/10.1142/p835
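[Editorial note: to make the "memoization" idea concrete at the user level, here is a minimal sketch. It is illustrative only, not a change to clifford::eval_ncmul(); the helper cached_simplify_indexed() is hypothetical and only pays off when the same gamma chain occurs repeatedly inside a larger calculation:

---------------------------------------------------------------------
#include <ginac/ginac.h>
using namespace GiNaC;

// Cache simplify_indexed() results keyed on the input expression.
// exmap is GiNaC's std::map<ex, ex, ex_is_less>, so expressions can be
// used directly as map keys.
static exmap simplify_cache;

ex cached_simplify_indexed(const ex & e)
{
	exmap::const_iterator it = simplify_cache.find(e);
	if (it != simplify_cache.end())
		return it->second;           // this chain was reduced before, reuse it
	ex result = simplify_indexed(e);     // the expensive reduction
	simplify_cache[e] = result;
	return result;
}
---------------------------------------------------------------------

A cache of the same kind applied inside the recursion itself, keyed on the shorter subchains, would correspond to the FeynCalc-style memoization; that would have to live inside GiNaC's Clifford code.]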
Dear Vladimir,

for the optimization strategy, I would have a look at what FORM does. The algorithm is described here

<http://www.nikhef.nl/~form/maindir/documentation/reference/online/online.html#gammaalgebra>

and the corresponding C code is available on GitHub:

<https://github.com/vermaseren/form/blob/master/sources/opera.c>

Cheers,
Vladyslav
Dear Vladyslav,

I made some attempts but failed to optimise GiNaC's performance in contracting Dirac gammas. In my opinion the reason is that the competitors are tailored for this particular problem, while the design of GiNaC does not allow it to be as effective.

GiNaC is highly extensible by design. A user may define a new object that contracts with the pre-defined Dirac gammas in whatever manner the user chooses in his code. So the GiNaC core has no choice but to proceed in a very general (and inefficient!) manner: contract or permute one Dirac gamma with something else, expand the new expression, and then recursively apply all the previous steps to every new sub-expression. Thus the number of sub-expressions and recursive calls grows (exponentially?) with the number of Dirac gammas in the initial product.

If you do need fast computations with Dirac gammas in GiNaC, then you have to derive a new class for them directly from GiNaC::basic and implement the algorithms that other software uses. The present gammas are derived from GiNaC::indexed and have to obey all the generic rules implemented in the simplify_indexed() method, which are very costly.

Best wishes,
Vladimir
--
Vladimir V. Kisil     email: kisilv@maths.leeds.ac.uk
                      www:   http://www.maths.leeds.ac.uk/~kisilv/
Book: Geometry of Mobius Transformations
      http://www.worldscientific.com/worldscibooks/10.1142/p835
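[Editorial note: as a rough illustration of what deriving straight from GiNaC::basic could look like, here is a bare-bones skeleton in the spirit of the "Adding classes" chapter of the GiNaC tutorial. The class name fastgamma and its members are hypothetical, the actual reduction algorithm is not shown, and the registration macros may differ slightly between GiNaC versions:

---------------------------------------------------------------------
#include <ginac/ginac.h>
#include <vector>
using namespace GiNaC;

// Hypothetical container for a whole chain of gamma matrices, derived
// directly from basic so that none of the generic indexed /
// simplify_indexed machinery is involved.  A real implementation would
// add the reduction algorithm (e.g. FORM's) operating on 'indices'.
class fastgamma : public basic
{
	GINAC_DECLARE_REGISTERED_CLASS(fastgamma, basic)
public:
	explicit fastgamma(const std::vector<ex> & idx) : indices(idx) {}
private:
	std::vector<ex> indices;   // the Lorentz indices of the chain
};

GINAC_IMPLEMENT_REGISTERED_CLASS(fastgamma, basic)

// default constructor required by the registration macro
fastgamma::fastgamma() {}

// canonical ordering required by the registration macro; here a crude
// comparison by chain length, then element-wise
int fastgamma::compare_same_type(const basic & other) const
{
	const fastgamma & o = static_cast<const fastgamma &>(other);
	if (indices.size() != o.indices.size())
		return indices.size() < o.indices.size() ? -1 : 1;
	for (std::size_t i = 0; i < indices.size(); ++i) {
		int c = indices[i].compare(o.indices[i]);
		if (c != 0)
			return c;
	}
	return 0;
}
---------------------------------------------------------------------
]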
Dear Vladimir,

thanks for your efforts and the clarification. Personally, I'm using GiNaC mostly for cross-checking results obtained with other tools, so performance is not the main issue for me. Still, it was interesting to learn the background of the slowness on large traces.

Cheers,
Vladyslav