Gentlemen,

this mail is a little bit related to the mail from Richy last Friday (class function revisited...). If you are planning to revise this issue, I have some suggestions for what could be useful in the future.

Suppose I use symbolic manipulations to arrive at a formula like

    ex f = sin(x) + tgamma(1+x) + pow(x,5) + more complicated stuff

and I would like to do a Monte Carlo integration

    int( f , x=0..1)

with an accuracy of 2 or 3 digits in a reasonable amount of time.

The fastest way would certainly be to print f as C code, edit the file and compile it together with the Monte Carlo integration routine. But suppose I'm too lazy to do this print/edit/compile cycle. I just want to do, in ONE program,

a) calculate the function by symbolic manipulations,
b) evaluate f a few thousand times.

This will certainly never be as fast (in CPU time) as the print/edit/compile method, but if the additional CPU time is of the same order as the time I would need to edit and compile the thing, it would be more convenient.

For point b), double precision would be more than enough. At the moment, some functions are evaluated in CLN with arbitrary precision. This is probably overkill. Worse, some functions (like tgamma) do not have a numerical evaluation at all at the moment. Implementing a numerical evaluation for these functions to arbitrary precision requires a fair amount of work, but for double precision, algorithms are likely to exist already (for example in the GNU Scientific Library).

My suggestions would therefore be:

- that class numeric gets a status flag which signifies "only interested in double precision",
- and that class function gets a pointer "evalf_with_double_precision", which points to a user-supplied evaluation routine.

What do you think?

Best wishes,

Stefan
Hi,

Thanks very much for your suggestion, Stefan.

On Fri, 29 Jun 2001, Stefan Weinzierl wrote:
this mail is a little bit related to the mail from Richy last Friday (class function revisited ...).
Well, I was almost beaten up by C.B. for my suggestion to write Sin(x)...
Suppose I use symbolic manipulations to arrive at a formula like
ex f = sin(x) + tgamma(1+x) + pow(x,5) + more complicated stuff
and I would like to do a Monte Carlo integration
int( f , x=0..1)
and I would like to get an accuracy of 2 or 3 digits in a reasonable amount of time.
The fastest way would certainly be to print f as C-code, edit the file and compile it with the Monte-Carlo integration routine.
Alternatively, emit the file, automatically add the necessary boilerplate, compile it and link it back in using dlopen(3). On systems that support dlopen, such as Linux, the whole procedure can be entirely automated with a little effort, as far as I can see.
But suppose I'm too lazy to do this print/edit/compile cycle. I just want to do, in ONE program, a) calculate the function by symbolic manipulations, b) evaluate f a few thousand times.
This will certainly never be as fast (in CPU time) as the print/edit/compile method, but if the additional CPU time is of the same order as the time I would need to edit and compile the thing, it would be more convenient.
For point b), double precision would be more than enough. At the moment, some functions are evaluated in CLN with arbitrary precision. This is probably overkill.
Worse, some functions (like tgamma) do not have a numerical evaluation at all at the moment.
Implementing a numerical evaluation for these functions to arbitrary precision requires a fair amount of work, but for double precision, algorithms are likely to exist already (for example in the GNU Scientific Library).
My suggestions would therefore be:
- that class numeric gets a status flag which signifies "only interested in double precision",
- and that class function gets a pointer "evalf_with_double_precision", which points to a user-supplied evaluation routine.
I am not exactly sure how this is supposed to eventually work out. Remember that a datatype derived from basic already has a sizeof of at least 5 words (including the vptr), i.e. 20 bytes or so. If I multiply some of these, I have several function calls and lots of code to roll through my CPU. Contrast this with an operation in double precision, where each operand occupies two words and the operation is done without calls, at the speed of hardware.

Further, I have only vague ideas about where this should be stored. CLN has several data types for fixed-size floats -- short floats (cl_SF), single floats (cl_FF), double floats (cl_DF) -- as well as one data type for arbitrary precision, cl_LF, which is the one we are using. cl_SF (and on 64-bit architectures also cl_FF) are immediate, i.e. not heap-allocated. CLN does this by using a union representation that either holds a pointer to the heap or an immediate object, distinguishing between the two by cleverly tagging some bits. Ideally, we would use this flexibility to store doubles in class `numeric'. This could probably be done, even without introducing a status flag, since we could directly use the tests provided by CLN. But since a double is usually 64 bits, CLN unfortunately has no chance to represent machine doubles as immediates. And we use cl_LF in order to be able to support arbitrary precision!

CLN has more than one implementation for each numerical function. In fact, it usually has four of them, one for each numerical class, since it is braindead to trigger some infinite-precision algorithm for sin(x) if only a fixed-size float is requested. So maybe what should be done is to augment these functions by incomplete sets of quadruples?

Given that there is no clear way of getting rid of the overall bloat, is it really worth doing this for a double, since a directly compiled program will always execute 1000 times faster? I don't know...

Regards
  -richy.
-- 
Richard B. Kreckel
<Richard.Kreckel@Uni-Mainz.DE>
<http://wwwthep.physik.uni-mainz.de/~kreckel/>
On Fri, 29 Jun 2001, Richard B. Kreckel wrote:
Suppose I use symbolic manipulations to arrive at a formula like
ex f = sin(x) + tgamma(1+x) + pow(x,5) + more complicated stuff
and I would like to do a Monte Carlo integration
int( f , x=0..1)
and I would like to get an accuracy of 2 or 3 digits in a reasonable amount of time.
I tried something similar in Maple about 3 years ago (for insiders: Dirk's integral representation for 2-loop 2-point functions): a 2-fold Monte Carlo integration of a not-too-complex function (logs of rational arguments times a rational function). I rewrote Vegas in Maple and used its evalhf() function, which evaluates expressions with the floating-point hardware in double precision (because -- again for insiders -- xloops automatically emitted the function as a C program, compiled it, linked it with Vegas, spawned an external process and read the results from a file). I thought it would be faster to stay inside Maple, at least for some "preview mode". However, Maple turned out to be unusably slow, IIRC by a factor of 500.

I expect similar results for a GiNaC in double-precision mode.
Alternatively, emit the file, automatically add the necessary boilerplate, compile it and link it back in using dlopen(3). On systems that support dlopen, such as Linux, the whole procedure can be entirely automated with a little effort, as far as I can see.
I think it is worth writing a prototype for this and including it in the distribution (or at least in the documentation) if it is generic enough. cint can use a similar trick (#pragma compile).

Alex
-- 
Alexander Frink                  E-Mail: Alexander.Frink@Uni-Mainz.DE
Institut fuer Physik             Phone:  +49-6131-3923391
Johannes-Gutenberg-Universitaet
D-55099 Mainz, Germany
On Fri, 29 Jun 2001, Alexander Frink wrote:
I expect similar results for a GiNaC in double-precision mode.
Alternatively, emit the file, automatically add the necessary boilerplate, compile it and link it back in using dlopen(3). On systems that support dlopen, such as Linux, the whole procedure can be entirely automated with a little effort, as far as I can see.
I think it is worth writing a prototype for this and including it in the distribution (or at least in the documentation) if it is generic enough. cint can use a similar trick (#pragma compile).
Hi,

thinking about it, I believe that an automated approach based on dlopen is indeed the better alternative.

Stefan
On Mon, 2 Jul 2001, Stefan Weinzierl wrote:
On Fri, 29 Jun 2001, Alexander Frink wrote:
I expect similar results for a GiNaC in double-precision mode.
Well, it should be somewhat faster. Those for-loops in interpreted CAS are so incredibly sloooowww... Just assembling the matrices for the Lewis-Wester checks takes quite a while in Maple and MuPAD, whereas it is a no-timer in GiNaC.
Alternatively, emit the file, automatically add the necessary boilerplate, compile it and link it back in using dlopen(3). On systems that support dlopen, such as Linux, the whole procedure can be entirely automated with a little effort, as far as I can see.
I think it is worth writing a prototype for this and including it in the distribution (or at least in the documentation) if it is generic enough. cint can use a similar trick (#pragma compile).
Hi,
thinking about it, I believe that an automated approach based on dlopen is indeed the better alternative.
Maybe someone should research this libltdl thing which comes with the libtool distribution. From the README:

: This is GNU libltdl, a system independent dlopen wrapper for GNU libtool.
:
: It supports the following dlopen interfaces:
: * dlopen (Solaris, Linux and various BSD flavors)
: * shl_load (HP-UX)
: * LoadLibrary (Win16 and Win32)
: * load_add_on (BeOS)
: * GNU DLD (emulates dynamic linking for static libraries)
: * libtool's dlpreopen

Regards
  -richy.
-- 
Richard Kreckel
<Richard.Kreckel@Uni-Mainz.DE>
<http://wwwthep.physik.uni-mainz.de/~kreckel/>
participants (3)

- Alexander Frink
- Richard B. Kreckel
- Stefan Weinzierl