Hi, Thanks for your comments, I reply inline below. On 24 May 2014 10:11, Alexei Sheplyakov wrote:
Non-deterministic term order is OK: GiNaC is designed for processing expressions consisting of (many) millions of terms (expressions of this size are very common in high energy physics). The intermediate expressions (which typically are the most sizable) are not going to be printed other than for debugging purposes. Also the output of GiNaC is mostly processed by other programs rather than a human users.
I happen to know what kind of expressions one can get in high energy physics quite well. Determinism has nothing to do with humans reading output, but with reproducibility of results. Reproducibility is an important property of any computational program, even Monte-Carlo simulations are deterministic. Being able to compare two 10G files of expressions by simple diff or md5sum/etc is rather convenient.
This is just plain wrong. The hash seed should be computed from the type information of the *current* object rather than from the type of the object which initiated the very first call to calchash(). The idea is that object having a different type are typically different so they should have different hash values (even if their operands are the same).
You are right that it isn't doing what was intended and any class which doesn't overload calchash will use the same seed, which is a mistake. It is of course arguable whether different classes must use different seed, because the only drawback of a hash collision for different types is one extra typeid call in basic::compare. So ensuring lack of collisions for different types might be not really that important, but it is also easy so why not do it.
In theory one can use a lookup table in order to not recompute the hash seed for the same type twice (GiNaC uses a different approach - make the hash function fast enough so caching gives no advantages).
If that approach gave the same term order every run of a program everyone would be happy.
In general any patch which makes the hash function substantially slower for nothing good at all will be rejected.
I'm attaching a revised version of a patch, it adds virtual functions to every derived class of basic, which effectively cache the class name hashes in local static variables. It is rather small (if noticeable) price to pay for greater convenience for users. I kept old hash in basic::gethashseed() for any class that does not overload it, though it is best not to have such classes. If someone derives from a basic descendant, but doesn't overload gethashseed() they will share the hash_seed which will result in an extra call to typeid in basic::compare (doubtfully noticeable). make_hash_seed is moved to utils.h
If someone needs a predictable output, please write the corresponding print_context.
Yeah, it always puzzled me why GiNaC doesn't have a sorted_print function but suggests users to write their own. Best, Valery