Faster unarchiving of large equation systems
Dear All,

I’m using GiNaC for the automatic derivation of the Jacobian of very large non-linear equation systems for use with the Sundials suite of solvers. The need for parallelisation arose, and I used MPI for the job, since GiNaC is not well suited for multithreading due to its reference-counting scheme. So I set up appropriate MPI broadcasting code for GiNaC::ex objects using the available archive classes in GiNaC and Boost-based memory-stream I/O. All went well except for the unarchiving performance...

The problem lies in symbol::read_archive, where for each unarchived symbol the list of ALL symbols is searched linearly for a symbol of the same name:

    void symbol::read_archive(const archive_node &n, lst &sym_lst)
    {
        ...
        // If symbol is in sym_lst, return the existing symbol
        for (auto & s : sym_lst) {
            if (is_a<symbol>(s) && (ex_to<symbol>(s).name == tmp_name)) {
                *this = ex_to<symbol>(s);
        ...

For cases with 500K symbols this becomes unbearably slow (and the unarchiving has to happen on each MPI node).

My solution in my little local GiNaC branch was to introduce a second read_archive interface called read_archive_MPI which, instead of a GiNaC::lst &sym_lst, takes a std::map<std::string, GiNaC::ex> that allows a previously stored symbol of the same name to be found quickly:

    void symbol::read_archive_MPI(const archive_node &n, SymbolMap &sym_lst)
    {
        inherited::read_archive_MPI(n, sym_lst);
        serial = next_serial++;
        std::string tmp_name;
        n.find_string("name", tmp_name);

        // If symbol is in sym_lst, return the existing symbol
        SymbolMap::iterator it = sym_lst.find(tmp_name);
        if (it != sym_lst.end()) {
            *this = ex_to<symbol>(it->second);
        ...

This approach is ABI-compatible with older code but requires read_archive_MPI members in all GiNaC::basic-derived objects. I’d like to stress that this does not introduce any MPI dependencies into GiNaC; the MPI serialisation code is separate and merely utilises the new archiving system. Nonetheless, I figured I might share this with you, since I deem it a reasonable improvement for large equation systems and it makes GiNaC available for MPI computing.

Let me know what you think,
Klaus
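For concreteness, broadcasting an ex via GiNaC's archive classes over MPI can look roughly like the following sketch. This is not Klaus's actual code: the function name broadcast_ex is hypothetical, and std::stringstream stands in for the Boost memory streams he mentions.

    // Hypothetical sketch: broadcast a GiNaC::ex from rank `root` to all ranks.
    // Serialisation uses GiNaC's archive class and iostreams; std::stringstream
    // stands in here for the Boost memory streams mentioned above.
    #include <ginac/ginac.h>
    #include <mpi.h>
    #include <sstream>
    #include <string>

    GiNaC::ex broadcast_ex(const GiNaC::ex &e, const GiNaC::lst &syms,
                           int root, MPI_Comm comm)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);

        std::string buf;
        if (rank == root) {
            GiNaC::archive ar;
            ar.archive_ex(e, "e");   // store the expression under a name
            std::ostringstream os;
            os << ar;                // serialise the archive into a byte buffer
            buf = os.str();
        }

        // Broadcast the buffer size first, then the payload itself.
        unsigned long len = buf.size();
        MPI_Bcast(&len, 1, MPI_UNSIGNED_LONG, root, comm);
        buf.resize(len);
        MPI_Bcast(&buf[0], static_cast<int>(len), MPI_CHAR, root, comm);

        // Every rank reconstructs the expression; this is where
        // symbol::read_archive's linear search hurts for many symbols.
        GiNaC::archive ar;
        std::istringstream is(buf);
        is >> ar;
        return ar.unarchive_ex(syms, "e");
    }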
Hi Klaus,

On 06.12.18 07:29, Kemlath Orcslayer via GiNaC-devel wrote:
> This approach is ABI-compatible with older code but requires read_archive_MPI members in all GiNaC::basic-derived objects. I’d like to stress that this does not introduce any MPI dependencies into GiNaC; the MPI serialisation code is separate and merely utilises the new archiving system.
Thanks for sharing your experience with us! If unarchiving is relevant for such large numbers of symbols, then the current design with a sym_lst is quite bad. You are the first to report this here, though.

Note that adding a virtual member function does generally break the ABI of derived classes! They have to be recompiled. Why not fix this by adding a second signature to read_archive, i.e. by just dropping your _MPI suffix?

We also have to consider that user-written subclasses might have implemented their own read_archive function. Maybe we could give read_archive_MPI (or the overload without _MPI) a default implementation in the base class that interfaces to the existing read_archive function, so we could keep API compatibility for some time?

Anyhow, it sounds interesting. Can you post your code, please?

Cheers
  -richy.

--
Richard B. Kreckel
<https://in.terlu.de/~kreckel/>
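To make the overload idea concrete, here is a small self-contained mock (plain C++, not GiNaC's actual classes; all type names below are stand-ins) showing how a second read_archive signature keyed by symbol name can coexist with the legacy list-based one, so old call sites keep compiling unchanged:

    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // Stand-ins for GiNaC types, for illustration only.
    struct archive_node { std::string stored_name; };
    using sym_list = std::vector<std::string>;            // plays the role of GiNaC::lst
    using sym_map  = std::map<std::string, std::string>;  // name -> symbol

    // Legacy signature: linear scan over all known symbols, O(n) per lookup.
    void read_archive(const archive_node &n, sym_list &syms) {
        for (const auto &s : syms)
            if (s == n.stored_name)
                return;                  // reuse the existing symbol
        syms.push_back(n.stored_name);   // otherwise register a new one
    }

    // Proposed second signature: keyed lookup, O(log n) per symbol.
    void read_archive(const archive_node &n, sym_map &syms) {
        auto it = syms.find(n.stored_name);
        if (it != syms.end())
            return;                                  // reuse the existing symbol
        syms.emplace(n.stored_name, n.stored_name);  // otherwise register a new one
    }

    int main() {
        archive_node n{"x"};
        sym_list l;  read_archive(n, l);   // old call sites: unchanged
        sym_map  m;  read_archive(n, m);   // new call sites: fast lookup
        std::cout << l.size() << " " << m.size() << "\n";  // prints "1 1"
    }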