Hi Alexey,

On 21.10.20 10:45, Alexey Sheplyakov wrote:
1. Cross-module/whole-program analysis (which is the proper name for LTO) is a well-established optimization method. And it's exactly the right way for the compiler to know that
   a) cl_{SF,FF,DF,LF}_idecode are called mostly by cl_F_idecode (so it's a good idea to inline those 4 functions),
   b) every cl_I is created through cl_I_from_NDS (so it's a good idea to inline cl_I_from_NDS),
   etc.
2. Most (all?) major C++ compilers (GCC, Clang, MSVC) provide cross-module/whole-program analysis. In fact, it has been available for (almost) two decades. The LTO branch was merged into GCC trunk back in 2009. Link-time code generation (which is the MSVC name for cross-module analysis) has been available since Visual Studio 7.0 (released in 2002).
Therefore I prefer the "Stop playing tricks and use the LTO" solution.

We need evidence, please, and benchmarks! I tried compiling CLN with -flto added to CXXFLAGS, and it turned out my machine has too little memory (it has 32 GiB). It appears to me this isn't quite ready for showtime.

  -richy.
-- 
Richard B. Kreckel
<https://in.terlu.de/~kreckel/>