Hi!

On Wed, Sep 24, 2003 at 12:07:40AM +0200, Richard B. Kreckel wrote:
> AMD released the Opteron processor family today leaving people with the
> budget to buy new hardware wondering what exactly to purchase next.

Well, for those among us who don't have the budget to always buy the
latest kick-ass machines (with their "SDRAM memory" and "hardware
accelerated 3D" and other crazy stuff), the GiNaC Retro Hardware Testing
Labs are proud to present what you've all been waiting for:

    The ultimate CAS shootout at 2x200 MHz - No rules, no mercy.

Two CPUs enter, one CPU leaves.
(then, after a while, the other CPU leaves, as soon as I manage to get
the heat sink off the f*cking thing...)

The contestants:

System 1 - ppc:
  Umax Pulsar, Dual PowerPC 604e ("Extreme"?) at 200 MHz
  L1 cache: 32KB I, 32KB D per CPU
  Apple Tsunami board (also used in PowerMac 9500)
  L2 cache: 512KB for both CPUs, at 50 MHz
  50 MHz system bus
  144MB EDO RAM, 60ns
  Yellow Dog Linux 2.3 (based on Red Hat 7.2)
  Kernel 2.4.19-4asmp
  GCC 2.95.4

System 2 - x86:
  Dual Pentium Pro 512K at 200 MHz
  L1 cache: 8KB I, 8KB D per CPU
  L2 cache: 512KB per CPU, at 200 MHz
  Intel Providence (PR440FX) board
  66 MHz system bus
  256MB registered EDO RAM, 60ns
  Red Hat Linux 7.3
  Kernel 2.4.20-20.7smp
  GCC 2.96

Both machines were equipped with Matrox Millennium graphics cards and
SCSI hard disks (ppc: 4GB IBM Fast Narrow; x86: 2GB Conner Fast Wide).

The Umax Pulsar features a fan that appears to be optimized for maximum
noise output. Jet pilots should feel right at home with this computer.
The Intel machine, on the other hand, sports a hard disk that I could
still hear while standing under the shower. Ear protection should be
worn at all times when running both systems in the same room.

But on to the benchmarks...

The tests consisted of compiling GiNaC 1.0.15 (GiNaC >=1.1 would have
required GCC 3), and running its standard benchmark suite. The compiler
options used were

  ppc: -g -O2 -mcpu=604e
  x86: -g -O2 -march=pentiumpro

and GiNaC was configured with the --disable-static option (the shared
library will be the one used most by applications, anyway).

For the compilation test, only the time required for compiling the
library and tools (ginsh/viewgar) was measured, not the time for
compiling the benchmark suite. The library was built with "make -j 2"
("make -j 3" was slower by about 30s on both machines).

                                                            ppc       x86
-------------------------------------------------------------------------
compile GiNaC 1.0.15                                    25m 34s   16m 42s

The Pentium Pro really shines here, which may be due to its faster and
larger (combined) L2 cache. But this comparison isn't quite fair really,
as the compilers are of course using different backends on both systems
and producing different output.

So, without further ado, on to the real tests:

                                                            ppc       x86
-------------------------------------------------------------------------
commutative expansion and substitution, size 100          1.43s     1.62s
commutative expansion and substitution, size 200          7.32s     7.14s
ratio                                                    [5.12]    [4.41]

Laurent series expansion of Gamma function, order 20      9.91s    7.429s
Laurent series expansion of Gamma function, order 25     38.74s   28.339s
ratio                                                    [3.91]    [3.81]

determinant of symbolic 10x10 Vandermonde matrix          6.55s     6.86s
determinant of symbolic 12x12 Vandermonde matrix         56.57s    63.28s
ratio                                                    [8.64]    [9.22]

determinant of symbolic 8x8 Toeplitz matrix               4.82s     5.65s
determinant of symbolic 9x9 Toeplitz matrix              18.98s    21.12s
ratio                                                    [3.94]    [3.74]

Lewis-Wester test A (divide factorials)                   0.38s     0.56s
Lewis-Wester test B (sum of rational numbers)             0.04s    0.059s
Lewis-Wester test C (gcd of big integers)                  0.4s    0.619s
Lewis-Wester test D (normalized sum of rational fcns)      1.5s    1.689s
Lewis-Wester test E (normalized sum of rational fcns)     1.28s    1.489s
Lewis-Wester test F (gcd of 2-var polys)                  0.17s     0.19s
Lewis-Wester test G (gcd of 3-var polys)                  3.91s    4.459s
Lewis-Wester test H (det of 80x80 Hilbert)               23.12s    27.66s

Lewis-Wester test I (invert rank 40 Hilbert)              7.37s      8.6s
Lewis-Wester test K (invert rank 70 Hilbert)             47.17s    54.45s
ratio                                                    [6.40]    [6.33]

Lewis-Wester test J (check rank 40 Hilbert)               3.95s     5.05s
Lewis-Wester test L (check rank 70 Hilbert)              22.25s    28.36s
ratio                                                    [5.63]    [5.62]

Lewis-Wester test M1 (26x26 sparse, det)                  0.88s    1.189s
Lewis-Wester test O1 (three 15x15 dets) (average)      109.783s   90.246s
Lewis-Wester test P (det of sparse rank 101)              2.86s     4.19s
Lewis-Wester test P' (det of less sparse rank 101)       14.66s    17.51s
computation of antipodes in Yukawa theory (total)       192.64s   172.27s
timing Fateman's polynomial expand benchmark            362.21s  293.579s

Now, this comes as a bit of a surprise. After reading the MuPAD
benchmarks published at

  http://www.heise.de/ct/english/96/11/270/

running on machines very similar to mine, I really expected the Pentium
Pro to wipe the floor with the PowerPC here, but it's actually the other
way round. The 604e wins almost all categories, with some notable
exceptions: the Gamma series expansion, O1, the Yukawa thing, and the
expand benchmark.

On the other hand, judging from the "ratio" lines above, the performance
of the Pentium Pro appears to scale better with larger data sets (again
with one exception: the Vandermonde determinants). This, no doubt, is
due to the faster cache and generally better memory interface of the
Intel machine.

But still, my next personal machine won't be a Pentium Pro, and it won't
be a "G2" PowerMac, either. The VCS 2600 is going cheap on eBay,
though...

Bye,
Christian

-- 
  / Physics is an algorithm
\/ http://www.uni-mainz.de/~bauec002/
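
The "ratio" lines in the tables above can be re-derived from the
timings: each one is the time for the larger problem divided by the time
for the smaller one, so a smaller ratio means the machine slows down
less as the problem grows. A minimal Python sketch of that arithmetic
follows, assuming the timings are copied by hand from the tables; the
names PAIRS, ratio and scaling_table are purely illustrative and not
part of GiNaC or its benchmark suite.

```python
# Timings in seconds, copied from the benchmark tables above.
# Each entry: ((ppc small, ppc large), (x86 small, x86 large)).
# The dictionary layout and all names here are illustrative assumptions.
PAIRS = {
    "commutative expansion, size 100 -> 200": ((1.43, 7.32), (1.62, 7.14)),
    "Gamma series, order 20 -> 25": ((9.91, 38.74), (7.429, 28.339)),
    "Vandermonde det, 10x10 -> 12x12": ((6.55, 56.57), (6.86, 63.28)),
    "Toeplitz det, 8x8 -> 9x9": ((4.82, 18.98), (5.65, 21.12)),
    "Hilbert inverse, rank 40 -> 70": ((7.37, 47.17), (8.6, 54.45)),
    "Hilbert check, rank 40 -> 70": ((3.95, 22.25), (5.05, 28.36)),
}


def ratio(small, large):
    """Slowdown factor when going from the small to the large problem."""
    return round(large / small, 2)


def scaling_table(pairs):
    """Map each benchmark pair to its (ppc ratio, x86 ratio)."""
    return {name: (ratio(*ppc), ratio(*x86))
            for name, (ppc, x86) in pairs.items()}


if __name__ == "__main__":
    for name, (r_ppc, r_x86) in scaling_table(PAIRS).items():
        # The smaller ratio belongs to the machine that scales better.
        better = "x86" if r_x86 < r_ppc else "ppc"
        print(f"{name:40s} [{r_ppc:.2f}] [{r_x86:.2f}]  scales better: {better}")
```

Run as a script, this prints one line per benchmark pair. With the
numbers above, the Vandermonde determinants are the only pair where the
ppc ratio comes out smaller than the x86 one.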