Schnell_pi has been designed to be as fast as possible. This is why it uses an AGM (arithmetic-geometric mean) algorithm. The number of floating-point operations needed for the complete calculation of N digits of Pi scales as N*(log2(N)^2), where log2(N) is the base 2 logarithm of N (in the present implementation, N has to be a power of 2). With the present implementation of Schnell_pi (version 1.0), the number is close to 12*N*(log2(N)^2), i.e. around 5 G (1 G = 2^30 = 1.073 billion) floating-point operations for N = 2^20 (about 1 million) digits.
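As a quick check of this estimate, the formula can be evaluated directly. The sketch below is only an illustration: the helper name `estimated_flops` is hypothetical and not part of Schnell_pi, and the prefactor 12 is simply the value quoted above for version 1.0.

```python
G = 2 ** 30  # 1 G as defined in the text (about 1.073 billion)


def estimated_flops(n_digits, prefactor=12):
    """Evaluate prefactor * N * log2(N)^2 for N a power of 2 (hypothetical helper)."""
    if n_digits < 2 or n_digits & (n_digits - 1):
        raise ValueError("N must be a power of 2")
    log2_n = n_digits.bit_length() - 1  # exact base-2 logarithm of a power of 2
    return prefactor * n_digits * log2_n ** 2


if __name__ == "__main__":
    for exponent in (16, 20, 24):
        n = 2 ** exponent
        print(f"N = 2^{exponent}: {estimated_flops(n) / G:.2f} G operations")
```

For N = 2^20 the estimate is 12 * 2^20 * 20^2 operations, i.e. about 4.7 G, which is consistent with the "around 5 G" figure.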
The fastest competitors of Schnell_pi are programs using Ramanujan-type formulae (among which the Chudnovsky formula is apparently the fastest) computed using the Binary Splitting method. Although the number of operations for these programs scales as N*(log2(N)^3), i.e. with an additional power of log2(N), the prefactor is so small that they involve fewer floating-point operations for computations of up to several billion digits. However, fewer operations do not necessarily imply a faster program: the details of the implementation come into play. Indeed, a careful implementation of the AGM algorithm with FFT-based multiplication requires less data movement between the processor and the memory than a Chudnovsky/Binary Splitting algorithm. As such data movement is the main bottleneck when computing a large number of digits, AGM can compete with Chudnovsky/Binary Splitting, even if the number of floating-point operations is roughly doubled for 1 million digits. The two fastest programs (for PC) that I know of are PiFast and QuickPi.
In all tests but one, Schnell_pi is the fastest program, typically by 10 to 20%. The only exception is for 131,072 digits on a Pentium III 800 MHz, where Schnell_pi needs 1.29 seconds and QuickPi only 1.18 seconds. However, Schnell_pi is limited to numbers of digits that are powers of 2, which is not the case for PiFast and QuickPi. For intermediate numbers of digits (for example 1.5 million), the latter programs are faster.
Here are the results (k = 2^10 digits and M = 2^20 digits, so that 128 k = 131,072 digits):
Pentium II 266 MHz with 128 MB of SDRAM 66 MHz
Number of digits | Time for Schnell_pi | Time for PiFast | Time for QuickPi |
64 k | 1.45 seconds | 2.37 seconds | 1.85 seconds |
128 k | 3.54 seconds | 4.78 seconds | 3.79 seconds |
256 k | 8.26 seconds | 11.21 seconds | 8.46 seconds |
512 k | 18.67 seconds | 24.94 seconds | 19.29 seconds |
1 M | 42.02 seconds | 56.63 seconds | 47.63 seconds |
2 M | 94.41 seconds | 128.63 seconds | 117.47 seconds |
4 M | 213.96 seconds | 300.99 seconds | 281.41 seconds |
8 M | 493.3 seconds | 679.8 seconds | 719.7 seconds |
Pentium III 500 MHz with 256 MB of SDRAM 100 MHz
Number of digits | Time for Schnell_pi | Time for PiFast | Time for QuickPi |
64 k | 0.83 seconds | 1.54 seconds | 1.07 seconds |
128 k | 1.97 seconds | 2.86 seconds | 2.13 seconds |
256 k | 4.55 seconds | 6.54 seconds | 4.78 seconds |
512 k | 10.43 seconds | 14.34 seconds | 11.03 seconds |
1 M | 23.41 seconds | 32.46 seconds | 27.39 seconds |
2 M | 53.10 seconds | 73.44 seconds | 68.01 seconds |
4 M | 117.16 seconds | 172.13 seconds | 162.38 seconds |
8 M | 270.83 seconds | 390.57 seconds | 385.96 seconds |
16 M | 609.18 seconds | 879.20 seconds | 886.05 seconds |
Pentium III 800 MHz with 1 GB of SDRAM 133 MHz
Number of digits | Time for Schnell_pi | Time for PiFast | Time for QuickPi |
64 k | 0.54 seconds | 0.88 seconds | 0.61 seconds |
128 k | 1.29 seconds | 1.93 seconds | 1.18 seconds |
256 k | 2.96 seconds | 4.17 seconds | 3.07 seconds |
512 k | 6.77 seconds | 9.06 seconds | 7.10 seconds |
1 M | 15.59 seconds | 20.27 seconds | 18.35 seconds |
2 M | 37.26 seconds | 45.09 seconds | 46.30 seconds |
4 M | 83.55 seconds | 107.00 seconds | 112.75 seconds |
8 M | 195.17 seconds | 234.09 seconds | 269.35 seconds |
16 M | 439.07 seconds | 541.79 seconds | Wrong result |
32 M | 1014.96 seconds | 1225.83 seconds | |
64 M | 2275.29 seconds | 2708.26 seconds | |
PiFast should be able to compute 128 M digits with 1 GB of memory, but crashes for some unknown reason.
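The "typically by 10 to 20%" margin can be checked against the tables. The sketch below does this for the Pentium III 800 MHz table only. It assumes one particular definition of the margin (the extra time needed by the best competitor, relative to the Schnell_pi time); the helper name `margin_percent` is hypothetical, and `None` marks entries with no valid timing in the table.

```python
# Times in seconds copied from the Pentium III 800 MHz table above.
# Each entry: digits label -> (Schnell_pi, PiFast, QuickPi); None = no valid timing.
TIMES_P3_800 = {
    "64 k": (0.54, 0.88, 0.61),
    "128 k": (1.29, 1.93, 1.18),
    "256 k": (2.96, 4.17, 3.07),
    "512 k": (6.77, 9.06, 7.10),
    "1 M": (15.59, 20.27, 18.35),
    "2 M": (37.26, 45.09, 46.30),
    "4 M": (83.55, 107.00, 112.75),
    "8 M": (195.17, 234.09, 269.35),
    "16 M": (439.07, 541.79, None),    # QuickPi: wrong result
    "32 M": (1014.96, 1225.83, None),  # QuickPi: no time listed
    "64 M": (2275.29, 2708.26, None),  # QuickPi: no time listed
}


def margin_percent(schnell, *competitors):
    """Percentage by which the best competitor is slower than Schnell_pi.

    Negative when a competitor is faster. Assumed definition, hypothetical helper.
    """
    valid = [t for t in competitors if t is not None]
    best = min(valid)
    return 100.0 * (best - schnell) / schnell


if __name__ == "__main__":
    for label, (schnell, pifast, quickpi) in TIMES_P3_800.items():
        print(f"{label:>6}: {margin_percent(schnell, pifast, quickpi):+.1f} %")
```

With this definition, the 128 k row comes out negative, which corresponds to the single exception mentioned earlier where QuickPi is faster.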
I was also able to run tests, with Schnell_pi only, on a bigger machine:
Pentium III 933 MHz with 4 GB of SDRAM 133 MHz
Number of digits | Time for Schnell_pi |
128 M | 4749.57 seconds |
256 M | 10549.46 seconds |
and on an Athlon machine:
Athlon 1.2 GHz with 1 GB of DDR SDRAM PC2100
Number of digits | Time for Schnell_pi |
64 k | 0.28 seconds |
1 M | 8.80 seconds |
8 M | 110.6 seconds |
64 M | 1190 seconds |
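The N*(log2(N)^2) scaling of the operation count can be compared with the measured Schnell_pi times in the last two tables. The sketch below is a rough illustration, not a performance model: it assumes that k = 2^10 and M = 2^20 digits, and the names `predicted_ratio` and `compare` are hypothetical helpers.

```python
import math


def predicted_ratio(n_small, n_large):
    """Ratio T(n_large) / T(n_small) if time scaled exactly as N * log2(N)^2."""
    def cost(n):
        return n * math.log2(n) ** 2
    return cost(n_large) / cost(n_small)


K, M = 2 ** 10, 2 ** 20  # assumed meaning of "k" and "M" in the tables

# (digits, seconds) pairs copied from the Schnell_pi tables above
PENTIUM_III_933 = [(128 * M, 4749.57), (256 * M, 10549.46)]
ATHLON_1200 = [(64 * K, 0.28), (1 * M, 8.80), (8 * M, 110.6), (64 * M, 1190.0)]


def compare(timings):
    """Yield (n_small, n_large, measured ratio, predicted ratio) for consecutive rows."""
    for (n1, t1), (n2, t2) in zip(timings, timings[1:]):
        yield n1, n2, t2 / t1, predicted_ratio(n1, n2)


if __name__ == "__main__":
    machines = (("Pentium III 933", PENTIUM_III_933), ("Athlon 1.2 GHz", ATHLON_1200))
    for name, timings in machines:
        for n1, n2, measured, predicted in compare(timings):
            print(f"{name}: {n1} -> {n2} digits, "
                  f"measured x{measured:.2f}, predicted x{predicted:.2f}")
```

Only rough agreement should be expected, since the measured times also include memory traffic, which the operation count does not capture.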
I am interested in any other timing results.