As our Institute has licenses for Gaussian™ 09, Revision D.01 on both Windows™ and Linux®, it was interesting to compare performance on a single machine with both operating systems installed. It is a standard desktop PC: an Intel® i3 2-core CPU (4 threads), 4 GB of RAM, and a Samsung ST500 hard drive. Computational software is usually benchmarked on a single CPU core, but we ran both single-core and four-thread variants. Several calculation types were tested on some of our molecules.
At first glance the results look badly inconsistent: in the single-core runs Gaussian 09 performs significantly better on Linux than on Windows™, while in the four-thread runs the situation appears to be the opposite. However, if we divide the Linux "Job cpu time" values by the number of CPU threads, the picture becomes more logical: at least the numbers become comparable.
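The normalization above is plain arithmetic; a minimal Python sketch (the helper name is mine, the figures are the Linux four-thread "Job cpu time" for the Freq job from the table below):

```python
def hms_to_seconds(h, m, s):
    """Convert an hours/minutes/seconds triple to seconds."""
    return h * 3600 + m * 60 + s

THREADS = 4  # the i3 exposes 4 hardware threads

# Linux "Job cpu time" for the Freq job, 4-thread run: 11 h 29 min 34 s
job_cpu = hms_to_seconds(11, 29, 34)   # 41,374 s, summed over all threads
per_core = job_cpu / THREADS           # 10,343.5 s "normed to a single core"
```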
Windows 7
| Job type | Single CPU core | 4 CPU cores | 4 CPU cores, normed to a single core |
|---|---|---|---|
| Opt | 2 h 30 min 40 s | 1 h 21 min 55 s | 0 h 20 min 28.75 s |
| Freq | 6 h 56 min 41 s | 3 h 38 min 8 s | 0 h 54 min 32 s |
| Stable=Opt | 1 h 23 min 33 s | 0 h 33 min 0 s | 0 h 8 min 15 s |
| Polar | 1 h 28 min 37 s | 0 h 43 min 47 s | 0 h 10 min 56.75 s |
| Polar + SCRF | 1 h 42 min 0 s | 0 h 50 min 3 s | 0 h 12 min 30.75 s |
| Total | 14 h 1 min 31 s | 7 h 6 min 53 s | 1 h 46 min 43.25 s |
| Total, seconds | 50,491 s | 25,613 s | 6,403.25 s |
Debian GNU/Linux 8.1
| Job type | Single CPU core | 4 CPU cores | 4 CPU cores, normed to a single core |
|---|---|---|---|
| Opt | 1 h 30 min 37.7 s | 3 h 33 min 27.3 s | 0 h 53 min 21.825 s |
| Freq | 4 h 18 min 13.8 s | 11 h 29 min 34 s | 2 h 52 min 23.5 s |
| Stable=Opt | 0 h 36 min 55.1 s | 1 h 18 min 14.8 s | 0 h 19 min 33.7 s |
| Polar | 1 h 6 min 27.1 s | 2 h 21 min 4.9 s | 0 h 35 min 16.225 s |
| Polar + SCRF | 1 h 13 min 41 s | 2 h 34 min 58.3 s | 0 h 38 min 44.575 s |
| Total | 8 h 45 min 54.7 s | 21 h 17 min 19.3 s | 5 h 19 min 19.825 s |
| Total, seconds | 31,554.7 s | 76,639.3 s | 19,159.83 s |
My personal conclusion is that on Linux, Gaussian™ reports "Job cpu time" as the total CPU time summed over all threads (i.e. as if a single core had been used), whereas on Windows™ it reports the actual elapsed computation time. This was confirmed simply by comparing the creation time and the last-modification time of each output file: on Linux, this span was far shorter than the sum of all "Job cpu time" values would suggest. Therefore, the normed column of the first (Windows) table and the raw 4-core column of the second (Linux) table do not correspond to wall-clock reality.
A practical conclusion is that Gaussian™ runs faster on Linux than on Windows™. A technical conclusion is that, evidently, not that much of the computation can actually be parallelized. Also, parallelization improves the results much more on Windows than on Linux (though the Windows results are still worse overall).
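To put a rough number on "not so much can be parallelized", one can back out the parallel fraction from Amdahl's law using the Windows wall-clock totals (the only numbers here we trust as elapsed time). This is an illustrative estimate of mine, not something Gaussian reports:

```python
# Windows wall-clock totals from the first table, in seconds
t1 = 50491   # all jobs, single core
t4 = 25613   # all jobs, four threads
n = 4        # hardware threads

speedup = t1 / t4                    # ~1.97, far from the ideal 4x
# Amdahl's law: speedup = 1 / ((1 - p) + p / n); solve for p
p = (1 - 1 / speedup) / (1 - 1 / n)  # parallel fraction, ~0.66
```

So, under this crude model, only about two thirds of the total work benefits from extra threads, consistent with the modest observed speedup.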
We also tried the proprietary CPU firmware on Linux (Debian package i3fw). However, the results generally became slightly worse (for almost all jobs). We won't draw conclusions from this, because that "slightly" is a truly marginal difference. Still, if you are a free-software enthusiast, I think these findings will warm your heart :)