Since our Institute holds licenses for Gaussian™ 09, Revision D.01 on both Windows™ and Linux®, it was interesting to compare its performance on a single machine with both operating systems installed. The machine has an Intel® i3 2-core CPU (4 threads), 4 GB of RAM and a Samsung ST500 hard drive: a standard desktop PC, in other words. People usually test computational software on a single CPU core, but we ran both the single-core and the multi-core variants. Several calculation types were tested on some of our molecules.
At first glance, the results look badly inconsistent: for single-core computations Gaussian 09 performs significantly better on Linux than on Windows™, but for the multi-core test the situation seems to be the opposite. However, if we divide the "Job cpu time" values by the number of CPU threads for the Linux values, the picture looks more logical. At least the numbers become comparable.
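The normalization mentioned above is plain arithmetic, and a minimal sketch of it may make the table columns easier to follow. The helper names `to_seconds` and `to_hms` are hypothetical (they are not part of Gaussian), and the example value is the 4-core "Opt" time reported on Linux, 3 h 33 min 27.3 s.

```python
# Minimal sketch: normalize a reported "Job cpu time" to a single core.
# Helper names are illustrative only; they are not part of Gaussian.

N_THREADS = 4  # the test machine exposes 4 hardware threads


def to_seconds(hours, minutes, seconds):
    """Convert an (hours, minutes, seconds) triple to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert seconds back to an (hours, minutes, seconds) triple."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return int(hours), int(minutes), seconds


# 4-core "Opt" job on Linux, as reported: 3 h 33 min 27.3 s
reported = to_seconds(3, 33, 27.3)

# Divide by the number of threads to get the per-core figure
normalized = reported / N_THREADS

print(reported, normalized, to_hms(normalized))
```

The same division by 4 produces every value in the "normalized" columns of the tables.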
Windows 7

| Job type | Calculation with single CPU core (h:mm:ss) | Calculation with 4 CPU cores (h:mm:ss) | Calculation with 4 CPU cores, normalized to a single core (h:mm:ss) |
|---|---|---|---|
| Opt | 2:30:40 | 1:21:55 | 0:20:28.75 |
| Freq | 6:56:41 | 3:38:08 | 0:54:32 |
| Stable=Opt | 1:23:33 | 0:33:00 | 0:08:15 |
| Polar | 1:28:37 | 0:43:47 | 0:10:56.75 |
| Polar + SCRF | 1:42:00 | 0:50:03 | 0:12:30.75 |
| Total | 14:01:31 | 7:06:53 | 1:46:43.25 |
| Total (s) | 50,491 | 25,613 | 6,403.25 |
Debian GNU/Linux 8.1

| Job type | Calculation with single CPU core (h:mm:ss) | Calculation with 4 CPU cores (h:mm:ss) | Calculation with 4 CPU cores, normalized to a single core (h:mm:ss) |
|---|---|---|---|
| Opt | 1:30:37.7 | 3:33:27.3 | 0:53:21.825 |
| Freq | 4:18:13.8 | 11:29:34 | 2:52:23.5 |
| Stable=Opt | 0:36:55.1 | 1:18:14.8 | 0:19:33.7 |
| Polar | 1:06:27.1 | 2:21:04.9 | 0:35:16.225 |
| Polar + SCRF | 1:13:41 | 2:34:58.3 | 0:38:44.575 |
| Total | 8:45:54.7 | 21:17:19.3 | 5:19:19.825 |
| Total (s) | 31,554.7 | 76,639.3 | 19,159.83 |
My personal conclusion is that on Unix™, Gaussian™ reports the calculation time as if a single CPU core had been used (the CPU time summed over all threads), whereas on Windows™ it reports the actual elapsed computation time. This was confirmed simply by comparing the creation time and the last-modification time of each file: on Linux this time span was far shorter than the sum of all "Job cpu time" values. Therefore, the third column of the first table and the second column of the second table do not correspond to reality.
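The check described above can be sketched in a few lines: sum the "Job cpu time" values in a log and compare the sum with the wall-clock span of the job. The line format matched by the regular expression (`Job cpu time:  0 days  3 hours 33 minutes 27.3 seconds.`) is an assumption about how the log prints this value, and the function names are hypothetical. The wall-clock span has to be supplied by the reader, for example from the file timestamps.

```python
import re

# Assumed log line format (an assumption, adjust to your output files):
#  Job cpu time:  0 days  3 hours 33 minutes 27.3 seconds.
JOB_CPU_RE = re.compile(
    r"Job cpu time:\s+(\d+)\s+days\s+(\d+)\s+hours\s+(\d+)\s+minutes\s+([\d.]+)\s+seconds"
)


def sum_job_cpu_seconds(log_text):
    """Sum all 'Job cpu time' entries found in the log text, in seconds."""
    total = 0.0
    for days, hours, minutes, seconds in JOB_CPU_RE.findall(log_text):
        total += (int(days) * 86400 + int(hours) * 3600
                  + int(minutes) * 60 + float(seconds))
    return total


def cpu_to_wall_ratio(cpu_seconds, wall_seconds):
    """A ratio near the thread count suggests CPU time summed over threads;
    a ratio near 1 suggests the reported value is elapsed time."""
    return cpu_seconds / wall_seconds


# Two lines built from the 4-core Linux values in the table (Opt and Freq)
sample_log = (
    " Job cpu time:  0 days  3 hours 33 minutes 27.3 seconds.\n"
    " Job cpu time:  0 days 11 hours 29 minutes 34.0 seconds.\n"
)
total_cpu = sum_job_cpu_seconds(sample_log)
print(total_cpu)
```

Passing the measured wall-clock span of a job to `cpu_to_wall_ratio` then shows whether the reported value behaves like summed CPU time or like elapsed time.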
A practical conclusion is that Gaussian™ runs faster on Unix™ than on Windows™. A technical conclusion is that only a limited part of the computation can actually be parallelized. Also, parallelization improves the results much more on Windows than on Linux (although the Windows timings are still worse).
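To put rough numbers on the parallelization remark, here is a minimal sketch that computes the speedup from the table totals and then estimates the parallelizable fraction with Amdahl's law. Amdahl's law is an addition here, not something used in the measurements above, and treating the 4 hardware threads as 4 equal workers is an assumption (the CPU has only 2 physical cores), so the estimated fraction is indicative at best. The Windows total is taken as elapsed time and the Linux total as the normalized value, following the conclusion above.

```python
# Sketch under stated assumptions: 4 threads treated as 4 equal workers.

N_WORKERS = 4

# (single-core total, 4-core total) in seconds, taken from the tables
TOTALS = {
    "Windows 7": (50491.0, 25613.0),
    "Debian GNU/Linux 8.1": (31554.7, 19159.83),  # 4-core value is normalized
}


def speedup(t_single, t_parallel):
    """Ratio of single-core time to parallel time."""
    return t_single / t_parallel


def amdahl_parallel_fraction(s, n_workers):
    """Invert Amdahl's law s = 1 / ((1 - p) + p / n) to solve for p."""
    return (1 - 1 / s) / (1 - 1 / n_workers)


results = {}
for name, (t1, t4) in TOTALS.items():
    s = speedup(t1, t4)
    p = amdahl_parallel_fraction(s, N_WORKERS)
    results[name] = (s, p)
    print(f"{name}: speedup {s:.2f}x, estimated parallel fraction {p:.2f}")
```

With these totals the speedup comes out at about 1.97 on Windows and about 1.65 on Linux, which matches the observation that parallelization helps more on Windows while the Linux run remains faster in absolute terms.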
We also tried specific proprietary CPU firmware on Linux (Debian package i3fw). However, the results generally became slightly worse (for almost all jobs). We won't draw any conclusions from this, because the difference is really marginal. However, if you are a free software freak, I think these findings will warm your heart :)