ISSP-LOM Quantum chemistry: March 2016

Thursday, March 31, 2016

Gaussian performance on Windows and on Linux

As our Institute has licenses for Gaussian™ 09, Revision D.01 on both Windows™ and Linux®, it was interesting to compare performance on a single machine with both operating systems installed. It has an Intel® i3 2-core CPU (4 threads), 4 GB of RAM and a Samsung ST500 hard drive: a standard desktop PC, in other words. People usually test computational software on a single CPU core, but we ran both single-core and multi-core variants. Various calculation types were tested on some of our molecules.

At first glance, the results look strongly inconsistent: for single-core computations Gaussian 09 performs significantly better on Linux than on Windows™, but for the multi-core test the situation seems to be the opposite. But if we divide the "Job cpu time" values by the number of CPU threads for the Linux values, the situation looks more logical... At least the numbers become comparable.
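The division described above can be sketched in a few lines of Python. This is only an illustration: the regular expression assumes the usual "Job cpu time: D days H hours M minutes S seconds." wording of a Gaussian log, and the function names are hypothetical helpers, not part of any Gaussian tooling.

```python
import re

# Assumed wording of the timing line in a Gaussian log file.
JOB_CPU_RE = re.compile(
    r"Job cpu time:\s*(\d+)\s+days\s+(\d+)\s+hours\s+(\d+)\s+minutes\s+([\d.]+)\s+seconds"
)


def job_cpu_seconds(log_text):
    """Sum every 'Job cpu time' entry found in the log text, in seconds."""
    total = 0.0
    for days, hours, minutes, seconds in JOB_CPU_RE.findall(log_text):
        total += int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return total


def per_thread_seconds(cpu_seconds, n_threads):
    """Divide the reported CPU time by the number of threads used."""
    return cpu_seconds / n_threads


# The 4-thread Opt job on Linux from the table, written as a log line.
sample = " Job cpu time:  0 days  3 hours 33 minutes 27.3 seconds.\n"
cpu = job_cpu_seconds(sample)
normalized = per_thread_seconds(cpu, 4)
print(f"{cpu:.1f} s reported, {normalized:.3f} s per thread")
```

A log with several job steps contains several such lines; the helper simply adds them up.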

Windows 7

Job type        Single CPU core    4 CPU cores    4 CPU cores, normalized to a single core
                (h:mm:ss)          (h:mm:ss)      (h:mm:ss)
Opt             2:30:40            1:21:55        0:20:28.75
Freq            6:56:41            3:38:08        0:54:32
Stable=Opt      1:23:33            0:33:00        0:08:15
Polar           1:28:37            0:43:47        0:10:56.75
Polar + SCRF    1:42:00            0:50:03        0:12:30.75
Total           14:01:31           7:06:53        1:46:43.25
Total, s        50,491             25,613         6,403.25

Debian GNU/Linux 8.1

Job type        Single CPU core    4 CPU cores    4 CPU cores, normalized to a single core
                (h:mm:ss)          (h:mm:ss)      (h:mm:ss)
Opt             1:30:37.7          3:33:27.3      0:53:21.825
Freq            4:18:13.8          11:29:34       2:52:23.5
Stable=Opt      0:36:55.1          1:18:14.8      0:19:33.7
Polar           1:06:27.1          2:21:04.9      0:35:16.225
Polar + SCRF    1:13:41            2:34:58.3      0:38:44.575
Total           8:45:54.7          21:17:19.3     5:19:19.825
Total, s        31,554.7           76,639.3       19,159.83

My personal conclusion is that on Unix™, Gaussian™ reports the calculation time as if a single CPU core had been used (that is, summed over all threads), whereas on Windows™ it reports the actual elapsed time. This was confirmed simply by comparing the creation time and the last modification time of each output file: on Linux this time span was far shorter than the sum of all "Job cpu time" values. Therefore, the normalized 4-core values in the Windows table and the raw 4-core values in the Linux table do not correspond to reality.

A practical conclusion is that Gaussian™ runs faster on Unix™ than on Windows™. A technical conclusion is that, evidently, not that much of the computation can actually be parallelized. Also, parallelization improves the timings much more on Windows than on Linux (but they are still worse than on Linux).
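To put rough numbers on these conclusions, here is a small Python sketch that recomputes the speedups from the table totals. The parallel-fraction estimate uses Amdahl's law, which is my addition and not something the measurements establish; it also treats the four hardware threads as four equal cores, which a 2-core hyper-threaded CPU is not, so read the fractions as a crude indication only.

```python
N_THREADS = 4  # threads used in the multi-core runs


def to_seconds(hours, minutes, seconds):
    """Convert an hours/minutes/seconds triple to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def amdahl_parallel_fraction(speedup, n):
    """Invert Amdahl's law S = 1 / ((1 - p) + p / n) for the parallel fraction p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)


# Totals from the tables. The Windows 4-core total is taken as elapsed time;
# the Linux 4-core total is summed CPU time, so it is divided by the thread count.
elapsed = {
    "Windows 7": (to_seconds(14, 1, 31), to_seconds(7, 6, 53)),
    "Debian 8.1": (to_seconds(8, 45, 54.7), to_seconds(21, 17, 19.3) / N_THREADS),
}

results = {}
for system, (single, multi) in elapsed.items():
    s = single / multi
    results[system] = (s, amdahl_parallel_fraction(s, N_THREADS))
    print(f"{system}: speedup {s:.2f}, estimated parallel fraction {results[system][1]:.2f}")
```

Under these assumptions the speedup on four threads is just under 2 on Windows and about 1.65 on Linux, while the parallel Linux run still finishes sooner in absolute terms.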

We also tried a specific proprietary CPU firmware on Linux (Debian package i3fw). However, the results generally became slightly worse (for almost all jobs). We won't draw conclusions from this, because the difference is really marginal. However, if you are a free software freak, I think these findings will warm your heart :)

Sunday, March 27, 2016

OrtVc1 failed #1.

Well, there is not much information about this on the internet. We ran into it when doing polarizability calculations on a supercell cut from a molecular semiconductor crystal.
The only information I have found is available here: http://www.somewhereville.com/?p=2175.

Disabling the fast multipole method (with the NoFMM keyword, which is marked as "obsolete" in the G09 Reference) resolved the issue for us. We nevertheless wrote to Gaussian technical support and received further advice. The key point is that the problem arises when the system has some symmetry and that symmetry is used in the calculation; FMM for the electric-field CPHF can then fail. The fast multipole method is used only for large enough systems, and not only for CPHF, so Gaussian's Dr. Clemente advised me to disable it only for the CPHF calculation, by adding the IOp(10/63=1) keyword to the route section of the Gaussian input file.
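For anyone who generates input files with a script, placing the keyword in the route section might look like the sketch below. The helper name and the example route line (method and basis set) are hypothetical; only Polar and IOp(10/63=1) come from the discussion above.

```python
def add_route_keyword(route_line, keyword):
    """Append a keyword to a Gaussian route line unless it is already present."""
    present = {token.lower() for token in route_line.split()}
    if keyword.lower() in present:
        return route_line
    return f"{route_line.rstrip()} {keyword}"


# Hypothetical route line; method and basis set are placeholders.
route = "# HF/6-31G(d) Polar"
fixed_route = add_route_keyword(route, "IOp(10/63=1)")
print(fixed_route)
```

Calling the helper a second time leaves the line unchanged, so it is safe to run over inputs that were already patched.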

Gaussian technical support can be contacted at help@gaussian.com; they are usually very helpful.

Friday, March 25, 2016

RHF, UHF and Guess=Mix

It is straightforward to deduce that, by default, a UHF (or UKS) singlet calculation produces identical alpha and beta orbitals: the initial orbitals for both spins are the same, as follows from physical considerations, so there is no reason for spin symmetry to break during the optimization. This was nevertheless new to me (thanks to the Gaussian support team for the explanation) when I became suspicious about the overly good agreement between RKS and UKS values, even though I knew that DFT is much less subject to spin contamination than HF.

Then, as can happen when you use an automation tool to generate input files, I unintentionally generated some inputs with RHF/... Guess=Mix. Since the population analysis in the output file contains data only for alpha orbitals, I conclude that the "R" or "U" prefix of the computational method takes precedence over the Guess type when the program determines the type of calculation.

UPDATE: First hyperpolarizability values change slightly for water and ozone calculations. RHF Guess=Mix seems to be closer to UHF than to RHF, although only alpha orbitals are shown in the population analysis. Nevertheless, the difference is marginal. I will check later for some larger molecule.
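If inputs come from an automation tool, a quick check can flag this combination before the jobs are submitted. The sketch below is a hypothetical heuristic, not a parser of Gaussian route syntax: it assumes the method is the token containing "/", treats an "R" prefix (but not "RO") as restricted, and looks for any Guess option mentioning Mix. The basis set in the examples is a placeholder.

```python
def is_restricted_with_guess_mix(route_line):
    """Return True if the route line combines an R-prefixed method with Guess=Mix."""
    tokens = route_line.lower().lstrip("#").split()
    # Heuristic assumption: the method/basis token is the one containing '/'.
    has_restricted = any(
        "/" in t and t.startswith("r") and not t.startswith("ro") for t in tokens
    )
    has_mix = any(t.startswith("guess") and "mix" in t for t in tokens)
    return has_restricted and has_mix


# Hypothetical route lines; the basis set is a placeholder.
routes = [
    "#p RHF/6-31G(d) Guess=Mix",
    "#p UHF/6-31G(d) Guess=Mix",
    "#p RHF/6-31G(d)",
]
flags = [is_restricted_with_guess_mix(r) for r in routes]
for route_line, flagged in zip(routes, flags):
    print(f"{route_line!r}: {'check this input' if flagged else 'ok'}")
```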

Some default values in Gaussian 09

Gaussian, Inc. probably considers these "not interesting", so they are not included in the G09 Reference (blue book). These and many others can be found in the G09 IOps Reference (red book).

Integrals
2-Electron integral accuracy: 10^-10

(i.e., Integral=(Acc2E=10) )

SCF Procedure
SCF convergence criterion: 10^-8, although 10^-7 for PBC
( SCF=(Conver=8) )
The quantity checked for convergence is:

RMS density for L502 (*DIIS)
RMS rotation gradient for L508 (linear and Newton-Raphson quadratic convergence)
SQCDF for L506 (ROHF and GVB)
Energy for L510 (MCSCF).

RMS means "root-mean-square", not "Richard Matthew Stallman".

Polarizability
The numerical differentiation field step (which is quite important when you calculate Gamma) is 0.0003 a.u. by default (about 0.0154 V/Å). For double numerical differentiation it is 0.001 a.u. (about 0.051 V/Å). In BOTH cases, this corresponds to the keyword argument Step=10 (see this post).
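The unit conversion is easy to reproduce. The sketch below assumes the CODATA value of the atomic unit of electric field, about 5.14220674763 × 10^11 V/m, i.e. about 51.422 V/Å; the constant and function names are mine.

```python
# Atomic unit of electric field in V/Å (assumed CODATA value, 5.14220674763e11 V/m).
AU_FIELD_IN_V_PER_ANGSTROM = 51.4220674763


def au_to_v_per_angstrom(field_au):
    """Convert an electric field from atomic units to volts per angstrom."""
    return field_au * AU_FIELD_IN_V_PER_ANGSTROM


# The two default field steps quoted above.
for step_au in (0.0003, 0.001):
    print(f"{step_au} a.u. = {au_to_v_per_angstrom(step_au):.5f} V/Å")
```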

Part of LOM