AMSOL 3.0 Page i

AMSOL: An SCF Program for Free Energies of Solvation

Version 3.0 Manual

by Christopher J. Cramer, Gillian C. Lynch, and Donald G. Truhlar
Department of Chemistry, University of Minnesota
Minneapolis, Minnesota 55455-0431, USA

AMSOL-versions 3.0 and 3.0c by Christopher J. Cramer, Gillian C. Lynch, and Donald G. Truhlar
based on AMPAC, version 2.1 by D. A. Liotard, E. F. Healy, J. M. Ruiz, and M. J. S. Dewar

Date of release, AMSOL versions 3.0 and 3.0c: June 24, 1992
Date of most recent update to this manual: June 24, 1992

ABSTRACT

AMSOL carries out SCF calculations in the NDDO approximation with free energy of solvation terms added to the Fock operator to account for aqueous solvation effects. The solvation effects are included via two terms. The first accounts for electric polarization of the solvent by the generalized Born approximation based on a distributed monopole representation of the solute charges with dielectric screening. The second term is proportional to the solvent-accessible surface area, with a set of proportionality constants (surface tensions) that depend on the local nature of the solute for each atom or group's interface with the solvent. This version of AMSOL contains four different parameter sets, AM1-SM1, AM1-SM1a, AM1-SM2, and PM3-SM3.

AVAILABILITY

AMSOL-version 3.0 is available from Quantum Chemistry Program Exchange, Indiana University, Bloomington, Indiana 47405. The program number is 606.

CONTENTS:

1. INTRODUCTION
2. SUGGESTED REFERENCES
3. COMPUTERS AND OPERATING SYSTEMS ON WHICH THE CODE HAS BEEN DEVELOPED
4. PROGRAM ORGANIZATION, DISTRIBUTION, AND INSTALLATION
5. USAGE
6. OUTPUT
7. NOTES AND COMMENTS
8. TEST RUN INPUT AND OUTPUT

1. INTRODUCTION

AMSOL is an SCF program for calculating free energies of solvation of molecules and ions in aqueous solution.
It is based on NDDO semiempirical molecular orbital theory in which the terms required to calculate the free energy of solvation are included in the solute Hamiltonian. Version 3.0 employs either the AM1 or the PM3 model for the solute electronic Hamiltonian.

The free energy of solvation is based on two terms. The first is a generalized Born contribution that accounts for electric polarization of the continuum-dielectric solvent, i.e., for the electronic, atomic, and orientational polarization of the solvent molecules and for the resulting feedback of this effect on the solute charge distribution. The second term is an accessible-surface-area term that accounts for the free energy of cavity formation, dispersion interactions, and hydrophobic and hydrophilic effects such as solvent structure changing. The surface tension term, being semiempirical, allows for errors in AM1 or PM3 as well.

Four general parameterizations are available for the solvation terms: AM1-SM1 and AM1-SM1a, which were originally released in the version 1 code, and AM1-SM2 and PM3-SM3, which are newer. In every case, the nomenclature indicates the underlying gas-phase Hamiltonian which is employed, followed by the designation of the Solvation Model. The parameters for all of the models have been determined semiempirically.

The original model and the determination of the parameters are described for SM1 and SM1a in "General Parameterized SCF Model for Free Energies of Solvation in Aqueous Solution" by C. J. Cramer and D. G. Truhlar, Journal of the American Chemical Society, 113 (1991) 8305-8311, 9901 (E).

The AM1-SM1 model is the original general parameter set which can be used for ionic or neutral systems made up of H, C, N, O, F, S, Cl, Br, and I atoms.
The AM1-SM1a model is a more specialized parameter set which is applicable to neutral molecules that are composed of the same atoms as in the SM1 model but that do not have hypervalent centers, three-center bonds, or unusual hybridization at N or O.

The SM2 model is introduced in "An SCF Solvation Model for the Hydrophobic Effect and Absolute Free Energies of Solvation" by C. J. Cramer and D. G. Truhlar, Science (Washington, D.C.), 256 (1992) 213-217. This parameter set is an improved solvation model for the atoms in the SM1 model plus P.

The SM3 model is introduced in "PM3-SM3: A New General Parametrization for Including Aqueous Solvation Effects in the PM3 Molecular Orbital Model" by C. J. Cramer and D. G. Truhlar, Journal of Computational Chemistry, in press. This parameter set is like the SM2 model but it is based on the PM3 Hamiltonian.

A full comparison of all of the methods, together with a detailed description of both the models and the computational methodologies employed by AMSOL, may be found in "AM1-SM2 and PM3-SM3 Parametrized SCF Solvation Models for Free Energies in Aqueous Solution" by C. J. Cramer and D. G. Truhlar, Journal of Computer-Aided Molecular Design, to be published.

2. SUGGESTED REFERENCES:

AM1-SM1 and AM1-SM1a: "General Parameterized SCF Model for Free Energies of Solvation in Aqueous Solution," C. J. Cramer and D. G. Truhlar, Journal of the American Chemical Society 113, 8305-8311, 9901(E) (1991).

AM1-SM2: "An SCF Solvation Model for the Hydrophobic Effect and Absolute Free Energies of Solvation," C. J. Cramer and D. G. Truhlar, Science 256, 213-217 (1992).

PM3-SM3: "PM3-SM3: A General Parameterization for Including Aqueous Solvation Effects in the PM3 Molecular Orbital Model," C. J. Cramer and D. G. Truhlar, Journal of Computational Chemistry, in press.

AMSOL code: C. J. Cramer, G. C. Lynch, and D. G. Truhlar, AMSOL, Quantum Chemistry Program Exchange program no.
606, QCPE, Indiana University, Bloomington, IN, version 3.0 (or 3.0c), 1992, based on AMPAC-version 2.1 by D. A. Liotard, E. F. Healy, J. M. Ruiz, and M. J. S. Dewar.

3. COMPUTERS AND OPERATING SYSTEMS ON WHICH THE CODE HAS BEEN DEVELOPED

AMSOL-version 1.0 was developed for the UNICOS (Unix) operating system on the Cray-2 and Cray X-MP series of supercomputers. AMSOL-version 3.0 is a portable program tested on these supercomputers, on a Cray Y-MP, and on Unix workstations. The computers and operating systems on which AMSOL-version 3.0 has been tested are listed in Table 1. The program is in FORTRAN with the INCLUDE extension.

Note that Cray computers use 64-bit words in single precision. Thus the REAL floating point variables in the program are interpreted as REAL*8 on Cray computers. When compiling AMSOL within the Cray environment, double precision should be disabled. (This is the -dp compiler option.) On the IBM, Silicon Graphics, and Sun workstations the code is executed in double precision. Since these machines have 32-bit words, this again yields REAL*8 floating point variables.

On the IBM RS/6000 the code was compiled with the FORTRAN preprocessor, i.e., the -P option. This speeds up the code, but the code also runs if this compiler flag is not used. The compiler and loader commands used for testing the code are listed in Tables 2 and 3 below.

Table 1. Operating systems on the various machines on which the code has been tested.
________________________________________________________________________________
Machine                    Operating system
________________________________________________________________________________
Cray-2                     UNICOS 6.1
Cray Y-MP                  UNICOS 6.1.5a
Cray X-MP-EA/4-64          UNICOS 6.1
IBM RS/6000                AIX 3.1.5
IRIS-4D/310GTXB            IRIX 4.0.1
SUN SparcStation 4/330     SunOs 4.0.3
________________________________________________________________________________

Table 2. Compiler commands used on the various machines on which the code has been tested.
________________________________________________________________________________
Machine                    Compiler commands
________________________________________________________________________________
Cray-2                     cft77 -dp -i64 -a static
Cray Y-MP                  cft77 -dp -i64 -ev
Cray X-MP-EA/4-64          cft77 -dp -i64 -a static
IBM RS/6000                xlf -c -qdpc -qmaf -O2 -P
IRIS-4D/310GTXB            f77 -c -O2 -Olimit 1300 -static
SUN SparcStation 4/330     f77 -c -O3 -temp
________________________________________________________________________________

Table 3. Loader commands used on the various machines on which the code has been tested.
________________________________________________________________________________
Machine                    Loader commands
________________________________________________________________________________
Cray-2                     segldr -o
Cray Y-MP                  segldr -o
Cray X-MP-EA/4-64          segldr -o
IBM RS/6000                xlf -o
IRIS-4D/310GTXB            f77 -o -Olimit 1300
SUN SparcStation 4/330     f77 -o -O3
________________________________________________________________________________

In addition to the portable code, this distribution contains a file called amsol.cray, which may be used to create a partially optimized Cray-specific version called version 3.0c. (See Section 4 for further details.) The version 3.0c is more efficient for the test runs that are CPU intensive.

In testing the program we ran the entire test suite (40 test runs) successfully on the Cray-2, the Cray X-MP-EA, and the IBM RS/6000 with version 3.0, and we ran the entire test suite successfully on the Cray-2 and the Cray X-MP-EA with version 3.0c. We tested more than 15 test runs on the Cray Y-MP with both versions 3.0 and 3.0c, and we tested almost all the test runs on the Silicon Graphics and Sun workstations with version 3.0.

The Cray compiler, cft77, gives warnings on the equivalence statements in AMSOL, which are retained from AMPAC, but to the best of our knowledge this is not a cause for concern.

Cray Research, Inc.
has plans to make a better optimized version than version 3.0c for the Cray computers, and when completed they will make this available at Cray sites. The version number 3.0c2 is reserved for this modification. For further information on this version contact Ian Dillon of Cray Research, Inc.

4. PROGRAM ORGANIZATION, DISTRIBUTION, AND INSTALLATION

AMSOL is adapted from AMPAC, version 2.1. In AMSOL-versions 3.0 and 3.0c, the PM3 parameter set [J. J. P. Stewart, J. Comp. Chem., 10 (1989) 209, 221] and a choice of four possible Solvation Models (references above) have been added to this version of AMPAC. In addition, the code has been made portable for Unix environments.

AMSOL with either the AM1 or PM3 keyword calculates gas-phase electronic energies and, optionally, optimized geometries by the SCF method with the neglect of diatomic differential overlap and a parameterized Fock matrix. AMSOL with a Solvation Model keyword (see below) performs calculations in which free energy of solvation terms have been added to the solute Hamiltonian, thereby delivering electronic structures and, optionally, optimized geometries which incorporate solvation effects.

In making AMSOL portable, one change is particularly noteworthy: the line (which occurs in many subprograms)

    INCLUDE "SIZES"

has been converted to

    INCLUDE 'SIZES'

wherever it occurs. Since this change was made globally throughout the program, subprograms which only have this change are not treated as modified for portability. All the other modifications (those made for portability or those made to include the solvation effects) are indicated by initials and a date in columns 73-80. The original (i.e., the unmodified) common blocks are the only portion of the code which has been consistently commented out and not removed.

AMSOL-versions 3.0 and 3.0c consist of the following files:

main.f
    This file contains the main and block data subprograms.
ampac_unmod.f
    This file contains all the subprograms from AMPAC-version 2.1 which have not been modified. This file is distributed with AMSOL for completeness.

ampac_nosol.f
    This file contains all the subprograms from AMPAC-version 2.1 which have been modified for portability or other reasons not having to do with solvation.

ampac_sol.f
    This file contains all the subprograms from AMPAC-version 2.1 which have been modified to include the solvation effects and for portability.

amsol_new.f
    This file contains the new subprograms.

amsol_util.f
    This file contains the utility and header subprograms needed for AMSOL.

amsol.cray
    This file contains the subprograms needed to create a partially optimized version of AMSOL on the Cray computers.

holder.f
    This file contains two subprograms from Andy Holder's version of AMPAC-version 2.1 that are used for portability. We are grateful to Dr. Holder for permission to distribute these two subprograms as part of the AMSOL package.

dattim.cray
    This is the date/time subprogram for the Cray computers.

dattim.ibm
    This is the date/time subprogram for the IBM RS/6000.

dateclock.c
    This is a C subprogram which is used by dattim.ibm to obtain the date and time.

dattim.iris
    This is the date/time subprogram for the IRIS.

dattim.sun
    This is the date/time subprogram for the SUN.

porcpu.cray
    This subprogram determines the CPU time for the Cray computers.

porcpu.ibm
    This subprogram determines the CPU time for the IBM RS/6000.

porcpu.iris
    This subprogram determines the CPU time for the IRIS.

porcpu.sun
    This subprogram determines the CPU time for the SUN.

amsol3cl.cray
    This is a job control file for compiling and linking the AMSOL-version 3.0 source code on the Cray computers.

amsol3ccl.cray
    This is a job control file for compiling and linking the AMSOL-version 3.0c source code on the Cray computers.

amsolcl.ibm
    This is a job control file for compiling and linking the AMSOL source code on the IBM RS/6000.
amsolcl.iris
    This is a job control file for compiling and linking the AMSOL source code on the IRIS-4D/310GTXB.

amsolcl.sun
    This is a job control file for compiling and linking the AMSOL source code on the Sun SparcStation 4/330.

SIZES and PARAM
    These are files from AMPAC that are required in compilation as they are referenced by multiple INCLUDE statements. The file "SIZES" has two additional lines compared to AMPAC v2.1 (assigning the parameters NPACK and TDEF); "PARAM" has been extended to incorporate the PM3 Hamiltonian parameters. While the INCLUDE statement is not standard FORTRAN77, it is so useful that we have elected to retain it for the convenience of users who may wish to change dimension limits or atom parameters. If the local host environment does not support INCLUDE, then SIZES and PARAM will simply (laboriously) have to be inserted into the source code at each occurrence of an INCLUDE statement.

    The variable TDEF in SIZES is the default value for the T keyword; thus it is the maximum CPU time allowed for a job in which T is not set. TDEF is set equal to 65000 in the distributed version. This is large enough to run the 37 test runs that do not have the T keyword on all six machines. The user may easily change TDEF if desired. For example, the 37 test runs without the T keyword can be run with TDEF equal to 10000 on the IBM and with TDEF equal to 30000 on the IRIS.

SIZES3c
    This is the version of the INCLUDE file SIZES which is used in version 3.0c of the program. This file differs from the file SIZES in that the variable TDEF is set equal to 3600 (the original value used in AMPAC-version 2.1) instead of 65000.

amsoli
    This file is a C shell script for interactive execution.

envaq
    This file is a C shell script that writes the code numbers for SM1a atom types to standard output (your window if run interactively). It is useful to be able to remind oneself quickly of these values when preparing .dat files for input.
tr1.dat, tr2.dat, ..., tr40.dat
    These are sample data files for 40 runs that constitute the test suite.

tr1.out, tr2.out, ..., tr10.out
    These are the output files for the first 10 runs in the test suite.

tr11.arc, tr12.arc, ..., tr40.arc
    These are the archive files for test runs 11 through 40 in the test suite.

amsol.doc
    This "on-line manual" (ASCII documentation file).

4.1. Installation

The C shell scripts amsolcl.machine and amsol3cl.cray can be used to compile and link AMSOL-version 3.0 on any of the machines on which the code has been tested. These C shell scripts assume that all the source code, including the machine specific routines, and the INCLUDE files SIZES and PARAM are in the current working directory specified in the script. The executable created by these scripts will also be in the current working directory. Before running any of the C shell scripts to compile and link the program, the user should edit the script that will be used and update the directory name to correspond to where the source code and INCLUDE files are stored.

The C shell script amsol3ccl.cray can be used to compile and link AMSOL-version 3.0c on either the Cray-2 or the Cray X-MP-EA. (The same script also works on the Cray Y-MP, but the user should change the compiler options as indicated in Table 2.) As with the C shell scripts discussed above, the directory name corresponding to the location of the source code must be updated in this script before execution. This code requires the following files to be in the current working directory: main.f, ampac_unmod.f, ampac_nosol.f, ampac_sol.f, amsol_new.f, amsol.cray, dattim.cray, porcpu.cray, SIZES3c, and PARAM.

5. USAGE

The user is assumed to be familiar with the usage of AMPAC. AMPAC calculates total electronic energies (i.e., electronic energy plus nuclear repulsion) by semiempirical molecular orbital theory. These are output in two forms - as energies and as "heats of formation".
However, it should be noted that in AMPAC the only difference between the heat of formation and the sum of the electronic energy plus nuclear repulsion is the choice of the zero of energy. In other words, AMPAC computes the heat of formation by adding a constant to the sum of the electronic energy and the nuclear repulsion; this constant depends only on the number of each kind of atom. For example, the constant is the same for ethanol and dimethyl ether. As the above explanation makes clear, the term "heat of formation" in AMPAC is really a misnomer. In this manual and in the output files of AMSOL, when we say heat of formation we are referring to these AMPAC heats of formation, i.e., to the sum of the electronic energy and nuclear repulsion with a specific choice of zero of energy. (The historical origin of this confusion is that, in the original parameterization of the semiempirical molecular orbital parameters, differences of the sum of the electronic energy and nuclear repulsion were fit using differences of experimentally determined heats of formation.)

AMSOL is a modified version of AMPAC that, in addition to heats of formation, can also compute free energies of solvation. Modifications to the AMPAC program are invisible unless one of the four keywords which indicate a specific solvation model is included. The keywords are "SM1", "SM1A", "SM2", and "SM3" (both upper and lower case letters are allowed). Use of any one of these keywords requires specification of the gas-phase Hamiltonian, either AM1 (for SM1, SM1a, and SM2) or PM3 (for SM3). Note that while one can successfully run other combinations, e.g., PM3-SM2, the results are of dubious value since the SM parameters should be used with the Hamiltonian for which they were optimized.

If a solvation model is specified, one of the three keywords "NOPOL", "DERINU", or "1SCF" must also be included or the program will halt after reading the data file.
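The keyword rules just stated can be checked before a job is submitted. The following is a minimal sketch in Python, not part of AMSOL: the function name check_keywords and the wording of its messages are hypothetical, and it covers only the two rules given above (matching gas-phase Hamiltonian, and the presence of NOPOL, DERINU, or 1SCF).

```python
# Illustrative only: check_keywords is a hypothetical helper, not part of AMSOL.
# It encodes just the two rules stated in this section of the manual.

# Solvation model keyword -> gas-phase Hamiltonian it was parameterized for.
SOLVATION_MODELS = {"SM1": "AM1", "SM1A": "AM1", "SM2": "AM1", "SM3": "PM3"}
RUN_MODES = {"NOPOL", "DERINU", "1SCF"}


def check_keywords(keyword_line):
    """Return a list of notes about a keyword line (empty if none apply)."""
    # Keywords are case-insensitive, so compare in upper case.
    words = {w.upper() for w in keyword_line.split()}
    notes = []
    models = sorted(words & set(SOLVATION_MODELS))
    if not models:
        return notes  # no solvation model: behaves like a plain AMPAC run
    for model in models:
        expected = SOLVATION_MODELS[model]
        if expected not in words:
            notes.append(model + " is parameterized for " + expected
                         + ", which is not specified")
    if not words & RUN_MODES:
        notes.append("a solvation model requires NOPOL, DERINU, or 1SCF")
    return notes


if __name__ == "__main__":
    for line in ("AM1 SM2 DERINU", "PM3 SM2 DERINU", "am1 sm1a"):
        print(line, "->", check_keywords(line) or "ok")
```

Note that a mismatched pair such as PM3 with SM2 is reported only as a note, since the program itself will run such a combination.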
The keyword "NOPOL" indicates that the program is to proceed with a normal gas-phase calculation, but upon completion it prints out the free energy of solvation which would be obtained using the gas-phase geometry and wave function (in particular the solvation free energy depends on the atomic charges and bond orders calculated from the solute wave function). Often considerably smaller polarization free energies are obtained with the geometry and wave function "frozen" this way than when they are re-optimized in the presence of solvent.

The keyword "DERINU" is used for geometry optimizations. The default procedure in AMPAC calculates each gradient as the sum of all of the contributions from all possible pairwise combinations of atoms; the matrix operations are considerably simplified by restricting them to two-atom combinations. However, in AMSOL the individual gradients CANNOT be calculated in this manner. Instead, energy calculations must be made for the entire molecule with each movement of an atom. "DERINU" accomplishes this. Any of the supplied methods for optimization may be used provided the gradients themselves are calculated by the DERINU option.

The keyword "1SCF" performs a single SCF calculation at the input geometry, optimizing the electronic distribution to reflect aqueous solvation.

Two keywords, "TEXPN=x.xx" and "TDIFF=x.xx", are available for diagnostics. These control the numerical integrations used to determine atomic Coulomb effective radii and have default values of 1.5 and 0.01 (for SM2 and SM3) or 0.05 (for SM1 and SM1a), respectively. TEXPN controls the expansion factor used for concentric shells expanded about each atom in the numerical integration sequence. The default value indicates that each subsequent shell has a radius 50% larger than its immediate precursor.
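To see why the expansion factor matters for cost, the growth of the shell radii can be sketched as a geometric series. This Python fragment is only an illustration under stated assumptions: the function shell_radii, the starting radius of 1.0, and the outer cutoff of 10.0 are hypothetical and are not taken from the AMSOL source; only the rule that each shell is TEXPN times larger than the previous one comes from the text above.

```python
# Illustrative only: shell_radii and its arguments are hypothetical names.
# The sole rule taken from the manual is radius[n+1] = TEXPN * radius[n].

def shell_radii(first_radius, outer_radius, texpn=1.5):
    """Return successive shell radii, growing by the factor texpn
    until the outermost shell reaches at least outer_radius."""
    if texpn <= 1.0:
        raise ValueError("TEXPN must exceed 1.0 for the shells to grow")
    radii = [first_radius]
    while radii[-1] < outer_radius:
        radii.append(radii[-1] * texpn)
    return radii


if __name__ == "__main__":
    # Arbitrary example range; compare the number of shells required.
    for factor in (1.5, 1.2):
        print(factor, len(shell_radii(1.0, 10.0, factor)))
```

For this arbitrary range, a factor of 1.2 needs about twice as many shells as the default of 1.5, so a smaller expansion factor means proportionally more shells to integrate over.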
By decreasing TEXPN toward its limit of 1.0 the integration will deliver values more nearly converged to the analytical effective radius, but large increases in time are required for the computation. Typically TEXPN=1.2 doubles the run-time. TDIFF is the half-thickness for the first shell about the atom and has marginal effect unless very large (unreasonable) values are chosen. The program default values must be used for TEXPN and TDIFF for true SMx runs (x = 1, 1a, 2, 3) because the parameters for the solvation models have been developed from them. However, situations where unusual molecular clefts or cavities are present might conceivably deliver radii uncharacteristically far from convergence. Use of these two keywords may then be illuminating for understanding the problem. For a complete description of the numerics, see the Journal of Computer-Aided Molecular Design article cited in Section 1.

The keyword T=x, where x is in seconds, is used to increase the maximum allowed time limit above the default value TDEF (see the description of the SIZES file in Section 4). If a job is stopped because of this time limit, then it can be restarted using the C shell script amsoli or the command file created by amsoli. In such a case the user should add the keywords RESTART and T=x to the original data file, using a larger value of x. Note that the RESTART option was not supported in version 1.0 but has been tested for version 3.0.

The old keywords "AQUO" and "ENVAQ", used to specify SM1 and SM1a calculations respectively in version 1.0 of AMSOL, are no longer supported. They are replaced by SM1 and SM1a, respectively.

Finally, the keyword TRUES is available for the convenient calculation of true solvation energies. The keywords SM1, SM1a, SM2, SM3, and TRUES are explained in more detail in the following subsections.

5.1. Keyword SM1

By using both SM1 and AM1, the user requests a calculation in aqueous solution by the AM1-SM1 method.
The actual quantities output are, inter alia, the gaseous heat of formation relative to elemental standard states plus the aqueous free energy of solvation as well as the electronic energy plus the aqueous free energy of solvation. One additional item worthy of note is that the ionization potential and the HOMO energy are no longer adequately related by Koopmans' theorem. Hence, AMSOL reports the HOMO energy labeled as such.

In SM1 the accessible-surface-area parameters are independent of chemical environment; thus there is a unique value of each surface tension for each atomic type (i.e., atomic number). Other than including the SM1 keyword, no modifications need be made to the standard AMPAC input file.

5.2. Keyword SM1A

The keyword SM1A together with AM1 requests a calculation in aqueous solution by the AM1-SM1a method. The keyword SM1A functions analogously to SM1 with respect to energetics and output, with the exception that the accessible-surface-area terms are dependent not only on atomic number but also on chemical environment. When SM1A is specified the atom types must be provided to the program in the input file. They follow the blank line which concludes the symmetry and/or reaction path information and should be entered line-by-line in I3 format in the same order as the atoms appear in the Z-matrix. No entry need be made for dummy atoms as the program will not try to read an atom type for them. The allowed atom types in version 3.0 are as follows:

     1. any carbon atom or attached hydrogen atom
     2. hydrogen atom attached to a nitrogen atom
     3. hydrogen atom attached to an oxygen atom
     4. hydrogen atom attached to a sulfur atom
     5. sp3 or amide nitrogen atom
     6. sp, sp2, or aromatic nitrogen atom
     7. sp3 oxygen atom
     8. sp2 oxygen atom (e.g., ketone, aldehyde, sulfoxide, nitro, etc.)
     9. fluorine atom
    10. sulfur atom
    11. chlorine atom
    12. bromine atom
    13. iodine atom
    14. phosphorus atom or attached hydrogen

The output file will echo back the atom types it has read along with the geometric information, etc. Execution of the script envaq displays the atom types on standard output as a reminder that may be useful when preparing data files.

5.3. Keyword SM2

By using both the SM2 and AM1 keywords, the user requests a calculation in aqueous solution by the AM1-SM2 method. While analogous to the SM1 method in its generality, the SM2 model recognizes the importance of classifying hydrogen atoms based on the heavy atom to which they are attached and derives that information from the bond order matrix. As a result, it is considerably more accurate than the SM1 approach. The output from such a run will include the final bond order matrix together with the contributions to surface-area-dependent terms from both heavy-atom surface tensions and attached hydrogen atoms (which do not block the solvent-accessible-surface-area of the heavy atoms to which they are attached). A complete description of the formalism may be found in the literature cited in Section 1 above.

In SM2 all accessible-surface-area parameters are independent of chemical environment; thus there is a unique value for all surface tension parameters for each atomic type (i.e., atomic number). Other than including the SM2 keyword, no modifications need be made to the standard AMPAC input file.

5.4. Keyword SM3

By using both SM3 and PM3, the user requests a calculation in aqueous solution by the PM3-SM3 method. This is completely analogous to the SM2 method except that the underlying gas-phase Hamiltonian is the PM3 model of Stewart and the solvation model parameters are based upon it. The most desirable feature of the PM3 model is that it appears to predict more reasonable hydrogen bonds in terms of linearity, distance, and energy than does AM1.
A significant drawback in comparison to AM1 is that the PM3 nitrogen charges are usually much too positive and the SM3 model is thus reduced in its effectiveness when employed for amines, nitriles, and nitro compounds.

In SM3 all accessible-surface-area parameters are independent of chemical environment; thus there is a unique value for all surface tension parameters for each atomic type (i.e., atomic number). Other than including the SM3 keyword, no modifications need be made to the standard AMPAC input file.

5.5. Keyword TRUES

The keyword TRUES can be used to calculate the true solvation free energy, which is the final heat of formation plus solvation free energy for the optimized geometry in solution minus the final heat of formation for the optimized gas-phase geometry. The keyword TRUES, used in a run with one of the keywords SM1, SM1a, SM2, or SM3, causes the program to calculate this quantity. When the keyword TRUES is specified, AMSOL reads an extra line of data (free format) immediately following the line that contains the title. This line should contain the heat of formation for the optimized gas-phase geometry in kcal. This usage, as well as that of the other AMSOL keywords, is illustrated in various test runs.

6. OUTPUT

In addition to the output described above, use of a solvation model keyword will automatically ensure that the accessible-surface-area terms are printed out in kcal/mol, by atom as well as summed over atoms. The effective Born radii and effective interatomic distances are printed out in block matrix form. The atomic Born energies are printed out in block matrix form and summed over individual atoms as well as atomic number. Note that these energies will be significantly more negative than the actual electronic contribution to the free energy of solvation since there is an energy loss associated with the gas-phase portion of the SCF calculation due to electronic reorganization.
That is, the formalism optimizes to the minimum of the sum of the internal electronic energy and the Born polarization energy.

If the keyword FOCK is included in the data file, the final Fock matrix is printed (as with AMPAC-version 2.1), and additionally the contributions to the Fock matrix diagonal elements from the generalized Born treatment are printed by atom in eV.

7. NOTES AND COMMENTS

The calculation of the effective Born radii is the slowest step in the calculation. This calculation, as well as the calculation of the CDS term, involves calculating the exposed surface area of a sphere in a set of spheres. The radii of the spheres are intrinsic Coulomb radii in the calculation of effective Born radii, they are augmented van der Waals radii for the CDS calculation in the SM1 and SM1a models, and they are "hydrophobic radii" in the CDS calculation in the SM2 and SM3 models.

The exposed surface area is calculated as follows. One covers the surface of each atomic sphere uniformly with "dots". Then, any dot that is closer to a sphere on a different atomic center than that sphere's radius is "erased". A simple division of dots left by dots started with gives the fraction of exposed surface area. However, in order to get relatively smooth behavior, one has to use many dots. The bare minimum is about 800 per sphere. Since, to find the effective Born radii by the method of Still et al., one calculates this surface area for many spheres, there are many distance measurements. Even fully vectorized, the calculation takes about 15-20 times longer than unmodified AMPAC for small molecules. As solutes get bigger the matrix manipulations begin to compete, e.g., for bis-(2-chloroethyl)sulfide the time required is only 7 times longer.

Because the algorithms which calculate effective Born radius and surface area have been chosen for speed and are numerical instead of analytical, they have nonzero discontinuities.
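The dot-counting procedure described above can be sketched in a few lines of Python. This is an illustration only and is not the AMSOL routine: the function names are hypothetical, and the placement of the dots by a golden-angle (Fibonacci) spiral is an assumption made here, since the manual says only that the dots cover each sphere uniformly.

```python
import math

# Illustrative only: sphere_dots and exposed_fraction are hypothetical names.
# Assumption: dots are placed with a golden-angle (Fibonacci) spiral; the
# manual does not say how AMSOL distributes its dots.


def sphere_dots(n):
    """Return n roughly uniformly spaced points on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dots = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n        # evenly spaced heights
        ring = math.sqrt(1.0 - z * z)        # radius of the circle at height z
        phi = golden_angle * i
        dots.append((ring * math.cos(phi), ring * math.sin(phi), z))
    return dots


def exposed_fraction(index, centers, radii, n_dots=800):
    """Fraction of sphere `index` not lying inside any other sphere."""
    cx, cy, cz = centers[index]
    r = radii[index]
    kept = 0
    for ux, uy, uz in sphere_dots(n_dots):
        dot = (cx + r * ux, cy + r * uy, cz + r * uz)
        # "Erase" the dot if it is closer to another center than that
        # sphere's radius.
        erased = any(
            math.dist(dot, centers[j]) < radii[j]
            for j in range(len(centers))
            if j != index
        )
        if not erased:
            kept += 1
    return kept / n_dots


if __name__ == "__main__":
    centers = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
    radii = [1.0, 1.0]
    print(exposed_fraction(0, centers, radii))
```

Because a dot is either kept or erased, the computed fraction changes in steps of 1/n_dots as atoms move, which is one way to picture the small discontinuities that a numerical scheme of this kind produces.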
These discontinuities have almost no effect on the final geometry and energy; however, they do sometimes cause the final step of the optimization to finish with the warning that the "LINE MINIMISATION FAILED TWICE IN A ROW. TAKE CARE". This does not appear to be a cause for worry unless the gradient norm is unusually high. If the gradient norm is less than about 5 kcal/Angstrom, the calculation is probably ok. Aromatics seem to give higher gradient norms (even in the gas phase) even when the calculation is acceptable. When the LINE MINIMISATION warning is obtained, care should be taken to check that a geometry change has actually occurred.

When geometry optimization fails to converge or gives high gradient norms, it is often helpful to change one of the bond lengths by about 0.2 Angstrom or several coordinates by 0.1 Angstrom. In particular, if the aqueous geometry optimization does not converge well when the gas-phase geometry is used as the initial guess, it may be helpful to start from a significantly perturbed geometry. Additionally, use of the Powell routine (keyword POWELL) to optimize geometries often avoids this problem. Moreover, the Powell method is quite useful for locating transition states since it optimizes to the nearest stationary point on the hypersurface. Note though that since the Hessian matrix is obtained by numerical differentiation, care should be exercised in verifying the nature of the stationary points that the Powell algorithm finds. In addition, there are occasions where the Powell method is very slow to converge, so it is usually better employed as an alternative than as the default.

The keyword PRECISE affects only the SCF calculation (two-point central differences are the default for geometry optimization) if a solvation model is specified. In general, the default SCF tolerance of 0.00001 in AMPAC is a bit too loose to trust the output heats or energies.
The recommended course of action is to use the keyword phrase SCFCRT=0.000001; this makes the tolerance ten times as stringent and usually gives well converged heats of formation and energies. Using PRECISE or still tighter tolerances occasionally causes the SCF calculation to converge very slowly due to the numerical discontinuities in the generalized Born terms, but is fully allowed.

A new feature added to version 3.0 has been very successful at eliminating SCF convergence failures. For a 1SCF calculation, the Fock matrix is updated with the Born information derived from the latest atomic charges every four cycles. For geometry optimizations, updating occurs at iterations 1, 4, 9, 16, 25, .... Should the SCF fail to converge due to small oscillations in the atomic charges and their effect on the Fock matrix, the program will output a message to the effect that the NSTAR index has been increased by one, and updating will now occur twice in a row, but only half as often, e.g., for a geometry optimization at iterations 1, 2, 8, 9, 18, 19, 32, 33, 50, 51, .... Should oscillation still be observed, the updating will go to thrice in a row a third as often, and so forth until an index of five has been tried. Only extremely high energy (unusual) arrangements of atoms, which fail to converge in the gas phase as well, will fail to give a converged SCF with this formalism.

Use of atoms other than {H,C,N,O,P,S,Cl,F,Br,I} with the keyword SM1, SM2, or SM3 is allowed, but no accessible-surface-area dependent correction will be applied, i.e., the surface tension factor is 0.00. With SM1A, some atomic type must be supplied in the data, and any number between the current highest defined type and 100 will also result in a factor of 0.00 being employed.

While AM1-SM1a is more successful than AM1-SM2 in reproducing experimental free energies of solvation for neutral solutes, it suffers from the necessity of assigning explicit chemical environments.
For cases where that assignment is ambiguous, the user must rely on intuition, or else the AM1-SM2 or PM3-SM3 solvation models should be used. These models come close to the accuracy of AM1-SM1a, and they are applicable to ions and unusual bonding situations, whereas AM1-SM1a is not. PM3-SM3 is less accurate than AM1-SM2 (although more accurate than AM1-SM1) and may be used at the investigator's discretion to compare AM1 and PM3 results or for cases where it is expected that PM3 treats the solute more accurately than AM1.

When using amsoli or amsolcl.machine (e.g., amsolcl.ibm), be sure to update the directory names in the scripts to correspond to where you have stored the amsol files. When using amsoli, give only the root name of the data file. For example, use

    amsoli tr1

not

    amsoli tr1.dat

Note that the run-script supplied with versions 3.0 and 3.0c refers to device fort.9, not fort.09. The former is recognizable as a legitimate FORTRAN device, while the latter, used in the scripts supplied with version 1.0 code, is not, and is treated as a normal file name.

Changes to the atomic BLOCK DATA have been made in order to deliver physically meaningful frequencies and centers of mass. Thus, the weighted average natural abundance atomic masses used by AMPAC 2.1 have been replaced with the exact masses of the most abundant isotopes. The center of mass issue arises in the computation of dipole moments for charged species. AMSOL does this calculation automatically, another change from AMPAC 2.1. For charged species, the dipole moment depends on the origin, and AMSOL puts this origin at the center of mass.

The default time limit for particular calculations in the code, e.g., the SCF iterations, which was hard wired in AMPAC to be 3600 seconds, has been converted to a variable. This variable is initialized to the value TDEF; TDEF is a parameter which is set in the INCLUDE file SIZES.
Making the time limit a variable was necessary because the limit of 3600 seconds was too low for all the test runs in the test suite to run to completion on the workstations without having to use the RESTART option. Two versions of SIZES are distributed. The one in file SIZES has TDEF = 65000, and the one in file SIZES3c has TDEF = 3600. See further discussion on page 4-3.

Finally, AMSOL-version 3.0 differs from AMSOL-version 1.0 in that it is portable. Thus the program is fully in DOUBLE PRECISION, all occurrences of the same common block have the same length, a more portable version of the block diagonalization routine is included, the timing calls have been reduced to a form that allows machine-dependent timing routines to be used conveniently, and many additional non-standard-usage and non-portability problems in AMPAC have been corrected along with a few minor bugs.

8. TEST RUN INPUT AND OUTPUT

The keywords and description line from the various .dat files of the test runs are provided here as a quick reference. For every test run, the .dat file and either the associated .out file or the .arc file from the Cray X-MP-EA for version 3.0 are provided as part of the distributed version, as enumerated in Section 4. Each of the four solvation models is illustrated, together with the keywords TRUES, DERINU, NOPOL, 1SCF, FOCK, SCFCRT=x.xx, and POWELL (together with some POWELL options). Various combinations of SYMMETRY information, reaction paths, SM1a chemical environment information, and use of dummy atoms are presented, along with two calculations on transition states and two UHF examples.

The data files provided in the distribution package are the files which were run successfully on all six machines on which the code has been tested. The data files gave energies that agree to within 0.2 kcal/mol across all the machines in the test runs we made (see page 3-2), but in some cases the GNORMs are large.
The GNORMs can be reduced on any given machine by perturbing the input geometry, but in doing so the data files are no longer portable across machines. In order to compare user test runs with the distributed output files, care should be taken to store the provided files in a subdirectory where they will not be overwritten.

tr1.dat:
    AM1 CHARGE=-1 PRECISE
    chloride anion (gas phase)
    This line can be used for user's comments

tr2.dat:
    DERINU SM2 AM1 CHARGE=-1 SCFCRT=0.000001
    chloride anion

tr3.dat:
    CHARGE=-1 PM3 SYMMETRY PRECISE
    acetate anion (Cs symmetry) (gas phase)

tr4.dat:
    DERINU SM3 CHARGE=-1 PM3 SYMMETRY SCFCRT=0.000001
    acetate anion (Cs symmetry)

tr5.dat:
    AM1 PRECISE SYMMETRY
    ammonia (gas phase)

tr6.dat:
    AM1 SM2 DERINU SCFCRT=0.000001 SYMMETRY TRUES
    ammonia

tr7.dat:
    PM3 PRECISE SYMMETRY
    water (gas phase)

tr8.dat:
    PM3 SM3 DERINU SCFCRT=0.000001 SYMMETRY TRUES
    water

tr9.dat:
    PRECISE AM1 SYMMETRY
    phosphine (gas phase)

tr10.dat:
    SM2 DERINU SCFCRT=0.000001 AM1 SYMMETRY
    phosphine

tr11.dat:
    POWELL CHARGE=1 AM1 SYMMETRY FOCK PRECISE
    dimethylammonium cation (C2v symmetry) (gas phase)

tr12.dat:
    POWELL DERINU SM2 CHARGE=1 AM1 SYMMETRY FOCK SCFCRT=0.000001
    dimethylammonium cation (C2v symmetry)

tr13.dat:
    AM1 PRECISE SYMMETRY
    diethyl ether (gas phase)

tr14.dat:
    AM1 SM1 DERINU SCFCRT=0.000001 SYMMETRY
    diethyl ether

tr15.dat:
    PM3 PRECISE SYMMETRY
    methylcyclohexane (gas phase)

tr16.dat:
    T=200000 SM3 DERINU SCFCRT=0.000001 PM3 SYMMETRY
    methylcyclohexane

tr17.dat:
    AM1 SYMMETRY PRECISE
    Z-dichloroethylene (C2v symmetry) (gas phase)

tr18.dat:
    DERINU SM1 AM1 SYMMETRY SCFCRT=0.000001
    Z-dichloroethylene (C2v symmetry)

tr19.dat:
    AM1 SYMMETRY PRECISE
    morpholine (Cs symmetry -- Z matrix adapted from cyclohexane) (gas phase)

tr20.dat:
    T=200000 POWELL DERINU SM1A AM1 SYMMETRY SCFCRT=0.000001
    morpholine (Cs symmetry -- Z matrix adapted from cyclohexane)

tr21.dat:
    AM1 SYMMETRY PRECISE
    methyl butanoate (Cs symmetry) (gas phase)

tr22.dat:
    DERINU SM2 AM1 SYMMETRY SCFCRT=0.000001
    methyl butanoate (Cs symmetry)

tr23.dat:
    AM1 PRECISE
    1,1,1-trifluoropropan-2-ol (gas phase)

tr24.dat:
    DERINU SM1A AM1 SCFCRT=0.000001 PRECISE
    1,1,1-trifluoropropan-2-ol

tr25.dat:
    SM2 NOPOL PRECISE AM1 SYMMETRY
    4-pyridone (gas phase)

tr26.dat:
    SM2 DERINU POWELL CYCLES=50 SCFCRT=0.000001 AM1 SYMMETRY
    4-pyridone

tr27.dat:
    1SCF PRECISE PM3
    3-ethyl-2-methoxypyrazine (gas phase)

tr28.dat:
    SM3 1SCF SCFCRT=0.000001 PM3
    3-ethyl-2-methoxypyrazine

tr29.dat:
    AM1 PRECISE
    Rotation coordinate for hydroxyl of acetic acid (30 deg increment) (gas phase)

tr30.dat:
    DERINU SM1A T=200000 AM1 SCFCRT=0.000001
    Rotation coordinate for hydroxyl of acetic acid (30 deg increment)

tr31.dat:
    PRECISE AM1 SYMMETRY
    phosphinous acid, rotation coordinate for P-O bond (gas phase)

tr32.dat:
    SM2 DERINU SCFCRT=0.000001 AM1 SYMMETRY
    phosphinous acid, rotation coordinate for P-O bond

tr33.dat:
    CYCLES=30 POWELL AM1 CHARGE=-1 PRECISE SYMMETRY
    bromide methyl iodide (SN2 reaction transition state) (gas phase)

tr34.dat:
    SM2 DERINU CYCLES=30 POWELL AM1 CHARGE=-1 SYMMETRY
    bromide methyl iodide (SN2 reaction transition state)

tr35.dat:
    POWELL PM3 SYMMETRY PRECISE
    TS of Diels-Alder reaction cyclopentadiene and MVK (gas phase)

tr36.dat:
    POWELL PM3 SYMMETRY SM3 DERINU SCFCRT=0.000001
    TS of Diels-Alder reaction cyclopentadiene and MVK

tr37.dat:
    SCFCRT=0.000001 PRECISE AM1 UHF
    nitrogen oxide (gas-phase)

tr38.dat:
    SCFCRT=0.000001 PRECISE DERINU SM2 AM1 UHF
    nitrogen oxide (aqueous)

tr39.dat:
    SCFCRT=0.000001 PRECISE CHARGE=-1 AM1 UHF
    superoxide (gas-phase)

tr40.dat:
    SCFCRT=0.000001 PRECISE DERINU SM2 CHARGE=-1 AM1 UHF TRUES
    superoxide (aqueous)

End of manual