Users' Guide to the Monte Carlo Program MMC
for Computer Simulation of Molecular Solutions.
Mihaly
Mezei
Department of Physiology and Biophysics
Mount Sinai School of Medicine, NYU
E-mail address:
mezei@inka.mssm.edu
Nov. 20, 2001.
TABLE OF CONTENTS
I. GENERAL DESCRIPTION
The present document describes the Force-Biased Metropolis
Monte Carlo program for computer simulation studies of molecular
liquids and solutions in the canonical, isobaric-isothermal
and grand-canonical
ensembles and the analysis of such calculations based on the
Proximity Criterion.
This section provides an overview of the
capabilities of the program.
I.1. Systems modeled
The present program was originally written for simulations on
infinitely dilute solutions (one solute in (nmolec-1)
solvent molecules).
As a result of this origin, the program has developed two
different treatments of molecules.
Molecules labeled solvent have to be rigid
and be of the same type
(this allows the use of special, rapid energy calculation code).
The collection of all other molecules (i.e., the one(s) not
declared to be solvents) are called by the program the
solute.
Thus the solute can consist of a single rigid
molecule, a collection of freely translating and rotating rigid
molecules or a collection of freely translating and rotating
molecules with certain (user specified) torsion angles freely
rotating.
The program establishes the connectivity of the solute
molecule(s)
based on interatomic distances and built-in threshold values.
This sometimes leads to missing or extra bonds, especally if the
input structure (given by the SLTA key)
is obtained from a molecular dynamics simulation.
Such defects can be identified by using the
FCGA and
BNDL keys and corrected by the keys
MAKB or
BRKB.
Free energy simulations can be performed on selected
molecules
of the solute (see Sec. I.5.).
The torsion angles to be changed on the solute are specified
by the two atoms defining the bond of the torsion
(with the key TORD).
The torsions can be grouped and
it is the group of torsions that is selected for change. The main
reason for this is that if a torsion affects atoms that are on an
other rotatable bond then it may be efficient to move the
affected bond as well. These moves, however, are not correlated
(i.e., each is generated by a different random number).
The torsion potential parameters can be obtained from a CHARMM or
AMBER parameter file.
It is possible to impose periodicity on the solute,
either along the X axis (with the key SLTA
POLY)
i.e., with period equal to the edge of the cell in the X direction
- an infinite polymer,
or in all three directions
(with the key SLTA XTAL) with its
unit cell (or a multiple of a unit cell) coinciding with the
periodic cell of the simulation (i.e., a crystal).
The periodic extension of the solute outside the
periodic cell is done automatically by the program.
The intermolecular interactions are assumed to be of the usual 12-6-1 type,
i.e., Lennard-Jones terms for the repulsion and dispersion plus
electrostatics based on partial charges on atoms (or pseudo atoms).
The parameters can be obtained from libraries stores by the program
or defined via input.
The interactions can be truncated
(see keys SVVC,
SUVC, SUUC)
either by a spherical cutoff
(each atom interacts with all others within a certain distance)
or by the minimum image convention
(each atom interacts with one periodic image of all the other atoms).
For the purpose of cutoff or image calculation, distances usually refer
to molecule-molecule or residue-residue distances.
When spherical truncation is used, the long-range electrostatic energy
can be estimated by the reaction field formula
(provided all residues are neutral) - see key
RFCR.
I.2. Properties
calculated during a simulation
The major outputs of a Monte Carlo simulation (besides the
configurations saved for further analysis by the
TRAJ key) are as follows.
- Macroscopic thermodynamic properties of the system:
internal energy, pressure, heat capacity (constant-volume
or pressure, depending on the ensemble used), the density far from
the solute (see key MOND). Simulations
in the grand-canonical ensemble (see key GCEN) also provide the density, excess
chemical potential, isothermal compressibility, and
expansivity. Simulations in the isothermal isobaric ensemble
(see key IBEN) also provide the density
and the thermal expansion coefficient.
- Several methods to calculate free energy have been
implemented (see key FREE) by relating
it to either some computed energy values or to some distributions.
- Bulk solvent-solvent and solute-solvent distribution functions
(see key DSTC):
radial distribution function based on center of mass (COM),
quasi-component distribution functions for coordination number,
binding energy and pair energy, and dipole correlation functions.
- The derivative of the solute and total energy with respect
to the solute potential parameters can also be calculated as an
ensemble average (see key SENS). This
may be used for the development of
potential parameters and for the sensitivity of the calculated
energy to the potential parameters.
- Calculation of the mean forces exerted by the solvent on the solute atoms
(see key SENS) and the electric field
gradients at the solute and solvent sites (see key
FLDG) can also be requested.
- For systems with variable torsion angles the torsion
angle distribution can be calculated
(see key TAND)
- Proximity analysis, i.e., distributions based on the
original form of the Proximity Criterion
or its enhanced version can also be
performed. This analysis partitions the solvent according to the
proximal solute atoms and calculates distributions for each
partition.
The solute atoms are partitioned into functional
groups either defined by their chemical identity
(see key FCGA) or into alternative
functional groups specified by the user
(see key FCGD).
The proximity analysis results are presented for individual solute
atoms, functional groups, and solute residues (see key
SLTA) and also are
averaged over functional groups of the same kind.
I.3. Potential
functions
Several potential functions are built into the program. For
the solute-solvent nonbonded interactions
(specified with the SUPT key).
Clementi et al.'s
library, EPEN/QPEN, Kollman et al.'s libraries (AMBER and
AMBER-94), CHARMM22 library, Berendsen-Van Gunsteren library
(GROMOS87 and GROMOS96);
and Jorgensen's OPLS library.
Note that for the GROMOS force field only the combination
rule can be used - the non-bonded parameters for pairs
are not implemented.
Except for the EPEN form, parameters are stored internally, but
they can be modified from input and allowance is made for
inputting new atom types besides the stored ones. As these
parameter libraries are continually updated and extended, it is
advisable to check if the value stored is the best one
(using the PLBP key).
For aqueous solutions, the solvent-solvent interaction
(specified with the SUPT key)
can be either
in the form of MCY-CI type or TIPS/TIP3P/SPC type or
TIPS2/TIP4P/B-F type potential. For the MCY-CI type the
parameters of both the MCY-CI-II and the Yoon-Morokuma-Davidson
parametrizations are built in. For TIPS2/TIP4P/B-F types, the A, B
and q parameters may also be provided by the user, but the TIPS,
SPC, TIP3P, Bernal-Fowler, TIPS2 and TIP4P water model parameters
are built in. For solvent other than water, 12-6-1 type potential
can be used with parameters from the AMBER, GROMOS or OPLS
libraries.
There is a framework for adding field-dependent potentials to the
currently implemented atom-atom potentials, but this requires some
additional coding.
I.4. Sampling techniques
Various sampling techniques can be selected with the
MOVE key.
Translations and rotations can be sampled with the original
Metropolis Method or with the
Force-Biased algorithm of
Pangali, Rao and Berne (not implemented for
multimolecular solute moves, though).
Force biasing, of course, necessitates the
recalculation of energies and forces at every accepted step,
thereby significantly increasing the computer time required
to perform a move. The improvement in convergence, however, more
than outweighs the added cost. By default a lambda value of
1/2
is used but the scaled force-bias technique where lambda is
scaled down near the solute has also been
implemented.
Molecules to be moved can be selected randomly, in a cycle
or in a
shuffled cycle or using the preferential
sampling scheme of
J.C. Owicki, which leads to improved
statistics for the local solution environment of the solute (see
key MOVE PRF*).
The cavity-biased insertion technique for
grand-canonical ensemble simulations and the
virial-biased volume-change
for the isobaric ensemble have also been implemented.
I.5. Free energy methods
Free energy simulations can be performed using a coupling
parameter lambda (CP in the followings) in the energy expression.
There are two
fundamentally different approaches
to generate a path: (A) simultaneous
creation/annihilation and (B) continuous deformation. In the
first the Hamiltonians of the initial and final states are used
in unchanged and the coupling parameter is used to combine the
two; in the second case there is a single hybrid Hamiltonian that
is based on a parametrization that is a mixture of the two
systems. For multimolecular solute, the free energy simulation
can be applied to just one or two solute molecules, depending on
the type of coupling.
I.5.A. Creation/annihilation path
I.5.A.1. Creation/annihilation path to be used in
thermodynamic integration (FREE TICA).
Here
E(CP)=CPk*E1+(1-CP)k*Eo
where Eo and E1 represent the
Hamiltonians initial and final
state, respectively. k=1 represents linear coupling, k>1 has been
referred to as or 'almost linear' coupling (e.g., k=4 for
1/r12
repulsion). It is also possible to use different k values for
the different types of energy terms
(1/r12, 1/r6, and 1/r) with the
so-called polynomial TI.
Eo and E1 can differ in conformation,
partial charges, and chemical identity (the present version
requires equal number of atoms in the two states - this can
always be achieved with he use of dummy atoms. It is also
possible to provide two more coupling parameter values
CPo and CP1 such that
CPo < CP < CP1.
The program will compute the free energy
difference between states represented by E(CPo) and
E(CP1) with the
perturbation method as the sum of two free energy differences:
A1-Ao =
-kT ln<exp((E(CP1)-E(CP))/kT)>CP
+ kT ln<exp((E(CPo)-E(CP))/kT)>CP.
<E(CPo)>CPo and
<E(CP1)>CP1 are also computed for
consistency check.
The two molecules have to contain the same number of residues
(groups), but can have different number of atoms.
I.5.A.2. Perturbation method based on a sequence of
intermediate states (FREE TILI). This
option performs a
perturbation method calculation between two states generated
using the deformation-type path described below in I.5.B.
The reference state will be
(E(CP1)-E(CPo))/2. In this case, the
inital and final states have to have the same number of atoms also.
I.5.B. Deformation path
I.5.B.1. The program also provides option for a
fundamentally nonlinear coupling where the coupling parameter
changes both the solute coordinates and the potential parameters
to generate intermediate states (continuous deformation).
Besides creating intermediate Eo and E1
states for the linear or
'almost linear' couplings described above in I.5.A.2, this
coupling can be used with the probability ratio method
(FREE PMF1) or with the overlap ratio
method (FREE PMNL), as
described
below in I.5.B.2 and I.5.B.3, respectively. The initial and final
states may differ in position, partial charge or identity of
atoms (there are limitations for EPEN).
In each of the following cases, the path involves one or two
molecules of the solute. The following paths are supported:
I.5.B.1.1. Linear combination of two conformations of a
single molecule (FREE PMF1 GENL). In
this case all atoms of
the molecule may
change positions independently. The configurations at the
beginning and at the end of the path are given on input and the
intermediate configurations are convex linear combinations (i.e.,
R(CP)=CP*R1+(1-CP)*Ro) of these two.
I.5.B.1.2. Moving two molecules of the solute in a
correlated fashion (FREE PMF1 WRMM). In
this case, the
intramolecular distances within the two parts should remain
constant. Note, that if the orientation of either part is
different in the initial and final states then at the
intermediate states the molecule will be distorted.
I.5.B.1.3. Varyig the distance between two solute molecules
of the solute but keeping the one of them fixed
(FREE PMF1 WRFM).
In this case the configurations at the two ends of the path
differ only in the conformation of the moving part of the solute.
This option may be more economical when one part is large and the
other is small.
I.5.B.1.4. Moving and rotating one molecule while keeping
the other one fixed (FREE PMF1 WRFR).
In this case the center of
mass of the first part is moved between the center of masses at
the two endpoints and the orientation of the moving part is also
changed to transform it from the initial to the final
orientation. This option differs from I.5.B.1.3 in that at each
point of the path the moving molecule has the
same intramolecular conformation.
I.5.B.1.5. Correlated rotation of a set of torsion angles
(FREE PMF1 TORS). In this case a set of
torsion angles are
changed in a user-specified interval using a common CP parameter:
CP=0 sets all the selected angles to the beginning of their
interval and CP=1 sets all of them to the end of their interval.
I.5.B.2. The solvent contribution to the free energy change
between two solute conformations can also be determined by
computing the ratio of the Boltzmann probabilities as sampled
along a line in the conformation space of the solute (described
by any of the paths described in Sec. I.5.B.1,
FREE PMF1 **** *USS).
This approach is called the probability ratio method.
Calculations of this type usually require umbrella sampling
techniques. The program allows either the use of a harmonic
potential to be given from input or the self-consistent
determination of a tabulated weighing function using the
Adaptive Umbrella Sampling Method.
I.5.B.2. Any of the paths descibed in Sec. I.5.B.1 can be
used to estimate free energy changes simultaneosly using the
overlap ratio method of
Jacucci and Quirke] and the
perturbation method (FREE PMNL). Here
the calculations
are run at a given fixed E(CP) and two more coupling parameter
values, CPo and CP1 are specified,
CPo < CP < CP1. For the overlap
ratio method, the program will calculate the CP averaged
distribution of <E(CP)-E(CPo)>CP
and <E(CP1)-E(CP)>CP. Separate
calculations at CPo and CP1 will give the
needed CPo and CP1 averaged
distributions to be used with the overlap ratio method.
Simultaneously, the program also calculates the perturbation
method differences between the states CPo and CP as
well as CP and CP1. Notice that this kind of
perturbation method calculation is
different from the calculation described with the
creation/annihilation path above: there the middle state E(CP) was
a linear combination of the initial and final states,
E(CPo) and E(CP1).
For consistency check, (similarly to the perturbation
method calculations described above) the CPo average
of E(CPo) and
the CP1 average of E(CP1) are also
calculated.
I.5.C. Chemical potential.
Free energy can be also obtained from the chemical potential.
The excess chemical potential can be obtained either from
grand-canonical
ensemble simulations (key GCEN),
from calculating the excess free energy
of changing a molecule to ideal gas (with one of the methods
described
above, e.g., with key FREE TICA)
or by the Widom insertion method
(key FREE WIDO).
The program implements a version that is based on
insertions into cavities.
I.6. Solute move selection
The sampling of the moves is specified as follows. The
overall frequency of solute moves is specified by input provided
with the STEP key. As discussed above,
the whole solute can be a) rotated and displaced;
b) one constituent molecule of the solute
can be rotated and displaced; c) two constituent molecules can be
swapped; d) a group of torsion angles can be changed; and e) the
coupling parameter used in a potential of mean force calculation
can be changed. If moves of type b, c, d, and e are required,
then the program will request a relative frequency (weight) of making that
move. Regular solute moves and coupling parameter changes are
both given a weight of 1.0 . Thus giving a weight of 2.0 for type
3 moves, out of 4 solute move attempts, 1 will change the solute
position, 1 will change the coupling parameter, and 2 will change
one of the randomly selected torsion angle groups (on the
average, of course). Clearly, the more types of moves on is using,
the more frequently has the solute to be moved.
For the free energy options, the treatment of moves of type 2
or 3 depends of the type of path. For the creation/annihilation
path using TI (FREE TICA)
the torsions are independently sampled on
both copies, since these can be, and generally are, different
molecules. For the continuous path
(FREE PMF1 or
FREE PMNL) or
the cration/annihilation PM (FREE PMLI),
the torsions are treated
identically in each copy since here it is assumed that the
molecule is changing smoothly. This means, that for a
creation/annihilation TI these moves have to be sampled twice as
frequently.
I.7. Initial
configuration
The starting configuration for the Monte Carlo
(see key CNFI) is either read from a
previously prepared file, generated from the solute's description
(see key SLTA) and randomly placed solvents
or from a configuration of another system through various
transformations.
Grand-canonical ensemble simulation
(see key GCEN) is also an efficient way
of obtaining a random initial configuration.
I.8.
Information needed for a simulation
I.9.1. T, temperature at which the simulations are to be
run (specified with the TEMP key).
I.9.2. The coordinates of the nuclei, electrons and
pseudoatoms (if any) in a local coordinate framework, and the
charges associated with these centers for the solute and
(optionally) for the solvent
(specified with the SLTA key).
The program
SIMULAID
can help convert structures in different file formats
(e.g., PDB, CHARMM .CRD) to MMC format and can
optimize the orientation of a solute within a periodic box
(to maximize the smallest inter-image distance and thereby minimize
the artifactual effect of the solute images), within a boundig box,
or optimally center a solute in a sphere (for droplet-type
calculations).
I.9.3. Potential specifications.
For each solute atom the potential type has to be specified from
input (except for EPEN where for each atom and pseudoatom
the A, B, and C parameters have to be specified individually).
The program
SIMULAID
can help to convert from PDB, CHARMM, AMBER or Macromodel
input files to MMC format.
Several water potentials are implemented with specific codes,
but solvent can be defined similarly to the solute as well.
I.9.4. nmolec, the total number of molecules used in the
simulation (i.e., the number of solvent moecules + 1),
specified with the SIZE key.
I.9.5. Size and shape of the periodic cell
(specified with the PBCN key).
I.9.6. When generating a new configuration the program can
determine nmolec and the simulation cell size parameters
from the partial molar volumes of the solute and solvent and
the required number of solvent shells around the solute.
I.9.7. For partial solute displacements the groups of solute
atoms (independent molecules) to be moved separately
(specified with the PARD key).
I.9.8. For solute torsions
(specified with the TORD key).
the atoms defining the torsion
angle and the torsion potential parameters.
I.9.
Self testing
There is an extensive
self testing routine in the program that
tests the integrity/consistency of the data.
It protects against both corruption of data and program errors.
It is invoked at several ways:
- Automatically at startup
- Periodically, as specified by the user with the
SLFT key.
- Optionally at the end of the run, as specified by the user with
the STOP key.
- Automatically at every 2,500,000th step (but not at the last
one).
I.10. General analysis
Except for the the bulk distributions and most thermodynamic
variables,
all the the properties discussed above in
Sec. I.2., can be calculated from
the history of a simulation as well. Note, that such analyses can
be done
on simulation trajectories generated by other programs, e.g., by
CHARMM or AMBER. In addition to the analyses described above,
several other types of analyses are implemented as summarized
below.
- Generic solvent sites can be calculated
(see key GENS)
- Two configurations can be compared and the changes in the
solute molecules can be calculated
(see key CMPC)
- Cavities in the solute can be visualized by a grid technique
(see key PRTG) and their volume be
estimated (see key STVG)
- The solute-solvent and solute solute energies can be separated
into terms involving each solute atom
(see key ENGL)
- The distribution of the solvents around the solute can be
visulaized by creating a configuration with the solvent coordinates
from a set of selected configurations
(see key DENF)
- A simulation history can also be filtered according to various
criteria, e.g., energetic, geometric or simply eliminating segments
of the system (see key FILT)
II. FILE ORGANIZATION.
Beside then standard input (Fortran unit 5) - for the information
needed
for the run - and the standard output file (unit 6) - printed
output, diagnostics, described in Section
V. - the program uses a number of different
files. In the followings,
writing to the standard output will be refererred to as printing.
The file names are structured the following way:
<jobname>[.<numrun>][_<version>].<file
type>.
where
- <jobname> is any legitimate filename, common to
all files generated by the run (specified by the
FILE key);
- <numrun> is a number
greater than 1 and less than 99 (i.e., the omit
the run number from the file name for <numrun>=1
and limit the number of runs to 98);
After each RUNS command,
<numrun> gets incremented by one.
This allows the execution of several different MC runs
(equilibrations, free energy runs with different coupling
parameters, etc.) within a single job.
Also, <numrun> can be modified
by the FILE command.
- <version> is an integer between 2 and 9999
(i.e., the omit
the version number from the file name for <version>=1.
It is used whenever a given runnumber creates several files of the
same type during a run
(e.g., FILT, CHKP or
the SCKP keys)
- <file type>, as explained below, is one of
{ckp, crd, hst, dst, idl,
pxc, pxp, pxi, slt, wmp,
dat, grd, pxv, wmp,
pdb, fgd, CRD}
- File type=ckp: Checkpoint file storing the
complete program status (checkpoint). It is created automatically
when the RUNS,
WCKP or
SCKP keys are used.
It allows the resumption of the
calculation with the RCKP command by
restoring all common
blocks to their value prior to the last write.
It starts with the
coordinates of the current configuration, in the same (binary)
format as on file .crd
(except for the coupling parameter that is stored differently).
Checpoint file is written before a selftest
(see key SLFT),
at every nplt-th step
(before printing the distributions
- see key RUNS)
and at every nrecd step.
By default, nrecd is set
to the smaller of nplt or
1000000, 750000, 500000, 200000, 100000 for systems with
number of atoms <3000, <5000, <7500, <10000, >10000,
respectively but it can be set to any value with the key
CHKP.
- File type=crd: The coordinates of the initial
configuration are stored here.
Additional records may store the coupling parameter
(see key FREE),
the number of molecules
(see key GCEN)
and/or the simulation cell sizes
(see key IBEN)
It is read by the CNFI command,
and can be written by the WCNF command
- File type=hst: Trajectory file where the simulation
history is (to be) stored. It's use is activated by the
TRAJ command.
- File type=dst: The bulk solvent distribution functions
calculated during the simulation may
be saved after every nplt steps (the g(R), various QCDF's
(Quasi-component distribution
functions) on this file. It can later
be used with plot programs for graphical displays. Examples for
QCDF's are the distribution of coordination numbers or binding
energies.
- File type=idl: The log of insertions and deletions are
stored here, as described for the command
GCEN.
- File type=pxc: The checkpoint file for proximity
analys is generated when keys
PXAN or
SCAN are used.
- File type=pxp: The file containing the proximity
analysis distributions for further plotting, written when the
PXPL key is present.
- File type=pxi: The file containing the proximity
information (generated by PXWR)
for each configuration analyzed.
- File type=slt: The file containing the solute
description (read after the key SLTA).
- File type=wmp: The file of matched PMF segments prepared
for plotting by the WMAT command.
- File type=dat: Configuration file written for InsightII
input (generated by WCNF SVAN).
- File type=grd: Grid point file written for cavity
analysis, generated by PRTG.
- File type=pxv: Proximity region visualization file,
generated by GENV WRIT.
- File type=pdb: Configuration written in
PDB format.
- File type=fgd: Field-gradient file, generated by the
FLDG key.
- File type=CRD: Coordinate file in CHARMM CRD format
(with or without the CHARMM headers).
III. INPUT DESCRIPTION
The program input is based on keyword driven commands.
The structure of a command is as follows:
- Leading keyword
- Auxiliary keyword(s)
- Unformatted data
- Formatted data
Of these only the leading keyword appears all the time.
Each keyword is a four-character, upper case string.
Keywords and unformatted data have to be on the same line.
Commands requiring
more than one line require the continuation character "~" at the
end of the line that is not the end of the command.
Several
commands can be put on the same line if they are separated by the
separator character "|". Text after the comment character "!" is
neglected (i.e., considered just comment).
The type of items required are defined by the leading keyword.
Auxiliary keywords and data may have default values.
The order of keywords and data is fixed, so if there is a needed
auxiliary keyword or data, all other items have to be specified
that precede the needed one.
Formatted data (when required) always start in a line after the
keywords and it can not be omitted, but for items with
default value, can be secified as zero.
While there is no rigid order of the leading keywords, there
are certain rules. Certain leading keywords have
prerequisite
keywords (i.e., the prerequisite keywords all have to be inputted
earlier) and potential precursor keywords that, if
used, must be
read before the current one. If the same leading keyword is used
more than once before a RUNS command, a
warning is given and the last occurrence will be used (except for
TITL).
During input and initializations the data read is cheked
for consistency and for current program limits.
This may result in messages prefixed by OVERRIDE,
WARNING, STRONG WARNING or ERROR.
At the conclusion of the run, the number of each type of messages
is printed.
ERROR messages prevent a simulation or an analysis run
but don't abort the input.
When the input has more than one ERROR messages,
some of the additional ERROR messages may
only be the consequence of the first one.
WARNING message indicates
that under some circumstances the input may be incorrect
and STRONG WARNING message indicates
that the input is likely but not necessarily incorrect or that the
error is not fatal.
OVERRIDE messages are generally harmless, but they can
indicate that the input was not what the user intended.
This section describes the input for each leading keyword
and specifies their prerequisites and potential precursors.
The description of the commands is given by leading keywords.
For example
CNFI Key1 [Key2 Key3 Data] Fdata
means that the
leading keyword CNFI has to be followed
by at least one other auxiliary
keyword and optionally two more ("Key 2 Key 3"),
followed by free-format numerical data ("Data")
(all on the line containing the leading keyword), followed by
formatted data, started in a new line.
Note also that the type of (or the need for) auxiliary keys,
free-format and formatted
data may depend on the actual choice of auxiliary keys.
The possible keywords to be used for
Key1, Key2 and Key3 and their effects are explained subsequently,
as well as the nature and (for Fdata) the format of the data.
Keywords are always given with upper case BOLD
characters while variable (data) names are given in lower case
bold characters. Input item that is optional is given in
[ brackets ] and the default values are given in { braces }.
The default value for an auxiliary keyword
is the first one on the list.
A record type number (RECTYPE) is assigned
to every formatted data record. It is used in error messages
on the program's output to
identifying the type of data read when the error was encountered.
III.1. File handling keywords
- FILE: Job name
- WCNF: Configuration
saving
- WCKP: Creation of a
checkpoint file
- SCKP: Periodic archiving of
checkpoint files
- RMCK: Checkpoint file
removal
- TRAJ: History file
specification
- BUFF: History file buffer
save
- IDLG: Insertion/deletion
log file specification
- PXPL: Proximity analysis
distribution plot file writing
- PXWR: Proximity information
file specification
- WTRA: Trajectory
conversions
- STIR: Contact elimination
III.01.01. Job name
FILE Data
Data: jobname[, numrun]
III.01.02. Configuration saving
WCNF
Prerequisite: CNFI,
SLTA
Key1 (coordinate file format):
- BNRY - Configuration is saved in binary
- ASCI - Configuration is saved in ASCII, (3f15.5)
- ASAN - Configuration is saved in ASCII, and in Å, each
atom's coordinates are preceded by its atomic symbol and followed
by the group number and charge (a4,1x,3f15.5i5,f10.5)
- PDB - Configuration is saved in
PDB format
(last two colums contain the atomic radii and charge)
- PDBQ - Save the configuration in GRASP PDB format #1
(last two colums contain the atomic radii and charge)
- CHRM - Save the configuration in CHARMM CRD format
Key2 (possible manipulations) - for any of the keys above:
- CENT - The solute is centered in the simulation cell
before saving.
- SW23 - Switch between 2-copy and 3-copy free energy
solutes before saving the configuration:
- UNCH - Save the configuration unchanged
- CANO - Save the configuration in canonical
ensemble format (as specified by Key1) - meaningful only
for grand-canonical or isothermal-isobaric ensemble runs.
- MLCT - Extract from the configuration the centers of the
solute molecules and discard the rest (including the solvent)
before saving
- REPL - Extend the system with its periodic replicas
as specified by the PBCN key.
If extensions required only in a few directions then the
required replica cell centers should be read with the
PBCN READ (cell center coordinates
can be printed with the PRPB key).
All solvent molecules are printed after all the solute images and
the solute atoms affected by the key PARD
are also printed contiguously.
- REP0 - Extend the system with its periodic replicas
but don't replicate the solute. This option is useful for enlarging an
existing system.
- PDBD - Save the configuration in
PDB format
(last two colums contain general data in free format
- see Key2 and Key3 below and key KMNP)
- PDBG - Save the configuration in
Grasp PDB format #2
(last two colums contain general data in free format
- see Key2 and Key3 below and key KMNP)
- PDTD - Same as PDBD except the data in the two
data colums will come from the difference between two datasets,
generated by the PXDT key.
- PDTG - Same as PDBG except the data in the two
data colums will come from the difference between two datasets,
generated by the PXDT key.
- CHRD - Save the configuration in CHARMM CRD format. The
content of the data column after the coordinates will be determined
by Key2 as described below.
Key2 (for Key1=CHRD only);
Key2, Key3 (for Key1=PD*D or Key1=PD*G
only):
Any (two) of the following keywords, specifying the
two quantities to be printed in the last (two) data column(s)
of the atom records
- VFSH Volume of the first solvation shell
- V2SH Volume of the first two solvation shells
- NFSH Number of solvents in the first solvation shell
- N2SH Number of solvents in the first and second
solvation shells
- DFSH Solvent density in the first solvation shell
- DSSH Solvent density in the first and second solvation
shell
EBFS Solute-solvent binding energy in the first
solvation shell
- EPFS Solute-solvent binding energy in the first
solvation shell
- TOBE Total binding energy
- TOTN Total number of solvents
- NFVV Solvent-solvent coordination number
- EPVV Solvent-solvent pair energy
- EBVV Solvent-solvent binding energy
- MRDF First minimum of the primary RDF
- KRDF Coordination number using the actual
first minimum of the primary RDF
- VRDF First shell volume using the actual
first minimum of the primary RDF
- DRDF First shell density using the actual
first minimum of the primary RDF
- CRCV Circular variance, based on all solute atoms
- CRCH Circular variance, based on the solute heavy atoms
only
Data: numrunr, rcrcv
- numrunr {numrun} - The configuration is saved
to the file <job name>.<numrunr>.dat when
Key1 is SVAN,
to the file <job name>.<numrunr>.pdb when
Key1 is PDB*,
otherwise to file <job name>.<numrunr>.crd
When Key2 or Key3 = CRC*:
- rcrcv {6.0} - The
circular variance will be calculated
using solute atoms within a radius of rcrcv.
The circular variance for a point R is defined as
CV = 1.0 - [ SUM (<R>-ri)
/ SUM (|<R>-ri| ]
where ri are the coordinates of solute atoms
within
rcrcv. When R is inside the solute, CV is near one,
and
when R is outside the solute, CV is near zero.
III.01.03. Creation of a checkpoint file
WCKP [Data]
Prerequisite: FILE
Data: numrunw
- numrunw {numrun} - Checkpoint file
<jobname>.numrun.ckp is to be written.
III.01.04. Periodic archiving of checkpoint files
SCKP [Data]
Data: nsavckpf
- nsvckpf - Frequency of writing a separate checkpoint
file.
Each checkpoint file will have the runnumber of the current run
and consecutive version numbers, starting at 2.
This is option allows the later calculation of proximity analysis
averages over different segments of the run (see key
CMPC),
provides a backup procedure in case the program
crashes while writing the checkpointfile, and is
useful for debugging.
III.01.05. Checkpoint file removal
RMCK [Data]
Data: numrunp
- numrunp {<current runnumber> -1} - if positive,
the checkpoint file(s) for
run number.
numrunp will be removed
(by default, the checkpoint file created by the last
RUNS or SCAN
will be removed).
If numrunp is negative then the checkpoint file(s) for
the <current runnumber> - |numrunp| will be removed.
This option is useful for preventing the disk to become too full
when the calculation is broken up into several
RUNS or SCAN
steps.
III.01.06. History file specification
TRAJ Key1, [Key2,Key3] [Data]
For MC runs this key sets up the saving or reading
of the run history on
file <jobname>.<runnum>_<runvers>.hst
(For runnum=1 the run number part is omitted and for
runvers=1 the version part is omitted).
See also the key BUFF.
For analysis (SCAN) or
filtering (FILT **** TRAJ)
this command
specifies the format of the history file. An input
history can consist of several files with the same runnumber and
with consecutive versionnumbers (i.e., the
second, third, etc. piece should have names
<jobname>.<runnum>_<runvers+1>.hst ,
<jobname>.<runnum>_<runvers+2>.hst , etc.).
Key1 (trajectory file format - if any):
- NONE - No saving of run history.
- ALLE -
The following information is stored at every accepted step in
blocks of 250 when no variable coupling parameter is applied:
- estac - Total energy of the configuration
- cst - Coordinates of the first three atoms of the
moved molecule istc.
- istc - Number (label) of the molecule moved.
- nmcst - Step number in the MC walk.
- bngst - Binding energy of the molecule moved. When
variable coupling parameter is used (i.e.,
FREE PMF1
run), the coupling parameter value is saved instead of the binding
energy.
This option allows the full reconstruction of the
run's history but works only when neither the key
PARD nor the key
PART is in use and
only in the canonical ensemble.
- ALLV - Same as ALLE, but the total viral sum is
saved instead of the binding energy.
Valid only in the canonical ensemble.
- MINE - The coordinates of the minimum energy
configuration are saved on the .hst file (overwriting the
previous configuration) every time a minimum is found, in
binary format. For free energy runs the
coupling parameter is written as an additional record.
For (T,P,N) ensemble runs the three edge parameters (see key
PBCN) are also saved as an additional
record.
- ALLC - Trajectory file contains full configurations
in formatted (ASCII) records as follows:
- 1st line: number of atoms written (natomp), number of
molecules, total number of atoms, first solute atom written
out, last solute atom written out, MC stepnumber, coupling
parameter in format(i6,9x,i6,8x,i6,7x,2i6,5x,i9,4x,f8.0)
- Next natomp lines: x,y,z coordinates (in Å)
(3f15.5) format
- Last -1 line: ins/del stepnumber, total energy, number of
accepted insertions and deletions, umbrella sampling weight
factor for the configuration
- Last line: solute-solvent energy and its contributions for
each interaction term. Important: only the first three atoms of
each solvent molecule are written out.
For (T,P,N) ensemble runs the three edge parameters (see key
PBCN) are also saved
in an additional record in (3f10.5) format.
- ALLA - Trajectory file contains full configurations
file in formatted (ASCII) records as follows:
- 1st line: number of atoms written (natomp), number of
molecules, total number of atoms, first solute atom written
out,i last solute atom written out, MC stepnumber, coupling
parameter in format(i6,9x,i6,8x,i6,7x,2i6,5x,i9,4x,f8.0)
- Next natomp lines: atomic symbol, x,y,z coordinates
(in Å), group (residue) number, partial charge with
(a4,1x,3f15.5,i5,f10.5) format
- Last -1 line: ins/del stepnumber, total energy, number of
accepted insertions and deletions, umbrella sampling weight
factor for the configuration
- Last line: solute-solvent energy and its contributions for
each interaction term. Important: only the first three atoms of
each solvent molecule are written out.
- For (T,P,N) ensemble runs the three edge parameters (see key
PBCN) are also saved in an additional
record.
- ALLB - Trajectory file contains full configurations
in binary format:
- First record: number of solvent
molecules, number of atoms, umbrella sampling weight factor for the
configuration, MC stepnumber, insertion/deletion stepnumber, number
of accepted insertions, number of accepted deletions, first atom
written out, total energy, solute-solvent energy and its
contributions for each interaction term
- Second record:
the whole configuration
- For free energy runs the coupling parameter is written as an
additional record. Again, only the first three solvent atoms
are written out.
- For (T,P,N) ensemble runs the three edge parameters (see key
PBCN) are also saved in an additional
record.
- ALLP - Trajectory file contains full configurations
in PDB format.
The first record is a REMARK giving the same information as the
first line of the ALLA format in
format(9x,i6,3x,i6,5x,i6,7x,2i6,5x,i9,4x,f8.0) and the next two remarks
- ALLH - Trajectory file contains full configurations
contains full configurations
in CHARMM CRD format.
The first title line gives the same information as the
first line of the ALLA format in
format(9x,i6,3x,i6,5x,i6,7x,2i6,5x,i9,4x,f8.0)and
the next two title lines mirror the TITLE lines.
- CHRM - Trajectory file contains full configurations
in CHARMM binary trajectory format.
For runnum=1 the .DCD extension is also allowed.
- AMBR - Trajectory file contains full configurations
in AMBER format. The program will
check if the system is centered around (0,0,0) or
(Ex/2,Ey/2,Ez/2).
Key2 (level of solute changes):
- RGFX - The solute is rigid and is not moved during the
simulation
- RGMO - The solute is rigid and but it is allowed to
move during the simulation
- FLEX - The solute is flexible
Key3 after Key1=ALL*
(level of solute atom saving):
- ALST - All solute atoms are saved
- MVST - Only the solute atoms affected by partial
solute moves (PARD key) are saved
- NOST - Solute atoms are not saved
Key3 after AMBR (writing PBC information):
- NOBX - No box-size information after configurations
- BOX - After each configuration the simulation cell
edges are written in a new line
- BOXX - Read the box description after each
configuration but keep using the values inputted with the
SIZE command
Data: natskip, runnum, runvers,
tfilename
- natskip {0} - Number of atoms to skip after having read
a configuration from a trajectory file.
Ignored with MMC trajectories.
- runnum {current run number} - The
run number
of the trajectory file.
- runvers {1} - The version number of the (initial)
trajectory file.
- tfilename {jobname} - The trajectory file
is named tfilename.runnum_runvers.hst
When running a MC simulation, full configurations
(Key1=ALL*)
will be written on the history file
at every nmcrec-th MC step
(read with the RUNS key).
III.01.07. History file buffer save
BUFF
When this key is present, the buffers of the open history files are
written out whenever a new checkpoint file is saved.
This is currently achieved by rewindig the file and skipping to
the end.
III.01.08. Insertion/deletion log file specification
IDLG
Prerequisite: GCEN,
Key1 (ins/del log file creation):
- OFF - No log is written.
- ASCI - For each accepted insertion and deletion the
program writes +1 or -1 (for insertion or deletion), the molecule
index, the number of molecules after the insertion/deletion, the MC
stepnumber and the coordinates of the center of mass of the
molecule inserted/deleted on the .idl file.
III.01.09. Proximity analysis distribution plot file
writing
PXPL
If this keyword is present, the program will write an ASCII
file containing all the calculated proximity distributions to the
.pxp file.
The distributions written on the .pxp file can be plotted
with
the companion (Fortran-77) program pxps.f.
pxps.f produces Postscipt files of plots, 8 plots per page.
It can put two distributions on the same plot, but both have to use
the
same dependent variable.
The right-side vertical axis will belong to the second plot.
A typical run of pxps.f is as follows (user input is
displayed in bold face):
% pxps
Proximity plot generator Version:/07/22/98
Data file name root=wt1
File wt1.pxp opened <- the file wt1.pxp
was generated by
the PXPL key
Use @ for: @f=Greek fnt, @n=Normal fnt, @+=superscript
@-=subscript, @m=newline, @#n#=font size n, @@=@symbol
Title of graph page:
GnRH Wild Type, Extended Conformation
Do you want grid lines drawn (y/n)? n <- If yes, only one plot
can be plotted
Plot types:
1 Solute-solvent radial distribution functions <- from key PXGR
2 Solute-solvent pair energy distribution functions <- from key PXBE
3 Mean solvent orientation as a function of R <- from key PXDP
4 Distribution of first shell solvent orientations <- from key PXDP
5 Solvent-solvent radial distribution functions <- from key PXWW
Plot type number=1
Data colums: 1 R 2 gpx(R) 3 Kpx(R) 4 gt(R) 5 Kt(R)
Data column number=2
Do you want a second function also plotted (y/n)? y
Plot types:
1 Solute-solvent radial distribution functions
2 Solute-solvent pair energy distribution functions
3 Mean solvent orientation as a function of R
4 Distribution of first shell solvent orientations
5 Solvent-solvent radial distribution functions
Plot type number=1
Data colums: 1 R 2 gpx(R) 3 Kpx(R) 4 gt(R) 5 Kt(R)
Data column number=3
Number of graphs to print (0: will ask for range)=0
First and last distribution function=1,18
Data set not found (skipped): gkr is= 1 <- empty proximity
regions
Data set not found (skipped): gkr is= 10
File wt1_2_9_gpx_Kpx.ps opened
File wt1_11_18_gpx_Kpx.ps opened
More plots (y/n)? n <- allows to plot other distributions
%
The format of the .pxp file is as follows:
- Header:
- nsltpx, nsltpxgr (2i5)
- ianslt (20i4)
- resnam,atnam (20a4)
- nsltpx - Number of solute atoms analysed
- nsltpxgr - Number of rdf's generated
- ianslt - atomic numbers of all solute atoms analysed
- resnam,atnam - 8-character label of all solute atoms
analysed
- Radial distribution functions (PXGR
key was present). For each distribution function the program writes
- is, mxgrd (' is=',i4,' ngrids=',i4,' gkr')
- (grd(i), gr(i), rcoord(i),
grt(i), rcoort(i), i=1,mxgrd) (5f8.3)
- is - rdf number
- mxgrd - Number of gridpoins
- grd(i) - Radius of the i-th gridpoint
- gr(i) - Primary rdf
- rcoord(i) - Primary running coordination number
- grt(i) - Total rdf
- rcoort(i) - Total running coordination number
- Primary solute pair energy distributions
(PXBE key was present).
For each distribution function the program writes
- is, mxgrd (' is=',i4,' ngrids=',i4,' xpe')
- (egr(i),sltep(i),rcoore(i),i=1,mxgrd)
(10f8.4)
- is - rdf number
- mxgrd - Number of gridpoins
- grd(i) - Energy of the i-th gridpoint
- sltep(i) - Primary pair energy distribution value
- rcoord(i) - Primary running coordination number
- Primary solvent-solvent rdfs
(PXWW key was present).
For each distribution function the program writes
- is, mxgrd, nijgvvp
(' is=',i4,' ngrids=',i4,' ngvv=',i2,' gvv')
- mxgrd lines: grd(i),(groo(i,iv),v=1,nijgvvp)
(10f8.4)
- is - rdf number
- mxgrd - Number of gridpoins
- nijgvvp - Number of rdf types (between different
solvent centers)
- grd(i) - Radius of the i-th gridpoint
- groo(i,iv) - Primary solvent-solvent rdf
- Solvent mean orientation
(PXDP key was present).
For each distribution function the program writes
- is, mxgrd (' is=',i4,' ngrids=',i4,' tav')
- mxgrd lines: (grd(i),tav(i),i=1,mxgrd) (10f8.4)
- is - rdf number
- mxgrd - Number of gridpoins centers)
- grd(i) - Radius of the i-th gridpoint
- tav(i) - Primary mean solvent-solute angle
- Solvent orientation rdf (first shell)
(PXDP key was present).
For each distribution function the program writes
- is, mxgrd (' is=',i4,' ngrids=',i4,' xtd')
- mxgrd lines: (grd(i),xtd(i),i=1,mxgrd) (2f9.4)
- is - rdf number
- mxgrd - Number of gridpoins
centers)
- grd(i) - Angle THETA of the i-th gridpoint
- xtd(i) - Primary solvent-solute angle distribution
III.01.10. Proximity information file specification
PXWR Key1 [Key2 Data]
Key1 (file format - if any):
- OFF - Proximity information writing is turned off.
- ASCI - The proximity information file is an annotated
ASCI file containing the following data for each configuration
analyzed:
(nsv,nmc,ncnfpx,
(iproxi(i),resname(i),resno(i),atnam(i),rmin(i),esvst(i),i=1,nsv)
where
- nsv is the number of solvent molecules
- nmc is the MC stepnumber
- ncnfpx is the number of configurations analyzed so far
- iproxi(i) is the solute atom nearest to solvent i
- rmins(i) is the corresponding minimum distance (in Å)
- esvst(i) is the interaction energy of solvent i with the solute
(in kcal/mol).
- BNRY - The proximity information file is a
binary file containing one record for each configuration analyzed:
(nsv,nmc,ncnfpx,(iproxi(i),rmin(i)2,esvst(i),i=1,nsv)
Important: iproxi is stored as INTEGER*2 to save space.
Note, that the size of a file created with the BNRY option
is about the tenth of the size of a file
created with the ASCI option.
- BNRR - The proximity information file is a
binary file containing one record for each configuration analyzed:
(nsv,nmc,ncnfpx,(iproxi(i),rmin(i)2,i=1,nsv) .
This file is about 2/3rd of the size of the file written with
the BNRY option.
Key2 (read or write):
- WRIT - Write the proximity information
- READ - Read the proximity information from an existing
.pxi file
Data:
- numrunpxi - The
run number
of the .pxi file to read from
(only read with Key2=READ)
The proximity information file has extension .pxi.
III.01.11. Converting/shifting a trajectory
WTRA Key1 Key2 [Data]
Prerequisite: TRAJ
Key1 (transformation type):
- UNCH - Don't change coordinates
- STCT - Shift the system to put the solute's center of
mass at the cell center
- SHCT - Shift the system to move the origin from the
lower left corner of the box to the center (update PBC images
accordingly)
Key2 (trajectory format):
- One of ALLC, ALLA, ALLB, ALLP -
see key TRAJ.
Note, that ALLB is the most compact.
Data: nmcmax, nmcfreq, nmcskip,
incvers
This key allows the conversion of the input trajectory to a new
format
and gives the option of centering it in two different ways.
III.01.12. Contact elimination
STIR [Data]
Prerequisite: CNFI
Data: eijmin, disp, dispa
- eijmin {1000.0}, disp {3.0}
- solute molecule pairs with energy above
eijmin kcal/mol will be pulled apart by disp Å
- dispa {90.0} - solute torsion group members with intramolecular
energy exceeding eijmin will be incremented by dispa
degrees.
This option is useful for 'cleanng up' a starting configuration
generated with the RANC option.
III.2. System descriptor keywords
- SIZE: Number of
molecules
- TEMP: Simulation
temperature
- TITL: Description
- PBCN: Periodic boundary
conditions
- GCEN: Grand-canonical (T,V,mu)
ensemble
- STPX: Stop GCE run at
nx
- BTUN: Tune GCE B
parameter
- LIMG: Limit the cavity
grid
- PBGR: Solute reset with
grids
- IBEN: Isothermal isobaric
(T,P,N) ensemble
- CNFI: Initial
configuration
- SCAL: Simulation cell
rescaling
- OVST: Overlap of the second
solute copy with the first
- SPRD: Spread out solute
molecules
III.02.01. System size
SIZE Data
Data: nmolec, nmolfix
- nmolec - Number of molecules. Note that the
solute
(as defined by the SLTA key) always
counts as just one molecule.
- nmolfix {0} - Number of molecules to be kept fixed
during the simulations.
nmolfix=1 immobilizes the solute (moving large solutes is
simply too expensive) and nmolfix>1 allows the treatment of
a solvation shell as a fixed unit.
III.02.02. Simulation temperature
TEMP Data
Data: T
- T - the simulation temperature in Kelvin
III.02.03. Title
TITL Data
Data: description of the calculation.
III.02.04. Periodic boundary conditions and cell
dimension.
PBCN Key1 [Key2 Key3] Data [Fdata]
Prerequisites: SUVC,
SVVC
Key1 (cell shape):
- RECT - Rectangular (cubic) boundary conditions (SC).
Data: edgex [edgey, edgez] (in Å).
- edgex, edgey {edgex}, edgez
{edgex} - The X,Y,Z edges of the rectangle.
- Vol=edgex*edgey*edgez Å3,
Rinsc=min(edgex,edgey,edgez)/2
- FCC - Face-centered cubic boundary conditions
(FCC).
Data: edg
- edg- The edge of the rombohedral dodecahedron is
edg*sqrt(3)/2, Vol=2*edg3Å3,
Rinsc=edg/sqrt(2)
- HEXG - Hexagonal prism boundary conditions (HP).
Data: plen, pedge
- plen - The length of the prism
- pedge - The edge of the hexagon.
- Vol=plen*pedge2*3*sqrt(3)
Å3,
Rinsc=min(plen,pedge*sqrt(3)/2.
- The axis of the prism is the X axis and the vertex of the
hexagon goes through the Z axis.
For analyzing trajectories with different axis conventions, use the
SUVC key.
- SPHR - Sphere boundary conditions (SPH)
Data: rsph
- rsph-radius of the sphere.
- PHS - Primary hydration shell
(Beglov & Roux)
Variable to set to 'T' in the preprocessor: PH
Key2 (solute atom radius source):
- RVDW - Standard Van der Waals radii will be used for
the solute atom radius.
- RSIG - Lennard-Jones SIG/2 values will be used for the
solute atom radius.
Key3 (target energy definition):
- UTTO - uphsref read is the
total restraining energy
- UTPM - uphsref read is the
per molecule restraining energy
- UPML - uphsref read is the
per molecule restraining energy, but solvents farther than
a threshold are ignored for the reference shell energy calculation
Data: uphsref, phsk, rphs,
rphsmin, rphsmax, denphs, nmcphs, rphswid
- uphsref - reference hydration shell energy
- phsk - force constant of the restraining potential
- rsph,rphsmin, rphsmax - initial,
minimum and maximum radius of the shell, resp.
- denphs - solvent bulk density (used to calculate g(r)'s)
- nmcphs - frequency (in MC steps) of shell radius updates
For Key3=UPML only:
- rsphwid - width of the restraining shell contributing
the per molecule restraining energy (used only with Key3=UPML).
- READ - Input boundary conditions.
Fdata:
- RECTYPE 29: rinscr, vol, rmax (3F10.0)
- rinscr - Radius of the largest inscribed sphere into
the cell.
- vol - Volume of the unit cell (in Å3)
- rmax - For minimum image convention the cutoff
radius used is rmax.
- RECTYPE 39: ncell (I5)
- ncell - Number of cell-center coordinates to be read.
- RECTYPE 38's: (cic(k,i),k=1,3) (3F15.0)
- ncell times repeated.
- cic - The negatives of the
coordinates of center i (i.e., -X, -Y, -Z).
Note: Input boundary conditions work only if the image cell
can be found by comparing the distances of a point from the
various cell centers, the smallest distance will belong tot he
wanted cell number. It works when the periodic system
can be described by an orthogonal crystal coordinate
system. For other cases, more complicated calculations are
required - to start with, a transformation to
non-orthogonal coordinate system - but it is not
implemented yet.
For pure liquids, FCC is the best since this maximizes the
distance of a molecule from its image. On the other hand,
RECT is conceptually the simplest. Furthermore, for
RECT the dimensions of the cell can be different along the
three axes, allowing for various types of crystals. Solutes that
are far from spherical can also be wrapped around by waters more
efficiently.
(In this case, however, the solute rotation should be prevented
- see key STEP.) HEXG have been
developed for long
solutes or solutes representing an infinite polymer chain, to
maximize the distance of the chain from its image. SPHR
simply encloses the system into a sphere of inputted radius. Note
that the FCC calculations are only partially vectorized.
For the rectangular, face-centered cubic, hexagonal prism
and sphere boundary conditions there are special algorithms in
the program, one for each. When boundary conditions are inputted,
the cell number of the periodic image cell where a molecule is
located is determined simply by comparing the distance of the
molecule from the various cell centers read in.
The unit cell parameters are determined from the bulk
density or molar volume of the solution, Nmol, and the
choice of boundary conditions. If the density of the system is d
g/ml , then
d (g/ml) = weight in g of Nmol / volume of
Nmol in ml =
(Nmol/L)*Mol.wt / (Vol/1024)
where L is Avogadro's number. The cell parameter(s) can be
obtained from the resulting Vol value.
III.02.05. Grand-canonical ensemble selection
GCEN[ Key1 Key2 Key3 Key4 Key5 Data ]
Fdata
Potential precursors: LIMG
Variable to set to 'T' in the preprocessor: CB
Grand-canonical ensemble simulation using the cavity-biased
method with grid-insertion or the original,
random insertion method.
Key1 (GCE insertion strategy):
- OFF - Return to canonical ensemble.
- CAVB - Cavity biased insertions at gridpoints
- CVBF - Cavity biased insertions at a rectangle around
the gridpoints
- UNBI - Unbiased (random insertions)
Key2 (solute atom radius source):
- RVDW - Standard Van der Waals radii will be used for
the solute atom radius.
- RSIG - Lennard-Jones SIG/2 values will be used for the
solute atom radius.
- RSGH - Like RSIG, but hydrogens with SIGMA=0 will
be assigned SIGMA=1.
Key3 (insertion/deletion selection strategy):
- ALTI - Alternating insertion and deletions attempts
- RANI - Randomly decided insertion and deletions
attempts
Key4 (altervative Pcav definitions):
- ALLG - Pcav is defined as the number of free grids /
total number of grids
- SLVG - Pcav is defined as the number of free grids /
number of grids not covered by the solute
Key5 (strategy to use electrostatic term
for insertion/seletion steps):
- LJQI - All potential terms are used for the I/D
decision.
- LJNC - The electrostatic contributions are
neglected for the insertion/deletion decision. Ensemble averages
for the thermodynamical quantities are not corrected
for the applied approximation.
- LJCN - As in LJNC, but the ensemble
averages for the thermodynamic quantities are corrected for by
utilizing the applied umbrella sampling.
- LJCU - As in LJCN, but the umbrella
sampling is applied to both the insertion/deletion and the regular
moves, and thermodynamic quantities are
corrected for the biasing.
Data: ucspma, usmixp.
- ucspma - Average solute-solvent electrostatic
energy(in kcal/mole). If the supplied average electrostatic
solute-solvent energy is less than -1000 kcal/mole,
then the program uses the starting configuration's
value for the average solute-solvent electrostatic
energy. Umbrella sampling bias is Ew =
usmixp*(us1-n*ucspma) where
us1=solute-solvent electrostatic
energy, and n=number of solvent molecules at a given
configuration.
Fdata:
- RECTYPE 13: ba,
rspha,
pmvslt, ngridx, ngridy, ngridz,
nmolxp,
nmolfx, idfreq, idrepf, nmcransh
(3F10.0,6I5,2I10)
- ba - The B parameter of
Adams. The larger it is, the larger will the
chemical potential be.
- rspha - The cavity diameter in Å.
- pmvslt - The molar volume of the solute (to
be used in the density calculation).
- ngridx, ngridy, ngridz - The
number of gridpoints in the X, Y and Z directions (not used for
GCEN UNBI runs).
- nmolxp - The targeted number of molecules
referred to the key STPX
- nmolfx {1}- The number of molecules that are
not allowed to be deleted.
- idfreq - The frequency of insertion/deletion
attempts (in MC steps).
- idrepf - The frequency of summary line (in
Ins/Del steps)
- nmcransh - The inital grid will be
regenerated after every nmcransh-th MC step with a random
shift in its position.
III.02.06. Stop GCE run at nx
STPX
Prerequisite: GCEN,
When this key is present the run will stop if the number of
molecules reached
nmolxp (read in with the key
GCEN).
III.02.07. Tune the GCE B parameter
BTUN [Data]
Prerequisite: GCEN,
MOND
Data: targetden, nmctunave, nmctunskip,
chabmax, dentol
- targetden {0.997} - The targeted density in the bulk
region as defined by the MOND key.
- nmctunave {100000} - The number of MC steps to use for
collecting data before each B parameter change.
- nmctunskip {nmctunave} - The number of MC steps
to skip
(as equilibration) after each B parameter change.
- chabmax {1.0} - The maximum absolute change allowed for
the B parameter.
- dentol {0.01} - The density range around
targetden
within which the tuning is considered converged.
The program will periodically change the GCE B
parameter
to achieve the targeted density in the 'bulk' region - defined as
the outside region by the MOND key.
III.02.08. Limit the cavity grid
LIMG [Key1,Data]
Key1:
- NOWA - Free movement of solvents in and out of the grid
region
- WARC - Free movement of solvents in and out of the grid
region but counts of border crossings are maintained
- WALL - Movement of solvents in and out of the grid
region
is prevented, counts of crossing attempts are maintained
Data:
glimminx, glimminy, glimminz,
glimmaxx, glimmaxy, glimmaxz
- The limited grid will extend from
glimminx to glimmaxx,
glimminy to glimmaxy,
glimminz to glimmaxz in the x, y, and z directions,
respectively.
For any glim*value that reaches or exceeds the limits of the
simulation
cell (centered at (0,0,0), the periodic boundary conditions will
also be invoked.
III.02.09. Force solute reset with grids
PBGR
When this key is present, the grid calculations will reset
each solute atom to the periodic cell before calculating
the cover list. This is a safety option, since it is automatically
set when the inital structure contains solute atoms outside the
simulation cell.
III.02.10. Isothermal isobaric ensemble simulation (TPN)
IBEN Key1, Key2, Data
Variable to set to 'T' in the preprocessor: IB
Key1 (volume sampling technique):
- NONE - Turn off (TPN) ensemble sampling
- UNIF - (TPN) ensemble sampling with uniformly
distributed steps
- VB3D - (TPN) ensemble sampling with
virial-biased sampling of Mezei.
- VB1D - (TPN) ensemble sampling with refinement of the
virial-biased sampling of Mezei,
using the components of the virial sum to generate cell size
changes
in the X, Y, and Z directions, respectively.
Key2 (volume change (un)isotropy):
- ISOT - Volume change involves all three axes
- IXYZ - Volume change alternates between changes in the
X, Y or Z directions.
- ISYZ - Volume change alternates between changes in the
X dimensions and in the Y-Z dimensions.
directions
Data:
- press {1.0} - The external pressure specified (in
atm)
- vrange {100.0} - The volume change range (in
Å3)
- nvchfreq {nmolec} -
The frequency of volume change attmpts (in MC steps)
- nvchrep {nmolec} - The frequency of reports on
volume change
attmpts (in number of volume change attempts)
- vlam {0.5} - The virial biasing lambda factor.
Note, that putting the key IBEN after CNFI
will result in ignoring the cell size on the .crd file
and the vaulues inputted with the PBCN key will be used
instead.
III.02.11. Initial configuration
CNFI Key1 [ Key2 Key3 Data ] [Fdata]
Prerequisites: FILE,
SIZE,
SLTA
Potential precursor: GCEN,
TORD,
CNFC
PXYZ
Key1 (data source):
- READ : Initial configuration is read from user supplied
.crd file. For Key2=PDB , .pdb is also
allowed, for Key2=CRD , .CRD is also allowed.
Key2 (file format):
- ASCI - Coordinate file will be read/written in
ASCII.
- BNRY - Coordinate file will be read/written in
binary.
- ASAN - Coordinate file is an annotated file written
with WCNF SVAN.
- PDB - Coordinate file is read in
PDB format
- CHRM - Coordinate file is read in CHARMM CRD format
Key3 (configuration check procedure):
- NOFX - Abort if solute on the .crd file
differs form the input definition
- FXCR - If solute on the .crd file differs
form the input definition by more than 0.01 Å,
use the input definition for both
- FXCC - If solute on the .crd file differs
form the input definition, by more than 0.001,
use the input definition for both
- FXIN - If solute on the .crd file differs
form the input definition, use one on the .crd file for
both
- IGND - Ignore if solute on the .crd file
differs form the input definitition
Data: numrunr, jobnamer
- numrunr {numrun}, jobnamer
{jobname} - The program will look for the file
<jobnamer>.<numrunr>.crd
(<jobnamer>.crd when numrunr=1) to read the
coordinates from. If the filename is different from the
default coordinate file name, the program will open a new
coordinate file name using numrun as well, and copy the
coordinates there.
- RANC - The starting configuration is obtained by placing
the solute (as defined by the SLTA key)
at the cell center,
with its COM shifted to the cell center,
and inserting the number of solvents
requested by the SIZE key at random
location and orientation. Overlap with the solute van der Waals
envelop is prevented. Force bias and the calculation of
distribution functions will be turned off.
When solute molecules are cloned, the cloned copies will be
placed at randomly selected points with randomly selected
orientations in the cell.
If torsions are active (see key PART)
then the torsion angles are also set randomly.
No tests for overlap will be performed.
See also the key STIR
for the elimination of persistent contacts.
Key2 (file format): see CNFI READ
Key3 (determination of cell size and number of solvents):
- SIZE - The cell dimensions given
by the PBCN
key will be used
- NTOS - Cell dimensions determined from externally
provided partial molar volumes and number of molecules. For
PBCN RECT: all edges are the same;
for PBCN HEXG: provide the
prism length as plen with PBCN HEXG
- STOS - Cell dimensions are determined from
externally provided solute and solvent partial molar volumes and
number of required shells. Data inputted with the keys
SIZE and PBCN
will be disregarded.
- STON - Number of solvents is determined from the
externally provided partial molar volumes and cell dimensions.
Data: pmvslt, pmvslv, rnshll,
cplpar
- pmvslt - Partial molar volume of the solute in
ml/mole.
- pmvslv {18.12004 (water at 25 deg C)} - Partial
molar volume of the solvent in ml/mole.
- rnshll {2} - The number of solvent shells required
around the solute in the simulation cell (only used for
Key3=STOS)
- cplpar {0.0} The coupling parameter value to use
for the creation of the third copy (for
FREE PMF1 or
FREE PMNLruns)
- RANI - Same as RANC, except that the center of
the inputted solute coordinate system will be placed at the cell
center.
- TRAJ : The initial configuration is gathered from the
trajectory file opened previously.
Key2(file format): see CNFI READ
- RPSU : A built-in procedure is used. The first molecule
of an old system is replaced by the solute. The COM of the old and
new solute will coincide. Solvents overlapping with the Van der
Waals envelope of the new solute will be discarded. The
resulting configuration will be written on the .crd file.
Also, DSTC NONE and
SAMP METC will be set.
Key2(file format): see CNFI READ
Data: numrunr, jobname, noslt
- jobname, numrunr - Old configuration is to be
read from file <jobname>.<numrunr>.crd
- noslt - Number of solute atoms in the old
configuration.
Fdata:
- RECTYPE 23: iclslt(1),...,iclslt(noslt)
(14I5)
- iclslt(i) - Atomic number (for EPEN potential) or
atom type (for Clementi potential) of the I-th atom in the old
solute.
- RPUV : The starting configuration is obtained from an
old system by replacing both the solute and solvent molecules in
such a way that the orientation matrices are preserved. Both
the number of solute and solvent atoms can change.
Key2: see CNFI READ
Data: numrunr, jobname, noslt,
noslv, iop37o
- jobname, numrunr - Old configuration is to be
read from file <jobname>.<numrunr>.crd
- noslt - Number of solute atoms in the old
configuration.
- noslv - Number of solvent atoms in the old
configuration.
- iop37o - Potential identifier of the old solvent
(0-1-2 <=> MCY-TIP3P-TIP4P).
Fdata:
- RECTYPE 25's: (not needed for
SLTA SLVT).
- noslt records - one for every atom of the old
solute:
- ioslt(i),(coslt(j,i),j=1,3) (I5,3F15.10)
- ioslt - Atomic number for atom i (6 for C, 8 for
O, etc.) for EPEN potential and atom type for all the other
potentials.
- coslt - Coordinates of the old solute.
- RECTYPE 26's: (read only when iop37o .ne. 0)
- noslv records - one for every atoms of the old
solvent:
- ioslv(i),(coslv(j,i),j=1,3) (I5,3F15.10)
- ioslv - atomic number for solvent atom i.
- coslv - X,Y,Z coordinates of atom i in the old
solvent.
- PMFN - Create an input file for conformational
transition from a file with a fixed solute conformation. It takes
the conformation given after SLTA
(RECTYPEs 14) as Ro reads in the R1
conformation
(together with new charges) and prepares the linear combination of
Ro and R1.
All three solute conformations
will be stored as a single solute on the .crd file, along
with the value of the parameter CP (read from input). Note, that it
assumes that the solute in the input configuration is in the
Ro conformation.
Key2: see CNFI READ
Data: numrunr, jobname, nstfa0new,
cplpar
- jobname, numrunr - The input file containing
a full configuration with the solute in its initial conformation is
<jobname>.<numrunr>.crd . The solute
conformation has to be the one described by the data after
SLTA.
- nstfa0new - the number of FE solute atoms in the
file <jobname>.<numrunr>.crd
.
- cplpar - The starting value of the coupling
parameter CP.
Fdata:
- RECTYPE 31: iclslt(i),(cslt(j,i),j=1,3),
qslt(i) (I5,4F15.5)
- iclslt, cslt, qslt - Description of
the second solute conformation. Identical meaning as for the data
after SLTA. The number of real solute
atoms is nslt
in the above description. This same number should appear also
with the keyword SLTA. However, the
program will
replace it with 3*nslt, corresponding to the fact that
it stores three copies of the free-energy solute, one each of the
it stores three copies of the free-energy solute, one each of the
two extremal conformations and the combined
conformation.
- RPUU - Same as RPSU, but no solvent molecule is
discarded.
III.02.12. Simulation cell rescaling
SCAL Data
Data: e1, e2, e3
- e1, e2, e3 - Same meaning as edgex,
edgey, edgez for the PBCN
key, defining the new cell. The COM's of the solvents will be
scaled
to fill in the new box evenly.
III.02.13. Overlap of the second solute copy with the
first
OVST Data
Data: i1, i2, i3
- i1, i2, i3 - The second copy of the solute
will be shifted and rotated to best overlap with the first. The
overlap is based on the coordinates of atoms i1, i2,
i3. If any of them are zero, the first three atoms will be
chosen. This is useful for free-energy calculations to reduce the
change in the solute that the calculation has to model.
III.02.14. Spread out the solute molecules for display
SPRD [Data]
Prerequisite: CNFI
Data:
- numrunw {numrun} - the runnumber of the PDB file
to be written.
This key generates a PDB file containing the solute molecules
spread out
in the x-y plane so that they don't overlap.
III.3. Free-energy calculation keywords
- FREE: Free energy
options
- RAUS: AUS parameter
change
- WMAT: PMF matching
- WPLT: AUS iteration plot
- TIQU: TI quadrature
- OVRA: Overlap ratio
evaluation
III.03.01. Free energy options
FREE Key1, [Key2-4 Data Fdata]
Prerequisite: PBCN
Key1 (free-energy method choice):
- WIDO - Excess chemical potential is calculated using the Widom insertion
method with cavity modification.
This option requires either a subsequent SCAN key
or the PXAN key before the
RUNS key
Data: rspha, ngridx, ngridy,
ngridz,
ngtry, nitry, ishftx, ishfty,
ishftz, nmolwid; nmolwid times
(ifdummy, ew0)
- rspha - Cavity radius (solute atom sizes are defined
by the vdw radii see key PRAT)
- ngridx, ngridy, ngridz -
Number of grid points in the x, y, z directions, respectively.
- ngtry {1} - Number of different grids (different random shifts)
to use for each configuration
- nitry {1} - Number of insertion trys per grid
- ishftx {0}, ishfty {0}, ishftz {0} -
When isftx=1, no random shift is applied in the x direction;
same for y and z.
nmolwid {1} - Number of different solutes to calculate free energy for.
For each solute:
- ifdummy {0} - When ifdummy=1, the first atom of the
inserted molecule is assumed to be non-interacting (used only to
ensure the best centering into the cavity).
- ew0 {0.0} - expected energy value (will be subtracted
from all calculated energies before forming exponentials, added
back at the end)
- CHIM - the excess Helmholtz free energy
of changing a solvent molecule into a new substance
is calculated using the
perturbation formula without a coupling parameter.
Each solvent molecule is changed to its new form and the energy
of the new substance is evaluated and fed into the perturbation
method formula. Clearly, it is a low-accuracy technique.
This option requires either a subsequent SCAN key
or the PXAN key before the
RUNS key
Data: ngridx, ngridy, ngridz,
ifdummy, ew0
- rspha - Cavity radius
- ngridx, ngridy, ngridz -
Number of grid points in the x, y, z directions, respectively.
nmolwid {1} - Number of different solutes to calculate free
energy for.
For each solute:
- ifdummy {0} - When ifdummy=1, the first atom of the
inserted molecule is assumed to be non-interacting (used only to
ensure the best centering into the cavity).
- ew0 {0.0} - expected energy value (will be subtracted
from all calculated energies before forming exponentials, added
back at the end)
- TICA - Energy coupling is used to change from one solute
to an other:
E(CP) = CPtiexp*E1 +
(1-CP)tiexp*Eo
The control function of the TI integrand will be plotted at the
end of the run and error estimates will be provided for it.
- PMLI - Same as TICA except that
the program will calculate the free energy difference between the
states precombined with the CIMX or CCMX key
using the perturbation method.
The control function of
<exp(-(E(uscp1)-E(cplpar))/kT)>cplpar
will be plotted at the end of the run and error estimates
will be provided for it.
- PMNL
- The calculation is performed at a fixed
coupling
parameter value CP (atomic coupling) to get free energy dfferences
between states CP and CPo and states CP and
CP1 with both the perturbation method and overlap ratio
method.
- Key2 for Key1=TICA, PMLI, or PMNL
(pre-combining):
- NOMX - No combining of states. The (excess) free
energy difference will be calculated between the two states
represented by the two free energy solutes defined with the
SLTA key.
- CIMX - Prepare the initial and final states as
the linear combinations of the inputted ones (after the
SLTA
key) with coupling parameter values uspar0, uspar1.
The potential coefficients will be similarly combined.
During the run, these states
will then be considered to correspond to coupling parameters 0 and
1, respectively. The coupling parameter cplpar will
transformed accordingly to maintain the ratio of the forward
and backward coupling parameter change.
- CCMX - The same linear combinations will be done as
for CIMX but both for the conformation pair given after
SLTA and for the conformation on the
.crd file. In
addition to that, the two solute conformations on the .crd
file are assumed to have been be obtained previusly as linear
combinations of the two inputted ones with coupling parameter
values uscpo0, uscpo1 (read with formatted input,
see below). This allows the preparation of a new
'window' from a previously used (and equilibrated) 'window'.
Data (for Key1=TICA only):
ngquadp, igquadp,
epstol, sigtol, qtol, xyz2tol
- ngquadp {0}, igquadp {0} -
if not zero, the coupling parameter
will be the igquadp-th point of the ngquadp-point
Gaussian quadrature
- epstol {0.0} - When the LJ sigma values of all atoms
in a group differ by less than epstol then that group will
be coupled linearly, i.e., tiexp = 1 will be used.
- sigtol {0.0}, qtol {0.0}, xyz2tol
{0.0} - similar tolerance values for the LJ epsilon, atomic charge
and distance square.
Data for key1=PMNL:
gmor0k, gmor1k, gdvork (X20,5F10.0)
- gmor0k, gmor1k - the energy difference
distributions will be accumulated
for the overlap ratio method calculation
in a grid starting at gmor0k, gmor1k
for the initial and final states, respectively.
Each distribution contains 100 grids with a grid width of
gdvork.
Fdata (for Key1=TICA, PMLI, or PMNL):
- RECTYPE 5: tiexp, cplpar, uspar0,
uspar1, uscpo0, uscpo1 (3F5.2,5X,5F10.0)
- tiexp - Coupling parameter exponent array. Its
elements give the coupling parameter exponents for the
1/r12, 1/r6 and 1/r potential energy terms
(Polynomial TI).
When tiexp(i)>0, in using TI
(FREE TICA), the CP and
(1-CP) factors for the corresponding potential energy
terms are raised to the tiexpi(i)-th power).
- cplpar - Coupling parameter value to be used in
the energy expression. Ignored when ngquadp > 0 !!
- uspar0, uspar1 -
when Key2 = CIMX or CCMX
the free energy difference will be determined
between states represented by these two
coupling parameter values (instead of between the
two states defined by the key
SLTA),
using the perturbation method (Key1 = PMLI or PMNL)
and the overlap ratio method (Key1 = PMNL).
- uscpo0 {0},uscpo1 {1} - The coupling
parameter values that had been used to prepare the solute
previously on the .crd file. It is used when a starting
configuration is generated for an intermediate coupling parameter
value from a configuration that itself was already an
intermediate one (Key2 = CCMX).
- PMF1
- The conformational coupling parameter CP will
be sampled with umbrella sampling using either harmonic or adaptive
weighing. If the solute includes molecules not involved
in the coupling the affected molecules have to be specified
with the key PMFM.
Key2 for Key1=PMF1 (coupling type):
- GENL - Linear coupling in Cartesian space
- WRMM - Linear coupling in Cartesian space, but the
solute has two rigid molecules that move together as the coupling
parameter changes.
- WRFM - Like WRMM, but the second molecule
solute is kept fixed. The second molecule is not considered part of
the free energy solute and should be specified among the non-free
energy solutes.
- WRFR - Like WRFM but the first molecule is not
only displaced but also rotated.
- TORS - Coupling parameter is defined along torsional
coordinate(s). Requires also the TORD
key.
- WRTR - Linear coupling of the center of mass of two
positions of a single molecule, followed by translation and
rotation
in certain directions.
Key3 (solute change type):
- GEOC - Only the solute geometry changes
- GEQC - Only the solute geometry and partial charges
change
- GDQC - The solute geometry, partial charges and atom
types all change
Key4 (umbrella sampling type):
- HUSS - Umbrella sampling is based on a harmonic
potential.
- AUSL - Adaptive umbrella sampling is used with linear
interpolation for the UAS weighths
- AUSE - Adaptive umbrella sampling is used with exponential
interpolation for the UAS weighths
Data: (for FREE PMF1 WRF* ****
**** nmolcnt2
- nmolcnt2 - The second molecule of the PMF calculation
that is kept fixed. This information is used
to generate the common center of the two PMF molecules to make
sure that
all molecules interact with the same images of both PMF molecules
Data: (for FREE PMF1 WRTR ****
**** cdpmfx, crpmfx, cdpmfy,
crpmfy, cdpmfz, crpmfz
- cdpmfx, cdpmfy, cdpmfz - Center of mass
displacement step size limits
(see key STEP) in
the X, Y, and Z directions, respectively. Zero value in any
direction
will restrict the motion to the other, nonzero directions.
- crpmfx, crpmfy, crpmfz - Rotation
step size limits (see key STEP) around
the X, Y, and Z axes, respectively.
Fdata for
FREE PMF1 **** **** **SS):
- RECTYPE 6:
nsweep, cplmin, cplmax, delcpl,
c0cplh,
p0cplh, wcplcha (15x,i5,6F10.0)
- nsweep {0}. If > 0, an adaptive umbrella
sampling iteration will continue if the interval was traveled at
least nsweep times.
- cplmin, cplmax - minimum and maximum of
the coupling parameter CP to be sampled in this run.
- delcpl - Stepsize parameter for CP.
- c0cplh, p0cplh - With HUSS, the
umbrella sampling weights are obtained as
exp(c0cplh*(CP-p0cplh)2).
- wcplcha {1.0} - Relative weight of coupling
parameter moves (compared to other types of solute moves).
Fdata (additional) for
FREE PMF1 **** **** AUS*):
- RECTYPE 7: iopnrm, iopeql, iopenc,
nitssk, faclim, fcenc1,
- smplmx, ratmax, rldvmx, diffmx
(4I5,6F10.0)
- RECTYPE 8: ngovmn, nsubmn, ngrcor,
nwtst, fcenc2, encexp, tolera
(4I5,6F10.0)
- iopnrm - Option to select matching algorithm in
adaptive US.
- iopnrm < 0: linear n-step optimized:
- iopnrm = -1: matching is done on the
probabilities
(not recommended).
- iopnrm = -2: matching is done on the
potential of mean force.
- iopnrm = -3: matching is done on the potential of
mean force, with each increment matched separately (Dutch method).
- iopnrm = 0: nonlinear n-step optimized.
- iopnrm > 0: nonlinear n-step optimization,
with
regenerating initial guess from successive 1-step
optimization at every iopnrm-th iteration (after
resorting the iterations based on the range covered).
- iopeql - When >0, self test for equilibration in
the adaptive umbrella sampling is performed by screening previous
iterations at every iteration and dropping
iterations 'contradicted' by subsequent ones.
- iopenc - Controls the strategy to force the
adaptive umbrella sampling to extend the sampled range of the
coupling parameter by temporarily modifying the weights.
- iopenc = 0 : No weight modification.
- iopenc = 1 : Weights are modified gradually as
the number of iterations during which a grid is unsampled
grows. ("global encouraging")
- iopenc = 2 : Suddenly abandoned regions
will get higher weight.
- iopenc = 3 : 1 and 2 combined.
- nitssk - Number of iterations to be skipped when
equilibration is found to be necessary during adaptive US.
- faclim {0.5} - K factor used to limit the
sampling to the requested interval.
- fcenc1 {2.0} - C factor used to encourage
sampling of new, undersampled region.
- smplmx, ratmax {2.0} - The allowed ratio
between W(CP) over neighboring grid points is limited to the range
[1/ratmax, ratmax] when the frequency of sampling of
that grid point is under smplmx.
- rldvmx {0.0001} - Maximum relative deviation
allowed in nonlinear optimization.
- diffmx {0.9} - Threshold value for the iteration
difference indicator.
- ngovmn {4} - Minimum number of grids to overlap
for two iterations to be considered in the screening.
- nsubmn {5} - Iteration is rejected if one out of
every nsubmn subsequent overlapping iterations give
difference indicator > diffmx.
- ngrcor {5} - Number of gridpoints over which the
weighting is to be modified when sampling
encouragement is used.
- nwtst - Partial results of adaptive step is
printed for iterations until nwtst. If nwtst=#WI
(see Section VI.), stop the run after
saving the data on the iteration.
- fcenc2 {1.1} - For global encouraging, unsampled
grid weight will be multiplied by fcenc2 in each
iteration.
- encexp {0.75} - for global encouraging, sampled
grids modifying weights will be replaced by their encexp-th
power.
- tolera {0.06} - The maximum distance of the
coupling parameter from the interval defined by cplmin and
cplmax.
III.03.02. Adaptive Umbrella Sampling parameter change
RAUS Fdata
Fdata:
- RECTYPE's 6-8 as described with the
FREE PMF1 **** **** AUS*
keywords followed by
- RECTYPE 9: iterw0, newlim (2i5)
- iterw0 - When iterw0 > 0, use only iterations
from
iterw0 for
the matching. When iterw0 < 0, restore dropped
iterations.
- newlim - If iterw0 = 1, iterations are discarded
until the newly inputted region is reached and further
nitskp iterations are performed (as equilibration).
III.03.03. PMF matching
WMAT Key1, Data, Fdata
This function joins the adaptive umbrella sampling estimates
of the potential of mean force obtained from the checkpoint files
of calculations on overlapping coupling parameter ranges (windows).
Key1:
- REGL - checkpoint file runnumbers are evenly spaced
- LIST - checkpoint file runnumbers are irregularly spaced
Data for Key1=REGL:
nruns, numrunr, incrnumrun
- nruns - Number of runs to match
- numrunr {numrun} - Runnumber for the checkpoint
file containing the first segment (window)
- incrnumrun {1} - Run number increment, i.e., the i-th
window's checkpoint file was generated by run number
numrunr+incrnumrun*(i-1).
Data for Key1=LIST: nruns, numrun1,
numrun2
...
- nruns - Number of runs to match
- numrun* -
Run number
of the checkpoint files to match (nruns number)
Fdata:
- RECTYPE 48: ioplot, delpl1,
delpl2, r0, cx, pmn, imatread
(I5,5F10.0,I5)
- ioplot - If ioplot>0 then the program writes on
the file <jobname>.<numrun>.wmp the matched CP
and w(CP) and sampling frequency p(CP) values (for further
plotting). If ioplot=2 then the values outside the matched
intervals will not be written (to produce a contiguous plot).
- delpl1, delpl2 - To be added to the values of
w(CP) and p(CP) (to shift plots).
- r0, cx - The CP values on the .crd file
will be transformed into r0+cx*CP.
- pmn - Minimum sampling frequency for a grid point to
be written on the .wmp file (to keep noise out of the
plots).
- RECTYPE 49: (read if imatread > 0):
(match(i),i=1,nruns+1) (11I5)
- match - matching point specification. match(1)
and match(nruns+1) specify the beginning and the end
point of the complete coupling parameter interval. match(2)
is the gridpoint used to match runs 1 and 2. In general,
match(i+1) is the matching point between runs i and i+1.
Zero at any place results in default values. The default matching
points are determined based on the combined sampling frequencies
and the interval defaults to the [0.0,1.0] coupling parameter
interval. The whole record can be omitted.
III.03.04. AUS iteration plot
WPLT Key1, Data, [Data]
Key1 (scale for data points):
- LIN - circles with radius proportional to the extent of
sampling will be drawn at each data point
- SQRT - radius will be proportional to the square root of
the sampling extent
- LOG - radius will be proportional to the logarithm of
the sampling extent
Data: PSfile,nfstplt, maxit,
nfrplt, nrad
- PSfile - name of the file the plot will be written
in Postscript format
- nfstplt {1}, maxit {number of iterations},
nfrplt {1} - plot data for every nfrplt -th iteration
from iteration nfstplt to iteration maxit
- nrad {5} - the radius of the largest circle (in pixels)
This key will produce a family of curves showing the PMF calculated from
each iteration as lines, with circles representing the extent of sampling
at each coupling parameter grid. The matched PMF is shown with 'x' symbols.
III.03.05. TI quadrature
TIQU [Data] Fdata
Data: numrunr, incrnumrun
- numrunr {numrun} -
Run number
for the checkpoint
file containing the first quadrature point run
- incrnumrun {1} - Run number increment, i.e., the i-th
quadrature checkpoint file was generated by run number
numrunr+incrnumrun*(i-1).
Fdata:
- RECTYPE 41: nquad, ncread, nfac,
nskipread, itrapread, nendread (3I5)
- nquad - Number of quadrature points. Currently allowed
values: 3, 5 and 8.
- ncread - If >0, the cumulative averages of the TI
integrands will be read from unit 5 (RECTYPE's 43-45), ncread
numbers per line. Maximum value: 6.
- nfac - The TI integrands read from unit 5 will be divided by
nfac if nfac > 0 (ignored if ncread=0).
- trapread - see RECTYPE 46 below
- RECTYPE 42: (read if nskipread > 0)
(nskp(iq),iq=1,nquad) (8I5)
- nskp(iq) - Number of control function blocks (i.e.,
ncntin long) to skip from the beginning for quadrature point
iq.
- RECTYPE 43: (read if iread > 0) ident
(a80,/,a80)
- ident - 2-line description of the data.
- If iread > 0 then nquad times (one data set
for each quadrature point):
- RECTYPE 44: nbl, ncntin, cplpr
(2I10,F10.0)
- nbl - The number of control function blocks to read
in.
- ncntin - The length of a control function block.
- cplpar - The coupling parameter value.
- RECTYPES 45: ticums(ib,iq),ib=1,nbl,
ncread numbers per line (6E12.6)
- ticums - The cumulative average of the integrand at
each control function block.
- RECTYPE 46: (read if itrapread > 0 ):
idef, ymin, ymax (I5,2F10.0)
- idef,-
If idef=0, the fitted polynomial will be used to calculate
the integrand at the endpoints.
- ymin, ymax -
The value of the TI integrand at
coupling parameters 0 and 1 (if available) for trapezoid rule
integration (to be used when idef > 0).
- RECTYPE 47: (nendread diffferent lines): xmin,
xmax (6F12.6)
- xmin, xmax - New coupling parameter limits. The
program will calculate the integral of the Gaussian approximating
polynomial between these limits. This line is only read if
RECTYPE 43 was present.
This function performs the numerical (Gaussian) quadrature on
thermodynamic integration runs. It prepares a plot of the
polynomial fitting the quadrature ponts and, optionally, performs
other numerical integrations.
III.03.06. Overlap ratio evaluation
OVRA Data
This function obtains the free energy from two overlap ratio method
runs.
Data: numrun1, numrun2
Run number
- numrun1, numrun2 - The
run numbers
that generated the two checkpoint files containing the
energy distributions the method requires.
III.4. Molecule descriptor keywords
- SLTA: Solute
description
- SLVA: Solvent
description
- CLON: Solute cloning
- MAKB: Make bonds
- BRKB: Break bonds
- MOLD: Define solute
molecule limits
- TORD: Torsion angle
definitions
- CSEG: CHARMM segment id
handling
- LOOP: Loop torsion moves
III.04.01. Solute description
SLTA Key1 Key2 Key3, Data, [Data, Fdata]
Prerequisites: FILE,
SVPT, PBCN,
SVPT
Potential precursors: FREE,
SLVA, PMOD,
PROT, CHRG,
BRKB, MAKB,
CLON, MIXR,
MOLD, MODA,
FC14, FC14
Key1 (possible solute periodicity):
- SMPL - Simple solute.
- POLY - Periodic solute along the X axis. Displacement
not made along the X axis.
- XTAL - Solute is periodic along all 3 axes (crystal).
Global solute moves are forbidden.
Key2 (input format):
- MMC - MMC format - specific to this program
(the program
SIMULAID
can help convert structures in different file formats
(e.g., PDB, CHARMM .CRD) to this format).
- PDB - PDB
format input - atomnames will
be converted to atomic numbers
- CRD - CHARMM CRD format input -
atomnames will be converted to atomic numbers
Key3 (data source):
- READ - Read the atom description data
(the formatted input below describing each solute atom)
from the input file, starting from the next line
- FILE - Read the atom description data
from a file with extension .slt .
For Key2=PDB, .pdb is also allowed,
For Key2=CRD, .CRD is also allowed.
Data: nslt, nsltfe, nsltfe0,
nsltpr, numrunr, jobnamer
- nslt - Number of solute atoms. Note that free energy
calculations require multiple copies and nslt includes the
atoms in all the copies.
- nsltfe {0} - The number of solute atoms in all copies
of the free-energy molecule(s).
- nsltfe0 {nsltfe/<number of free energy copies
(1,2 or 3)>}
The number of solute atoms in the first
copy of the free-energy molecule(s).
- For the Widom ((FREE WIDO) or the
chimera (FREE CHIM) there is one copy
of the free energy solute, the atoms in the nmolwid solutes
for which the free energy is calculated. Each molecule has to be a
single (and different) residue.
- For creation/annihilation path calculations
(FREE TICA) or
(FREE PMLI) the free energy solute
consists of two copies of the same molecule in its two different
states.
- For calculations over a path of continuous deformation,
(FREE PMF1 or
FREE PMNL ) the program requires three
copies of the solute: the initial and final states and an
intermediate state. This third copy is essentially a placeholder
for the actula, continually changing intermediate copy used in the
calculation of energies, etc.
- nsltpr {nslt} - Number of solute atoms to echo
- numrunr {1} -
Run number
of the solute description file (only read when Key3=FILE).
- jobnamer {jobname} - File name root of the solute
description file (only read when Key3=FILE).
Fdata: Description of the solute atoms
- RECTYPE 14's: nslt records - one for every atoms of the
solute for MMC format:
tslt(i),(cslt(j,i),j=1,3),qslt(i),
igrslt(i), gcent, mcent, labslt1(i),
labslt2(i), indxrdf(i), pf(i),
spgroup(i) (1x,A4,4F10.0,I5,2a1,2A4,I5,1x,A4,A10)
- tslt(i) - Atom type specification for atom i (not
EPEN) for solute-solvent potential - either its code number or its
4-character label (AMBER or CHARMM only) - or atomic number for
atom i for EPEN solute-solvent potential.
- cslt(k,i) - X,Y,Z coordinates in Å of the atom i.
- qslt(i) - charge on atom i. For Jorgensen OPLS it is
disregarded and internally stored values will be used, unless this
feature is overridden by CHRG INPT.
- igrslt - group (residue) index of the solute atoms.
Used only with SUVC **GC. Important: The
atoms have to be listed by increasing group index.
- gcent - if gcent='G' then this atom will be
the group center, overriding the built in topological algorithm
- mcent - if mcent='M' then this atom will be
the molecular center, overriding the built in topological algorithm
- labslt1(i) { },
labslt2(i) { }
- Optional atom labels label used in proximity analysis output.
Whenever applicable, they are interpreted
as residue name and atom names (four characters each,
residue name left adjusted).
- indxrdf(i) {i} - optional index pointing to the
proximity distributions this solute atom contributes to.
If all indxrdf(i)'s are zero then individual proximity
distributions
will be calculated for each solute atom. If at least one indxrdf(i)
is nonzero then the atoms with indxrdf(i)=0 will not contribute to
any distribution.
- pf(i) - Potential function type for this residue. If
not present, potential types are determined by the
SUPT key.
If present for one atom in a residue it will change the
whole residue.
- spgroup(i)
{ }
If not blank, atoms with the same spgroup may be considered
a single functional group for the proximity analysis (see key
FCGD).
For the Widom insertion (FREE WIDO) or
the chimera method (FREE CHIM)
spgroup in the record of the atom belonging to the
molecule's center will be used as the name of the solute.
- RECTYPE 51's: nslt records - one for every atoms of the
solute for PDB format:
- atnam,resnam,(cslt(k,i),k=1,3)
(11x,a4,1x,a4,10x,3f8.0)
- RECTYPE 50's: nslt records - one for every atoms of the
solute for CRD format:
- resnam,atnam,(cslt(k,i),k=1,3)
(11x,a4,1x,a4,1x,3f10.0)
- resnam,atnam - the analysis label labslt will
be a concatenation of resnam and atnam.
Fdata: RECTYPEs 19-21 are only needed for
SUPT EPEN
(EPEN potentials). The location and the type of each center have
already been
defined above. The program finds their breakdown into
electrons and nuclei automatically thus here only the
repulsion/dispersion term parameters are to be inputted.
- RECTYPE 19's: nslte records - for every solute
electronic center, in the same order as the centers were defined:
as(i),bs(i),ds(i) (3F20.10)
- as - A parameter for solute electron i
- bs - B parameter for solute electron i
- ds - C parameter for solute electron i
- RECTYPE 20's: nslve records - for every solvent
electronic center:
- asv(i),bsv(i),dsv(i) (3F20.10)
- asv - A parameter for solvent electron i
- bsv - B parameter for solvent electron i
- dsv - C parameter for solvent electron i
The units of the various parameters are discussed in
Marchese et al.
- RECTYPE 21: cutin, escale, iopict
(2F20.10,I5)
- cutin,iopict - The radius of the hard sphere
placed at the COM of the solute when iopict = 1 or 2. (Use
it with some caution! There might be some discontinuity problems.)
- escale {1.0} - Energy scaling parameter for the
solute-solvent potential function. All the solute-solvent
pair energies are multiplied by this constant scale factor.
The majority of the potential functions, including EPEN,
break down at very close distances leading to very high
negative energies in certain orientations. Fortunately,
these spurious minima are followed by large barriers as the
molecule moves outward. If the starting configuration has
such close contacts and regular potential is used, then the
total energy of this configuration will be very negative
and it is highly unlikely that the system will ever get out
of this bottleneck. This problem can be circumvented by
redefining the potential function as repulsive in the inner
Van der Waals region.
- escale=0: Regular EPEN potential without any
inner cutoff.
- escale=1: A hard sphere of radius cutin is
placed
at the COM of the solute. This feature is included to expel any
solvent molecule which are in close contact with the
solute molecule. Beyond the cutin distance (see RECTYPE 21),
regular EPEN potential is used.
- escale=2: Van der Waals envelopes are created
around the
solute and solvent molecules to check whether any solvent
molecule is in close contact with the solute.
The program is currently prepared to handle atoms up to atomic
number 86. The atomic number for lone-pair is 89 (in case
of EPEN, it is disregarded) and atomic number 90 is an
electron.
Atomic numbers 91-93 have been reserved for the united atoms
CH1-CH3, respectively. The various
atomtypes for the different
potential libraries implemented and the actual functional form of
the potentials are specified in a separate document.
Important: For EPEN, the different types of centers have to
be grouped together. Lone-pairs should follow nuclei and
electrons should follow nuclei and lone-pairs.
When the coupling parameter is used to determine
conformational transition (FREE PMF1),
three sets of solute
coordinates are used: the two conformations Ro and
R1 and their
combination CP*R1+(1-CP)*Ro. The solute
description thus has to
contain the coordinates, charges of the two conformations as well
as room for the mixture. In this third set, only the atomtypes
have to be given. When CNFI PMFN,
however, only the Ro conformation is required here since
the R1 conformation will be
given at a different part of the input. It is important that the
different descriptions of the conformation give the solute atoms
in identical order. Solute atom type change during the transition
can not involve EPEN.
III.04.02. Solvent description
SLVA [Key1 Data [Fdata]]
Prerequisite: SVPT.
Key1:
- OWNS - Solute-solvent interactions use the solvent
parameters belonging to the water specific to the library: (MCY for
Clementi, TIP3P for AMBER, SPC for GROMOS and TIP4P for OPLS)
- MIXS - Solute-solvent interactions combine the solute
parameter and the specified solvent parameters
- READ - Solvent description is to be inputted.
Data: nslv, islvrep, label
- nslv - Number of solvent atoms per solvent molecule.
Ignored if Key1 is notREAD.
- islvrep {1} - The central atom to be used for
representing the solvent molecule when only one center is needed,
e.g., for proximity analysis, solute-solvent distances will be
calculated from the islvrep-th solvent atom
- label {HOH } - The residue label to use with this
molecule. Ignored if Key1 is notREAD.
Fdata (only for Key1 = READ):
- RECTYPE 15's: nslv records - one for every atoms of the
solvent:
ianslv(i), (cslv(j,i),j=1,3), qslv(i),
labslv(i)
(1x,a4,4F15.0,2x,a4) or (1x,i4,4F10.0,2x,a4)
- ianslv - for general solvent
(SVPT GENL) it
is the atom type (either label or number - see key
SLTA), otherwise the atomic number of atom i.
- cslv - X,Y,Z coordinates in Å of the atom i.
- qslv - charge on atom i to be used in the SLT-SLV
interactions.
- labslv - label of atom i
Pseudo-atoms are to be placed after the regular atoms.
The order of the (water) solvent centers is set by the
program as follows. For MCY, O,H,H,LP, for TIPS, O,H,H; for
TIPS2, O,H,H,Q.
Note: the dipole of the solvent molecule should be parallel
to the X axis for the purpose of the near-neighbor solvent dipole
correlation function computation
(DSTC ALL).
III.04.03. Solute molecule cloning
CLON Data
Data: nclone, iclonef(1), iclonel(1),
ncopy(1), ...
iclonef(nclone), iclonel(nclone),
ncopy(nclone)
III.04.04. Making additional bonds
MAKB Data
Data: nmake, imake1(1), imake2(1),
...,
imake1(nmake), imake2(nmake)
- nmake - Number of bonds to make
- imake1(i), imake2(i) - The i-th new bond
connects atoms imake1(i) and imake2(i)
Note that this key is not implemented for the case when cloning
(CLON) is in effect.
III.04.05. Breaking bonds
BRKB Data
Data: nbreak, ibreak1(1), ibreak2(1),
...
ibreak1(nbreak), ibreak2(nbreak)
- nbreak - Number of bonds to break
- ibreak1(i), ibreak2(i) - The i-th bond to
be broken
connects atoms ibreak1(i) and ibreak2(i)
Note that this key is not implemented for the case when cloning
(CLON) is in effect.
III.04.06. Defining molecule limits
MOLD Data
Data: molslt, ilastm(1), ilastm(2),
...
- molslt - Number of molecules in the solute
- ilastm(i) - Last atom of molecule i.
This key overrides the molecular topology for defining the
molecules within the solute. This is useful when the
molecule is distored and too many incorrect bonds are
found or too many bonds are missing.
When cloning (CLON)
is in effect, the definitions have to be
repeated for all clones.
III.04.07. Torsion angle definitions
TORD Key1 Key2 Data Fdata
Prerequisite: SLTA,
PRMF
Potential precursors: FC14,
FREE, LOOP,
PARD, SLTA
Key1 (torsion definiton source):
- INPT
Data: ntorsinp, nsltcop, rngfac,
rngfacl, inrcat
- ntorsinp - Number of torsion angles on the solute.
If given as zero, all possible torsions of the solute will be
active.
- nsltcop {1} - number of solute copies to apply the
inputted torsions. Applicable for free energy simulations where two
or three solute copies are needed (see