Users' Guide to the Monte Carlo Program MMC

for Computer Simulation of Molecular Solutions.

Mihaly Mezei

Department of Physiology and Biophysics

Mount Sinai School of Medicine, NYU

E-mail address: mezei@inka.mssm.edu

Nov. 20, 2001.

TABLE OF CONTENTS

I. GENERAL DESCRIPTION

The present document describes the Force-Biased Metropolis Monte Carlo program for computer simulation studies of molecular liquids and solutions in the canonical, isobaric-isothermal and grand-canonical ensembles and the analysis of such calculations based on the Proximity Criterion. This section provides an overview of the capabilities of the program.

I.1. Systems modeled

The present program was originally written for simulations on infinitely dilute solutions (one solute in (nmolec-1) solvent molecules). As a result of this origin, the program has developed two different treatments of molecules. Molecules labeled solvent have to be rigid and be of the same type (this allows the use of special, rapid energy calculation code). The collection of all other molecules (i.e., the one(s) not declared to be solvents) are called by the program the solute. Thus the solute can consist of a single rigid molecule, a collection of freely translating and rotating rigid molecules or a collection of freely translating and rotating molecules with certain (user specified) torsion angles freely rotating.

The program establishes the connectivity of the solute molecule(s) based on interatomic distances and built-in threshold values. This sometimes leads to missing or extra bonds, especally if the input structure (given by the SLTA key) is obtained from a molecular dynamics simulation. Such defects can be identified by using the FCGA and BNDL keys and corrected by the keys MAKB or BRKB.

Free energy simulations can be performed on selected molecules of the solute (see Sec. I.5.).

The torsion angles to be changed on the solute are specified by the two atoms defining the bond of the torsion (with the key TORD). The torsions can be grouped and it is the group of torsions that is selected for change. The main reason for this is that if a torsion affects atoms that are on an other rotatable bond then it may be efficient to move the affected bond as well. These moves, however, are not correlated (i.e., each is generated by a different random number). The torsion potential parameters can be obtained from a CHARMM or AMBER parameter file.

It is possible to impose periodicity on the solute, either along the X axis (with the key SLTA POLY) i.e., with period equal to the edge of the cell in the X direction - an infinite polymer, or in all three directions (with the key SLTA XTAL) with its unit cell (or a multiple of a unit cell) coinciding with the periodic cell of the simulation (i.e., a crystal). The periodic extension of the solute outside the periodic cell is done automatically by the program.

The intermolecular interactions are assumed to be of the usual 12-6-1 type, i.e., Lennard-Jones terms for the repulsion and dispersion plus electrostatics based on partial charges on atoms (or pseudo atoms). The parameters can be obtained from libraries stores by the program or defined via input. The interactions can be truncated (see keys SVVC, SUVC, SUUC) either by a spherical cutoff (each atom interacts with all others within a certain distance) or by the minimum image convention (each atom interacts with one periodic image of all the other atoms). For the purpose of cutoff or image calculation, distances usually refer to molecule-molecule or residue-residue distances.

When spherical truncation is used, the long-range electrostatic energy can be estimated by the reaction field formula (provided all residues are neutral) - see key RFCR.

I.2. Properties calculated during a simulation

The major outputs of a Monte Carlo simulation (besides the configurations saved for further analysis by the TRAJ key) are as follows.

I.3. Potential functions

Several potential functions are built into the program. For the solute-solvent nonbonded interactions (specified with the SUPT key). Clementi et al.'s library, EPEN/QPEN, Kollman et al.'s libraries (AMBER and AMBER-94), CHARMM22 library, Berendsen-Van Gunsteren library (GROMOS87 and GROMOS96); and Jorgensen's OPLS library. Note that for the GROMOS force field only the combination rule can be used - the non-bonded parameters for pairs are not implemented. Except for the EPEN form, parameters are stored internally, but they can be modified from input and allowance is made for inputting new atom types besides the stored ones. As these parameter libraries are continually updated and extended, it is advisable to check if the value stored is the best one (using the PLBP key). For aqueous solutions, the solvent-solvent interaction (specified with the SUPT key) can be either in the form of MCY-CI type or TIPS/TIP3P/SPC type or TIPS2/TIP4P/B-F type potential. For the MCY-CI type the parameters of both the MCY-CI-II and the Yoon-Morokuma-Davidson parametrizations are built in. For TIPS2/TIP4P/B-F types, the A, B and q parameters may also be provided by the user, but the TIPS, SPC, TIP3P, Bernal-Fowler, TIPS2 and TIP4P water model parameters are built in. For solvent other than water, 12-6-1 type potential can be used with parameters from the AMBER, GROMOS or OPLS libraries.

There is a framework for adding field-dependent potentials to the currently implemented atom-atom potentials, but this requires some additional coding.

I.4. Sampling techniques

Various sampling techniques can be selected with the MOVE key. Translations and rotations can be sampled with the original Metropolis Method or with the Force-Biased algorithm of Pangali, Rao and Berne (not implemented for multimolecular solute moves, though). Force biasing, of course, necessitates the recalculation of energies and forces at every accepted step, thereby significantly increasing the computer time required to perform a move. The improvement in convergence, however, more than outweighs the added cost. By default a lambda value of 1/2 is used but the scaled force-bias technique where lambda is scaled down near the solute has also been implemented.

Molecules to be moved can be selected randomly, in a cycle or in a shuffled cycle or using the preferential sampling scheme of J.C. Owicki, which leads to improved statistics for the local solution environment of the solute (see key MOVE PRF*). The cavity-biased insertion technique for grand-canonical ensemble simulations and the virial-biased volume-change for the isobaric ensemble have also been implemented.

I.5. Free energy methods

Free energy simulations can be performed using a coupling parameter lambda (CP in the followings) in the energy expression. There are two fundamentally different approaches to generate a path: (A) simultaneous creation/annihilation and (B) continuous deformation. In the first the Hamiltonians of the initial and final states are used in unchanged and the coupling parameter is used to combine the two; in the second case there is a single hybrid Hamiltonian that is based on a parametrization that is a mixture of the two systems. For multimolecular solute, the free energy simulation can be applied to just one or two solute molecules, depending on the type of coupling.

I.5.A. Creation/annihilation path

I.5.A.1. Creation/annihilation path to be used in thermodynamic integration (FREE TICA). Here

E(CP)=CPk*E1+(1-CP)k*Eo

where Eo and E1 represent the Hamiltonians initial and final state, respectively. k=1 represents linear coupling, k>1 has been referred to as or 'almost linear' coupling (e.g., k=4 for 1/r12 repulsion). It is also possible to use different k values for the different types of energy terms (1/r12, 1/r6, and 1/r) with the so-called polynomial TI. Eo and E1 can differ in conformation, partial charges, and chemical identity (the present version requires equal number of atoms in the two states - this can always be achieved with he use of dummy atoms. It is also possible to provide two more coupling parameter values CPo and CP1 such that CPo < CP < CP1. The program will compute the free energy difference between states represented by E(CPo) and E(CP1) with the perturbation method as the sum of two free energy differences: A1-Ao = -kT ln<exp((E(CP1)-E(CP))/kT)>CP + kT ln<exp((E(CPo)-E(CP))/kT)>CP. <E(CPo)>CPo and <E(CP1)>CP1 are also computed for consistency check. The two molecules have to contain the same number of residues (groups), but can have different number of atoms.

I.5.A.2. Perturbation method based on a sequence of intermediate states (FREE TILI). This option performs a perturbation method calculation between two states generated using the deformation-type path described below in I.5.B. The reference state will be (E(CP1)-E(CPo))/2. In this case, the inital and final states have to have the same number of atoms also.

I.5.B. Deformation path

I.5.B.1. The program also provides option for a fundamentally nonlinear coupling where the coupling parameter changes both the solute coordinates and the potential parameters to generate intermediate states (continuous deformation). Besides creating intermediate Eo and E1 states for the linear or 'almost linear' couplings described above in I.5.A.2, this coupling can be used with the probability ratio method (FREE PMF1) or with the overlap ratio method (FREE PMNL), as described below in I.5.B.2 and I.5.B.3, respectively. The initial and final states may differ in position, partial charge or identity of atoms (there are limitations for EPEN).

In each of the following cases, the path involves one or two molecules of the solute. The following paths are supported:

I.5.B.1.1. Linear combination of two conformations of a single molecule (FREE PMF1 GENL). In this case all atoms of the molecule may change positions independently. The configurations at the beginning and at the end of the path are given on input and the intermediate configurations are convex linear combinations (i.e., R(CP)=CP*R1+(1-CP)*Ro) of these two.

I.5.B.1.2. Moving two molecules of the solute in a correlated fashion (FREE PMF1 WRMM). In this case, the intramolecular distances within the two parts should remain constant. Note, that if the orientation of either part is different in the initial and final states then at the intermediate states the molecule will be distorted.

I.5.B.1.3. Varyig the distance between two solute molecules of the solute but keeping the one of them fixed (FREE PMF1 WRFM). In this case the configurations at the two ends of the path differ only in the conformation of the moving part of the solute. This option may be more economical when one part is large and the other is small.

I.5.B.1.4. Moving and rotating one molecule while keeping the other one fixed (FREE PMF1 WRFR). In this case the center of mass of the first part is moved between the center of masses at the two endpoints and the orientation of the moving part is also changed to transform it from the initial to the final orientation. This option differs from I.5.B.1.3 in that at each point of the path the moving molecule has the same intramolecular conformation.

I.5.B.1.5. Correlated rotation of a set of torsion angles (FREE PMF1 TORS). In this case a set of torsion angles are changed in a user-specified interval using a common CP parameter: CP=0 sets all the selected angles to the beginning of their interval and CP=1 sets all of them to the end of their interval.

I.5.B.2. The solvent contribution to the free energy change between two solute conformations can also be determined by computing the ratio of the Boltzmann probabilities as sampled along a line in the conformation space of the solute (described by any of the paths described in Sec. I.5.B.1, FREE PMF1 **** *USS). This approach is called the probability ratio method. Calculations of this type usually require umbrella sampling techniques. The program allows either the use of a harmonic potential to be given from input or the self-consistent determination of a tabulated weighing function using the Adaptive Umbrella Sampling Method.

I.5.B.2. Any of the paths descibed in Sec. I.5.B.1 can be used to estimate free energy changes simultaneosly using the overlap ratio method of Jacucci and Quirke] and the perturbation method (FREE PMNL). Here the calculations are run at a given fixed E(CP) and two more coupling parameter values, CPo and CP1 are specified, CPo < CP < CP1. For the overlap ratio method, the program will calculate the CP averaged distribution of <E(CP)-E(CPo)>CP and <E(CP1)-E(CP)>CP. Separate calculations at CPo and CP1 will give the needed CPo and CP1 averaged distributions to be used with the overlap ratio method. Simultaneously, the program also calculates the perturbation method differences between the states CPo and CP as well as CP and CP1. Notice that this kind of perturbation method calculation is different from the calculation described with the creation/annihilation path above: there the middle state E(CP) was a linear combination of the initial and final states, E(CPo) and E(CP1). For consistency check, (similarly to the perturbation method calculations described above) the CPo average of E(CPo) and the CP1 average of E(CP1) are also calculated.

I.5.C. Chemical potential.

Free energy can be also obtained from the chemical potential. The excess chemical potential can be obtained either from grand-canonical ensemble simulations (key GCEN), from calculating the excess free energy of changing a molecule to ideal gas (with one of the methods described above, e.g., with key FREE TICA) or by the Widom insertion method (key FREE WIDO). The program implements a version that is based on insertions into cavities.

I.6. Solute move selection

The sampling of the moves is specified as follows. The overall frequency of solute moves is specified by input provided with the STEP key. As discussed above, the whole solute can be a) rotated and displaced; b) one constituent molecule of the solute can be rotated and displaced; c) two constituent molecules can be swapped; d) a group of torsion angles can be changed; and e) the coupling parameter used in a potential of mean force calculation can be changed. If moves of type b, c, d, and e are required, then the program will request a relative frequency (weight) of making that move. Regular solute moves and coupling parameter changes are both given a weight of 1.0 . Thus giving a weight of 2.0 for type 3 moves, out of 4 solute move attempts, 1 will change the solute position, 1 will change the coupling parameter, and 2 will change one of the randomly selected torsion angle groups (on the average, of course). Clearly, the more types of moves on is using, the more frequently has the solute to be moved.

For the free energy options, the treatment of moves of type 2 or 3 depends of the type of path. For the creation/annihilation path using TI (FREE TICA) the torsions are independently sampled on both copies, since these can be, and generally are, different molecules. For the continuous path (FREE PMF1 or FREE PMNL) or the cration/annihilation PM (FREE PMLI), the torsions are treated identically in each copy since here it is assumed that the molecule is changing smoothly. This means, that for a creation/annihilation TI these moves have to be sampled twice as frequently.

I.7. Initial configuration

The starting configuration for the Monte Carlo (see key CNFI) is either read from a previously prepared file, generated from the solute's description (see key SLTA) and randomly placed solvents or from a configuration of another system through various transformations. Grand-canonical ensemble simulation (see key GCEN) is also an efficient way of obtaining a random initial configuration.

I.8. Information needed for a simulation

I.9.1. T, temperature at which the simulations are to be run (specified with the TEMP key).

I.9.2. The coordinates of the nuclei, electrons and pseudoatoms (if any) in a local coordinate framework, and the charges associated with these centers for the solute and (optionally) for the solvent (specified with the SLTA key). The program SIMULAID can help convert structures in different file formats (e.g., PDB, CHARMM .CRD) to MMC format and can optimize the orientation of a solute within a periodic box (to maximize the smallest inter-image distance and thereby minimize the artifactual effect of the solute images), within a boundig box, or optimally center a solute in a sphere (for droplet-type calculations).

I.9.3. Potential specifications. For each solute atom the potential type has to be specified from input (except for EPEN where for each atom and pseudoatom the A, B, and C parameters have to be specified individually). The program SIMULAID can help to convert from PDB, CHARMM, AMBER or Macromodel input files to MMC format. Several water potentials are implemented with specific codes, but solvent can be defined similarly to the solute as well.

I.9.4. nmolec, the total number of molecules used in the simulation (i.e., the number of solvent moecules + 1), specified with the SIZE key.

I.9.5. Size and shape of the periodic cell (specified with the PBCN key).

I.9.6. When generating a new configuration the program can determine nmolec and the simulation cell size parameters from the partial molar volumes of the solute and solvent and the required number of solvent shells around the solute.

I.9.7. For partial solute displacements the groups of solute atoms (independent molecules) to be moved separately (specified with the PARD key).

I.9.8. For solute torsions (specified with the TORD key). the atoms defining the torsion angle and the torsion potential parameters.

I.9. Self testing

There is an extensive self testing routine in the program that tests the integrity/consistency of the data. It protects against both corruption of data and program errors. It is invoked at several ways:

I.10. General analysis

Except for the the bulk distributions and most thermodynamic variables, all the the properties discussed above in Sec. I.2., can be calculated from the history of a simulation as well. Note, that such analyses can be done on simulation trajectories generated by other programs, e.g., by CHARMM or AMBER. In addition to the analyses described above, several other types of analyses are implemented as summarized below.

II. FILE ORGANIZATION.

Beside then standard input (Fortran unit 5) - for the information needed for the run - and the standard output file (unit 6) - printed output, diagnostics, described in Section V. - the program uses a number of different files. In the followings, writing to the standard output will be refererred to as printing.

The file names are structured the following way:

<jobname>[.<numrun>][_<version>].<file type>.

where

III. INPUT DESCRIPTION

The program input is based on keyword driven commands. The structure of a command is as follows:

Of these only the leading keyword appears all the time. Each keyword is a four-character, upper case string. Keywords and unformatted data have to be on the same line. Commands requiring more than one line require the continuation character "~" at the end of the line that is not the end of the command. Several commands can be put on the same line if they are separated by the separator character "|". Text after the comment character "!" is neglected (i.e., considered just comment).

The type of items required are defined by the leading keyword. Auxiliary keywords and data may have default values. The order of keywords and data is fixed, so if there is a needed auxiliary keyword or data, all other items have to be specified that precede the needed one. Formatted data (when required) always start in a line after the keywords and it can not be omitted, but for items with default value, can be secified as zero.

While there is no rigid order of the leading keywords, there are certain rules. Certain leading keywords have prerequisite keywords (i.e., the prerequisite keywords all have to be inputted earlier) and potential precursor keywords that, if used, must be read before the current one. If the same leading keyword is used more than once before a RUNS command, a warning is given and the last occurrence will be used (except for TITL).

During input and initializations the data read is cheked for consistency and for current program limits. This may result in messages prefixed by OVERRIDE, WARNING, STRONG WARNING or ERROR. At the conclusion of the run, the number of each type of messages is printed. ERROR messages prevent a simulation or an analysis run but don't abort the input. When the input has more than one ERROR messages, some of the additional ERROR messages may only be the consequence of the first one. WARNING message indicates that under some circumstances the input may be incorrect and STRONG WARNING message indicates that the input is likely but not necessarily incorrect or that the error is not fatal. OVERRIDE messages are generally harmless, but they can indicate that the input was not what the user intended.

This section describes the input for each leading keyword and specifies their prerequisites and potential precursors. The description of the commands is given by leading keywords. For example CNFI Key1 [Key2 Key3 Data] Fdata means that the leading keyword CNFI has to be followed by at least one other auxiliary keyword and optionally two more ("Key 2 Key 3"), followed by free-format numerical data ("Data") (all on the line containing the leading keyword), followed by formatted data, started in a new line. Note also that the type of (or the need for) auxiliary keys, free-format and formatted data may depend on the actual choice of auxiliary keys. The possible keywords to be used for Key1, Key2 and Key3 and their effects are explained subsequently, as well as the nature and (for Fdata) the format of the data. Keywords are always given with upper case BOLD characters while variable (data) names are given in lower case bold characters. Input item that is optional is given in [ brackets ] and the default values are given in { braces }. The default value for an auxiliary keyword is the first one on the list.

A record type number (RECTYPE) is assigned to every formatted data record. It is used in error messages on the program's output to identifying the type of data read when the error was encountered.

III.1. File handling keywords

  1. FILE: Job name
  2. WCNF: Configuration saving
  3. WCKP: Creation of a checkpoint file
  4. SCKP: Periodic archiving of checkpoint files
  5. RMCK: Checkpoint file removal
  6. TRAJ: History file specification
  7. BUFF: History file buffer save
  8. IDLG: Insertion/deletion log file specification
  9. PXPL: Proximity analysis distribution plot file writing
  10. PXWR: Proximity information file specification
  11. WTRA: Trajectory conversions
  12. STIR: Contact elimination

III.01.01. Job name

FILE Data

Data: jobname[, numrun]

III.01.02. Configuration saving

WCNF

Prerequisite: CNFI, SLTA

Key1 (coordinate file format):

III.01.03. Creation of a checkpoint file

WCKP [Data]

Prerequisite: FILE

Data: numrunw

III.01.04. Periodic archiving of checkpoint files

SCKP [Data]

Data: nsavckpf

Each checkpoint file will have the runnumber of the current run and consecutive version numbers, starting at 2. This is option allows the later calculation of proximity analysis averages over different segments of the run (see key CMPC), provides a backup procedure in case the program crashes while writing the checkpointfile, and is useful for debugging.

III.01.05. Checkpoint file removal

RMCK [Data]

Data: numrunp

This option is useful for preventing the disk to become too full when the calculation is broken up into several RUNS or SCAN steps.

III.01.06. History file specification

TRAJ Key1, [Key2,Key3] [Data]

For MC runs this key sets up the saving or reading of the run history on file <jobname>.<runnum>_<runvers>.hst (For runnum=1 the run number part is omitted and for runvers=1 the version part is omitted). See also the key BUFF.

For analysis (SCAN) or filtering (FILT **** TRAJ) this command specifies the format of the history file. An input history can consist of several files with the same runnumber and with consecutive versionnumbers (i.e., the second, third, etc. piece should have names <jobname>.<runnum>_<runvers+1>.hst , <jobname>.<runnum>_<runvers+2>.hst , etc.).

Key1 (trajectory file format - if any):

When running a MC simulation, full configurations (Key1=ALL*) will be written on the history file at every nmcrec-th MC step (read with the RUNS key).

III.01.07. History file buffer save

BUFF

When this key is present, the buffers of the open history files are written out whenever a new checkpoint file is saved. This is currently achieved by rewindig the file and skipping to the end.

III.01.08. Insertion/deletion log file specification

IDLG

Prerequisite: GCEN,

Key1 (ins/del log file creation):

III.01.09. Proximity analysis distribution plot file writing

PXPL If this keyword is present, the program will write an ASCII file containing all the calculated proximity distributions to the .pxp file. The distributions written on the .pxp file can be plotted with the companion (Fortran-77) program pxps.f. pxps.f produces Postscipt files of plots, 8 plots per page. It can put two distributions on the same plot, but both have to use the same dependent variable. The right-side vertical axis will belong to the second plot.

A typical run of pxps.f is as follows (user input is displayed in bold face):

% pxps

Proximity plot generator Version:/07/22/98
Data file name root=wt1
 File wt1.pxp opened                             <-  the file wt1.pxp
                                                     was generated by
                                                     the PXPL key
Use @ for: @f=Greek fnt, @n=Normal fnt, @+=superscript
 @-=subscript, @m=newline, @#n#=font size n, @@=@symbol
Title of graph page:
GnRH Wild Type, Extended Conformation
Do you want grid lines drawn (y/n)? n   <- If yes, only one plot
                                                  can be plotted
Plot types:
 1  Solute-solvent radial distribution functions       <- from key PXGR
 2  Solute-solvent pair energy distribution functions  <- from key PXBE
 3  Mean solvent orientation as a function of R        <- from key PXDP
 4  Distribution of first shell solvent orientations   <- from key PXDP
 5  Solvent-solvent radial distribution functions      <- from key PXWW
Plot type number=1
Data colums: 1 R        2 gpx(R)   3 Kpx(R)   4 gt(R)    5 Kt(R)
Data column number=2
Do you want a second function also plotted (y/n)? y
Plot types:
 1  Solute-solvent radial distribution functions
 2  Solute-solvent pair energy distribution functions
 3  Mean solvent orientation as a function of R
 4  Distribution of first shell solvent orientations
 5  Solvent-solvent radial distribution functions
Plot type number=1
Data colums: 1 R        2 gpx(R)   3 Kpx(R)   4 gt(R)    5 Kt(R)
Data column number=3
Number of graphs to print (0: will ask for range)=0
First and last distribution function=1,18
Data set not found (skipped): gkr is=   1    <- empty proximity
regions
Data set not found (skipped): gkr is=  10
 File wt1_2_9_gpx_Kpx.ps opened
 File wt1_11_18_gpx_Kpx.ps opened
More plots (y/n)? n   <- allows to plot other distributions
%

The format of the .pxp file is as follows:

III.01.10. Proximity information file specification

PXWR Key1 [Key2 Data]

Key1 (file format - if any):

The proximity information file has extension .pxi.

III.01.11. Converting/shifting a trajectory

WTRA Key1 Key2 [Data]

Prerequisite: TRAJ

Key1 (transformation type):

This key allows the conversion of the input trajectory to a new format and gives the option of centering it in two different ways.

III.01.12. Contact elimination

STIR [Data]

Prerequisite: CNFI

Data: eijmin, disp, dispa

This option is useful for 'cleanng up' a starting configuration generated with the RANC option.

III.2. System descriptor keywords

  1. SIZE: Number of molecules
  2. TEMP: Simulation temperature
  3. TITL: Description
  4. PBCN: Periodic boundary conditions
  5. GCEN: Grand-canonical (T,V,mu) ensemble
  6. STPX: Stop GCE run at nx
  7. BTUN: Tune GCE B parameter
  8. LIMG: Limit the cavity grid
  9. PBGR: Solute reset with grids
  10. IBEN: Isothermal isobaric (T,P,N) ensemble
  11. CNFI: Initial configuration
  12. SCAL: Simulation cell rescaling
  13. OVST: Overlap of the second solute copy with the first
  14. SPRD: Spread out solute molecules

III.02.01. System size

SIZE Data

Data: nmolec, nmolfix

nmolfix=1 immobilizes the solute (moving large solutes is simply too expensive) and nmolfix>1 allows the treatment of a solvation shell as a fixed unit.

III.02.02. Simulation temperature

TEMP Data

Data: T

III.02.03. Title

TITL Data

Data: description of the calculation.

III.02.04. Periodic boundary conditions and cell dimension.

PBCN Key1 [Key2 Key3] Data [Fdata]

Prerequisites: SUVC, SVVC

Key1 (cell shape):

Note: Input boundary conditions work only if the image cell can be found by comparing the distances of a point from the various cell centers, the smallest distance will belong tot he wanted cell number. It works when the periodic system can be described by an orthogonal crystal coordinate system. For other cases, more complicated calculations are required - to start with, a transformation to non-orthogonal coordinate system - but it is not implemented yet.

For pure liquids, FCC is the best since this maximizes the distance of a molecule from its image. On the other hand, RECT is conceptually the simplest. Furthermore, for RECT the dimensions of the cell can be different along the three axes, allowing for various types of crystals. Solutes that are far from spherical can also be wrapped around by waters more efficiently. (In this case, however, the solute rotation should be prevented - see key STEP.) HEXG have been developed for long solutes or solutes representing an infinite polymer chain, to maximize the distance of the chain from its image. SPHR simply encloses the system into a sphere of inputted radius. Note that the FCC calculations are only partially vectorized.

For the rectangular, face-centered cubic, hexagonal prism and sphere boundary conditions there are special algorithms in the program, one for each. When boundary conditions are inputted, the cell number of the periodic image cell where a molecule is located is determined simply by comparing the distance of the molecule from the various cell centers read in.

The unit cell parameters are determined from the bulk density or molar volume of the solution, Nmol, and the choice of boundary conditions. If the density of the system is d g/ml , then

d (g/ml) = weight in g of Nmol / volume of Nmol in ml =

(Nmol/L)*Mol.wt / (Vol/1024)

where L is Avogadro's number. The cell parameter(s) can be obtained from the resulting Vol value.

III.02.05. Grand-canonical ensemble selection

GCEN[ Key1 Key2 Key3 Key4 Key5 Data ] Fdata

Potential precursors: LIMG
Variable to set to 'T' in the preprocessor: CB

Grand-canonical ensemble simulation using the cavity-biased method with grid-insertion or the original, random insertion method.

Key1 (GCE insertion strategy):

III.02.06. Stop GCE run at nx

STPX

Prerequisite: GCEN,

When this key is present the run will stop if the number of molecules reached nmolxp (read in with the key GCEN).

III.02.07. Tune the GCE B parameter

BTUN [Data]

Prerequisite: GCEN, MOND

Data: targetden, nmctunave, nmctunskip, chabmax, dentol

The program will periodically change the GCE B parameter to achieve the targeted density in the 'bulk' region - defined as the outside region by the MOND key.

III.02.08. Limit the cavity grid

LIMG [Key1,Data]

Key1:

III.02.09. Force solute reset with grids

PBGR

When this key is present, the grid calculations will reset each solute atom to the periodic cell before calculating the cover list. This is a safety option, since it is automatically set when the inital structure contains solute atoms outside the simulation cell.

III.02.10. Isothermal isobaric ensemble simulation (TPN)

IBEN Key1, Key2, Data

Variable to set to 'T' in the preprocessor: IB

Key1 (volume sampling technique):

Note, that putting the key IBEN after CNFI will result in ignoring the cell size on the .crd file and the vaulues inputted with the PBCN key will be used instead.

III.02.11. Initial configuration

CNFI Key1 [ Key2 Key3 Data ] [Fdata]

Prerequisites: FILE, SIZE, SLTA
Potential precursor: GCEN, TORD, CNFC PXYZ

Key1 (data source):

III.02.12. Simulation cell rescaling

SCAL Data

Data: e1, e2, e3

III.02.13. Overlap of the second solute copy with the first

OVST Data

Data: i1, i2, i3

III.02.14. Spread out the solute molecules for display

SPRD [Data]

Prerequisite: CNFI

Data:

This key generates a PDB file containing the solute molecules spread out in the x-y plane so that they don't overlap.

III.3. Free-energy calculation keywords

  1. FREE: Free energy options
  2. RAUS: AUS parameter change
  3. WMAT: PMF matching
  4. WPLT: AUS iteration plot
  5. TIQU: TI quadrature
  6. OVRA: Overlap ratio evaluation
III.03.01. Free energy options

FREE Key1, [Key2-4 Data Fdata]

Prerequisite: PBCN

Key1 (free-energy method choice):

III.03.02. Adaptive Umbrella Sampling parameter change

RAUS Fdata

Fdata:

III.03.03. PMF matching

WMAT Key1, Data, Fdata

This function joins the adaptive umbrella sampling estimates of the potential of mean force obtained from the checkpoint files of calculations on overlapping coupling parameter ranges (windows).

Key1:

Data for Key1=REGL: nruns, numrunr, incrnumrun

Data for Key1=LIST: nruns, numrun1, numrun2 ... III.03.04. AUS iteration plot

WPLT Key1, Data, [Data]

Key1 (scale for data points):

This key will produce a family of curves showing the PMF calculated from each iteration as lines, with circles representing the extent of sampling at each coupling parameter grid. The matched PMF is shown with 'x' symbols.

III.03.05. TI quadrature

TIQU [Data] Fdata

Data: numrunr, incrnumrun

This function performs the numerical (Gaussian) quadrature on thermodynamic integration runs. It prepares a plot of the polynomial fitting the quadrature ponts and, optionally, performs other numerical integrations.

III.03.06. Overlap ratio evaluation

OVRA Data

This function obtains the free energy from two overlap ratio method runs.

Data: numrun1, numrun2

III.4. Molecule descriptor keywords

  1. SLTA: Solute description
  2. SLVA: Solvent description
  3. CLON: Solute cloning
  4. MAKB: Make bonds
  5. BRKB: Break bonds
  6. MOLD: Define solute molecule limits
  7. TORD: Torsion angle definitions
  8. CSEG: CHARMM segment id handling
  9. LOOP: Loop torsion moves
III.04.01. Solute description

SLTA Key1 Key2 Key3, Data, [Data, Fdata]

Prerequisites: FILE, SVPT, PBCN, SVPT
Potential precursors: FREE, SLVA, PMOD, PROT, CHRG, BRKB, MAKB, CLON, MIXR, MOLD, MODA, FC14, FC14

Key1 (possible solute periodicity):

The program is currently prepared to handle atoms up to atomic number 86. The atomic number for lone-pair is 89 (in case of EPEN, it is disregarded) and atomic number 90 is an electron. Atomic numbers 91-93 have been reserved for the united atoms CH1-CH3, respectively. The various atomtypes for the different potential libraries implemented and the actual functional form of the potentials are specified in a separate document.

Important: For EPEN, the different types of centers have to be grouped together. Lone-pairs should follow nuclei and electrons should follow nuclei and lone-pairs.

When the coupling parameter is used to determine conformational transition (FREE PMF1), three sets of solute coordinates are used: the two conformations Ro and R1 and their combination CP*R1+(1-CP)*Ro. The solute description thus has to contain the coordinates, charges of the two conformations as well as room for the mixture. In this third set, only the atomtypes have to be given. When CNFI PMFN, however, only the Ro conformation is required here since the R1 conformation will be given at a different part of the input. It is important that the different descriptions of the conformation give the solute atoms in identical order. Solute atom type change during the transition can not involve EPEN.

III.04.02. Solvent description

SLVA [Key1 Data [Fdata]]

Prerequisite: SVPT.

Key1:

Pseudo-atoms are to be placed after the regular atoms.

The order of the (water) solvent centers is set by the program as follows. For MCY, O,H,H,LP, for TIPS, O,H,H; for TIPS2, O,H,H,Q.

Note: the dipole of the solvent molecule should be parallel to the X axis for the purpose of the near-neighbor solvent dipole correlation function computation (DSTC ALL).

III.04.03. Solute molecule cloning

CLON Data

Data: nclone, iclonef(1), iclonel(1), ncopy(1), ... iclonef(nclone), iclonel(nclone), ncopy(nclone)

III.04.04. Making additional bonds

MAKB Data

Data: nmake, imake1(1), imake2(1), ..., imake1(nmake), imake2(nmake)

Note that this key is not implemented for the case when cloning (CLON) is in effect.

III.04.05. Breaking bonds

BRKB Data

Data: nbreak, ibreak1(1), ibreak2(1), ... ibreak1(nbreak), ibreak2(nbreak)

Note that this key is not implemented for the case when cloning (CLON) is in effect.

III.04.06. Defining molecule limits

MOLD Data

Data: molslt, ilastm(1), ilastm(2), ...

This key overrides the molecular topology for defining the molecules within the solute. This is useful when the molecule is distored and too many incorrect bonds are found or too many bonds are missing. When cloning (CLON) is in effect, the definitions have to be repeated for all clones.

III.04.07. Torsion angle definitions

TORD Key1 Key2 Data Fdata

Prerequisite: SLTA, PRMF
Potential precursors: FC14, FREE, LOOP, PARD, SLTA

Key1 (torsion definiton source):