CHARMM 27 B1 Documentation files

CHARMM Element doc/ace.doc $Revision: 1.1 $


File: ACE, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

            Analytical Continuum Solvent (ACS) Potential

Purpose: calculate solvation free energy and forces based on a continuum description of the solvent, in particular the analytical continuum electrostatics (ACE) potential.

Please report problems to Michael Schaefer at schaefer@brel.u-strasbg.fr

WARNING: The module is still being developed and may change in future versions.

REFERENCES:
M. Schaefer & M. Karplus (1996) J. Phys. Chem. 100, 1578-1599.
M. Schaefer, C. Bartels & M. Karplus (1998) J. Mol. Biol., in press.

* Menu:

* Syntax::      Syntax of the ACE specifications
* Defaults::    Defaults and Recommended values
* Function::    Purpose of each of the specifications
* Examples::    Usage examples of the ACE module


File: ACE, Node: Syntax, Up: Top, Previous: Top, Next: Defaults

                          Syntax

[SYNTAX ACE functions]

Syntax:

The ACE specifications can be specified any time the nbond specification parser is invoked, e.g.,

    ENERgy [other-spec] [ace-spec]

    ace-spec::= [ ACE ] [ IEPS real ] [ SEPS real ] [ ALPHa real ] [ SIGMa real ]


File: ACE, Node: Defaults, Up: Top, Next: Function, Previous: Syntax

The defaults for the ACE potential are

    IEPS   1.0
    SEPS  80.0
    ALPHa  1.2
    SIGMa  3.0

In the current implementation, ACE should be used with united atom parameters, ALPHa set equal to 1.2 and the PARAM19 parameter file param19-1.2.inp; the param19-1.2.inp file includes a table of atom volumes at the end which is compatible with ALPHa 1.2.


File: ACE, Node: Function, Up: Top, Previous: Syntax, Next: Examples

0. Introduction

The analytical continuum solvent (ACS) potential is introduced to perform molecular dynamics/minimization calculations with a continuum approximation of the solvent.
Two solvent contributions to the effective (free) energy of a solute are included: the electrostatic solvation free energy, and the non-polar (i.e., non-electrostatic) solvation free energy. The first (electrostatic) contribution (G_el) is calculated using an analytical approximation to the solution of the Poisson-equation called ACE (from: analytical continuum electrostatics). The non-polar solvation free energy (G_np) is approximated by a pairwise potential which yields results that are very similar to the well-known surface area approximations of the hydrophobic (solvation) energy (e.g., Wesson and Eisenberg, Prot. Sci. 1 (1992), 227--235; see the ASP potential in CHARMM). Restriction: The ACE solvation potential has to be used together with no cutoff or with atom based switching. Compatibility: ACE can be used with BLOCK (but: the diagonal elements of the BLOCK matrix MUST NOT be zero). Meaning of the ACE parameters: 1. IEPS Dielectric constant for the space occupied by the atoms that are treated explicitly, e.g., the space occupied by the protein. 2. SEPS Dielectric constant for the space occupied by the solvent that is treated as a continuum (i.e., the complement of the space occupied by the protein). 3. ALPHa The volumes occupied by individual (protein) atoms are described by Gaussian density distributions. The factor ALPHa controls the width of these Gaussians. The net volume of the individual atom Gaussian distributions is defined in the volume table at the end of the parameter file param19-1.2.inp. The width of the atom volume distributions and the volume table have to be consistent -- currently, the volumes in the param19-1.2.inp file are optimal for an ALPHa of 1.2. Changing ALPHa without adapting the volume table is expected to reduce the precision of the results. 4. SIGMa The ACE solvation potential includes a hydrophobic contribution which is roughly proportional to the solvent accessible surface area. 
The factor SIGMa scales the hydrophobic contribution. For peptides with about 10-15 residues, a SIGMa factor of 3 results in hydrophobic contributions that are approximately equal to the solvent accessible surface area multiplied by 8 cal/(mol*A*A).


File: ACE, Node: Examples, Up: Top, Previous: Function, Next: Top

                          Examples

To set up simulations/minimizations with the ACE solvation potential, the following energy call is expected to be adequate in most situations:

    ENERgy ATOM ACE IEPS 1.0 SEPS 80.0 ALPHa 1.2 SIGMa 3 SWITch -
           VDIS VSWI CUTNB 13.0 CTONNB 8.0 CTOFNB 12.0

When you run molecular dynamics or minimization with ACE, you get two more lines in the log file printout with energy terms, e.g.,

DYNA DYN: Step         Time      TOTEner        TOTKe       ENERgy  TEMPerature
DYNA PROP:             GRMS      HFCTote        HFCKe       EHFCor        VIRKe
DYNA INTERN:          BONDs       ANGLes       UREY-b    DIHEdrals    IMPRopers
DYNA EXTERN:        VDWaals         ELEC       HBONds          ASP         USER
DYNA PRESS:            VIRE         VIRI       PRESSE       PRESSI       VOLUme
DYNA ACE1:      HYDRophobic         SELF    SCREENing      COULomb
DYNA ACE2:        SOLVation  INTERaction
 ----------       ---------    ---------    ---------    ---------    ---------
DYNA>        0      0.00000  -3423.29671      0.00000  -3423.29671      0.00000
DYNA PROP>          4.45310  -3423.12228      0.52327      0.17442   -532.70519
DYNA INTERN>        6.58717     60.43092      0.00000     56.00750      7.32144
DYNA EXTERN>     -380.26218  -3173.38156      0.00000      0.00000      0.00000
DYNA PRESS>         0.00000    355.13679      0.00000      0.00000      0.00000
DYNA ACE1>        109.04469  -3829.20991   2750.59427  -2203.81062
DYNA ACE2>      -1078.61564    546.78365
 ----------       ---------    ---------    ---------    ---------    ---------

and the same during minimization (MINI...).
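As a consistency check on the printout above, the relations given in the term descriptions of this node (SOLVation = SELF + SCREENing, INTERaction = SCREENing + COULomb, ELEC = HYDRophobic + SELF + SCREENing + COULomb) can be verified numerically. The following Python sketch is not part of CHARMM; the numbers are copied from the log excerpt, and the variable names and the tolerance are illustrative assumptions.

```python
# Values copied from the DYNA ACE1>, ACE2> and EXTERN> lines of the log excerpt.
ace1 = {
    "HYDRophobic": 109.04469,
    "SELF": -3829.20991,
    "SCREENing": 2750.59427,
    "COULomb": -2203.81062,
}
ace2 = {"SOLVation": -1078.61564, "INTERaction": 546.78365}
elec_logged = -3173.38156

# Relations as stated in the term descriptions of this node.
solvation = ace1["SELF"] + ace1["SCREENing"]
interaction = ace1["SCREENing"] + ace1["COULomb"]
elec = sum(ace1.values())

# Tolerance (an assumption) that absorbs rounding in the five printed decimals.
TOL = 1.0e-4


def close(a, b):
    """True if a and b agree to within the print precision of the log."""
    return abs(a - b) < TOL


print("SOLVation   ok:", close(solvation, ace2["SOLVation"]))
print("INTERaction ok:", close(interaction, ace2["INTERaction"]))
print("ELEC        ok:", close(elec, elec_logged))
```

The same check is a quick way to confirm which contributions are folded into ELEC when comparing runs with and without ACE.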
The terms in lines with ACE1 and ACE2 are:

HYDRophobic: Hydrophobic potential, equivalent to a surface based solvation term proportional to the sigma input parameter;

SELF: Self contribution to electrostatic solvation free energy, Delta-E_self, first term of eq(8) (i.e., sum over all atomic solvation energies, Delta-E_self_i, eq(28));

SCREENing: Interaction contribution to electrostatic solvation free energy, i.e., screening of Coulomb interactions, eq(38) (sum over all atom pairs, including bonded and 1-3 atom pairs!);

COULomb: Coulomb energy with constant dielectric of EPSI (sum over all atom pairs for the first term in eq(36) -- excluding bonded and 1-3 atom pairs, and 1-4 atom pair contributions scaled with E14FAC);

SOLVation: Electrostatic (!) solvation free energy, sum of SELF and SCREENing;

INTERaction: Electrostatic interaction, sum of SCREENing and COULomb (eq(36), but taking account of the bonded, 1-3, and 1-4 exclusion in the Coulomb term, see above).

The term "ELEC" in line "DYNA EXTERN>..." is the total of the electrostatic energy PLUS the hydrophobic solvation energy (may change this in the future to avoid confusion):

ELEC: Sum of HYDRophobic, SELF, SCREENing, COULomb.

Equation numbers refer to Schaefer & Karplus, J. Phys. Chem. 100 (1996), 1578.

See also: test cases c26test/ace1.inp and c26test/ace2.inp.

===============================================================================

CHARMM Element doc/adumb.doc 1.1


File: ADUMB, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

                Adaptive Umbrella Sampling Module

Setting up of adaptive umbrella potentials. Currently supported types of umbrella potentials are functions of dihedral angles and functions of the potential energy of the system (energy sampling).

WARNING: The module is still being developed and some details are likely to change in future versions.

Please report problems to Christian Bartels at cb@brel.u-strasbg.fr

REFERENCES:
C. Bartels & M. Karplus, J. Comp. Chem. 18 (1997) 1450-
C.
Bartels & M. Karplus, J. Phys. Chem. 102 (1998) 865-
M. Schaefer, C. Bartels, & M. Karplus, J. Mol. Biol. (1998)

* Menu:

* Syntax::      Syntax of the ADUMB commands
* Function::    Purpose of each of the commands
* Examples::    Usage examples of the ADUMB module


File: ADUMB, Node: Syntax, Up: Top, Previous: Top, Next: Function

                          Syntax

[SYNTAX ADUMB functions]

Syntax:

    ADUMb DIHE NRES int TRIG int POLY int 4X(atom-spec)

    ADUMb ENER NRES int TRIG int POLY int MAXE real MINE real [MAXT real] [MINT real]

    ADUMb INIT NSIM int [UPDA int] [EQUI int] [TEMP real] [AGIN real] [NEXT int]
               [THRE real] [UCUN int] [WUNI int] [RUNI int]

    ADUMb PROB UCUN int [TEMP real] [PUNI int] [TUNI int]

    ADUMb STON

    ADUMb STOFf

    where: atom-spec ::= { segid resid iupac }
                         { resnumber iupac   }


File: ADUMB, Node: Function, Up: Top, Previous: Syntax, Next: Examples

0. Introduction

The module provides commands to define degrees of freedom along which adaptive umbrella potentials are applied in molecular dynamics simulations. Statistics on the sampling of the degrees of freedom are recorded during the MD simulations and periodically used to update the umbrella potential such that uniform sampling of the degrees of freedom can be expected. Currently, dihedral angles and the potential energy are supported as degrees of freedom. If several degrees of freedom are defined, multidimensional adaptive umbrella sampling is performed.

Two sorts of input/output files are used by the module. The "umbrella" files contain the umbrella potentials that were used in the simulations together with the statistics of the sampling of the bins during the simulations. Based on this information the potential of mean force can be calculated and the umbrella potential expected to lead to uniform sampling can be determined. The second sort of files contains the values of the umbrella coordinates (=degree of freedom for adaptive umbrella sampling) for each time step in which coordinates were saved to the trajectory files.
The umbrella coordinates are normalized to the range 0 to 1, independent of the degrees of freedom used. From the umbrella coordinates saved, weighting factors can be calculated which are needed to calculate average properties of the unbiased system.

1. ADUMb DIHE

Define a dihedral angle as degree of freedom for adaptive umbrella sampling. To record the statistics the degree of freedom is partitioned into NRES bins. The umbrella potentials are represented as a linear combination of two times TRIG trigonometric functions and polynomial functions of degree 0 to POLY - 1. Repeating the command results in a multidimensional adaptive umbrella potential. The coordinates written to the umbrella coordinates file are normalized to the range 0 to 1 with 0 corresponding to -180 degrees and 1 corresponding to +180 degrees.

2. ADUMb ENER

Define the potential energy as degree of freedom for adaptive umbrella sampling. NRES, TRIG and POLY have the same meaning as in ADUMb DIHE. MINE and MAXE specify the potential energy range: Statistics on the sampling are recorded in the range MINE-0.5*(MAXE-MINE) to MAXE+0.5*(MAXE-MINE). In the range outside of MINE to MAXE the umbrella potential is kept constant to prevent the system from leaving the range in which statistics are recorded. MINT and MAXT (default values: 273 K and 1000 K, respectively) are minimal and maximal temperatures to restrict sampling in the relevant temperature range. To set up a system, get a rough estimate of the potential energy of the system at the desired MINT and MAXT (from short unbiased simulations at MINT and MAXT). Set MINE and MAXE to the values determined minus/plus a small tolerance, respectively. The coordinates written to the umbrella coordinates file are normalized to the range 0 to 1 with 0 corresponding to MINE-0.5*(MAXE-MINE) and 1 corresponding to MAXE+0.5*(MAXE-MINE).

3. ADUMb INIT

Defines or redefines the parameters for adaptive umbrella sampling and initializes the umbrella potential.
The umbrella potential is updated every UPDAte steps. After each update, no statistics are recorded for EQUI steps. For the remaining UPDA - EQUI steps, statistics on the sampling of the umbrella coordinates are recorded and stored separately from previous statistics and together with the umbrella potential active when recording the statistics. NSIM separate statistics can be kept in memory. If the number of updates performed in a run exceeds NSIM, the oldest statistics are discarded to make space for the most recent statistics. After each update the umbrella potential and the statistics are written to standard output (the log file). The written table contains, from left to right, the number of the bin, the number of integration time steps in which the system was in the bin since the last update, the potential of mean force calculated with the WHAM equations, the negative of the updated umbrella potential (potential of mean force modified to restrict sampling if necessary and fitted to the set of trigonometric and polynomial functions), the total number of times the bin was visited in the entire simulation, and the umbrella coordinates of the center of the bin. The temperature TEMP should be set to the temperature used in the simulations. It is used to calculate the umbrella potentials from the sampling statistics and to restrict sampling if potential energy sampling is performed. Umbrella coordinates are written to unit UCUN. At each update, the statistics are written to unit WUNI together with the umbrella potential active when recording the statistics. Statistics from previous runs can be read from unit RUNI. The statistics read must be from adaptive umbrella sampling simulations with the same parameters as the present one, in particular, the same degrees of freedom have to be used as umbrella coordinates. 
If adaptive umbrella sampling of the potential energy is used, umbrella potentials from runs at different temperatures can be read by repeating the ADUMb INIT command with RUNI set to the unit containing the statistics of each of the runs and TEMP set to the temperature of the run.

To define the umbrella potential of bins for which no statistics have been acquired so far, the umbrella potential has to be extrapolated. In the current implementation (might change in future implementations), the umbrella potential of the bins that were not sampled is set to the same value (ext-cons). To determine ext-cons, the potential of the bins that were sampled is linearly extrapolated for NEXT bins, and the maximal value (max-extrapolated) of the linearly extrapolated potentials is determined. Then, the minimal value (min-sampled) of the potentials of the bins that were sampled is determined and ext-cons is set to min-sampled or max-extrapolated, whichever value is smaller.

A few statistics that differ significantly from the rest of the statistics can be due to problems with the convergence caused by the extrapolation or due to the occurrence of rare events. In the former case, outliers should occur only in the first few simulations and it is advantageous to eliminate them. By default, the module eliminates statistics that differ from the averaged statistics by THRE times the average deviation. To prevent statistics from being eliminated, THRE has to be set to a value larger than NSIM.
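The THRE criterion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CHARMM implementation: it assumes that one deviation value is available per stored simulation, that the "average deviation" is the plain mean over all of them, and that a simulation is discarded when its deviation exceeds THRE times that mean. The function name `thre_flags` and the input values are hypothetical; the flags follow the 0 (used) and -1 (discarded) convention of the log output.

```python
def thre_flags(deviations, thre):
    """Flag each simulation: 0 = statistics used, -1 = statistics discarded.

    Assumption: a simulation is discarded when its deviation is larger than
    thre times the mean deviation over all simulations.
    """
    mean_dev = sum(deviations) / len(deviations)
    return [-1 if dev > thre * mean_dev else 0 for dev in deviations]


# Hypothetical deviations for five simulations, one of them an outlier.
deviations = [0.5, 0.3, 0.4, 0.2, 50.0]

print(thre_flags(deviations, thre=3.0))  # the outlier is flagged with -1
print(thre_flags(deviations, thre=6.0))  # thre > number of simulations: all kept
```

Under these assumptions no single non-negative deviation can exceed n times the mean of n deviations, which is consistent with the advice that a THRE larger than NSIM keeps all statistics.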
At each update, the deviations of the statistics from the averaged statistics are printed to standard output (log file), e.g.,

      0 Deviation of simulation     1 :   0.955
      0 Deviation of simulation     2 :   0.513E-01
      0 Deviation of simulation     3 :   0.787E-01
      0 Deviation of simulation     4 :   0.292
      0 Deviation of simulation     5 :   0.170
      0 Deviation of simulation     6 :   0.201
      0 Deviation of simulation     7 :   0.933
      0 Deviation of simulation     8 :   0.208
      0 Deviation of simulation     9 :   0.270
      0 Deviation of simulation    10 :   0.131
      0 Deviation of simulation    11 :   0.394
      0 Deviation of simulation    12 :    1.52
      0 Deviation of simulation    13 :   0.969
      0 Deviation of simulation    14 :   0.502
      0 Deviation of simulation    15 :    1.47
      0 Deviation of simulation    16 :    2.97
     -1 Deviation of simulation    17 :    210.
      0 Deviation of simulation    18 :   0.695E-01
      0 Deviation of simulation    19 :   0.160
      0 Deviation of simulation    20 :   0.450

The 0 or -1 on each line indicates whether the statistics of a particular simulation are used (0) or were discarded (-1) based on the THRE criterion.

For complex systems, there might exist no umbrella potential that enables the system to diffuse rapidly along the umbrella coordinate. In such cases it has been found to be advantageous to give a higher weight to the most recent statistics. This is implemented using the AGINg factor. For an umbrella potential calculated from n statistics, the i'th statistics (i=1,2,..,n) are weighted by AGINg**(n-i).

4. ADUMb PROB

Average properties of the unbiased system can be obtained by weighting the conformations of an adaptive umbrella sampling run by appropriate factors. The ADUMb PROB command calculates these weighting factors from the umbrella coordinates read from unit UCUN and writes them to unit PUNI. For the command to work the umbrella potentials and statistics from the run must have been read with the ADUMb INIT command.
If the potential energy was used as umbrella coordinate, TEMP specifies the temperature at which properties of the unbiased system should be calculated.

5. ADUMb STON
   ADUMb STOFf

By default statistics on the sampling of the umbrella coordinates are recorded in each call to the energy routines. The ADUMb STOFf command prevents statistics from being recorded. This might be useful when doing a minimization or running an MD simulation with an umbrella potential that should not change during the simulation.


File: ADUMB, Node: Examples, Up: Top, Previous: Function, Next: Top

                          Examples

These examples are meant to be a partial guide in setting up an input file for ADUMB. There are three test files, adumb-phichi.inp, adumb-enum.inp and ace2.inp.

Example (1)
-----------
Set up and run an adaptive umbrella sampling simulation using two dihedral angles as umbrella coordinates.

! define the phi and chi1 dihedral angle as the two umbrella coordinates
umbrella dihe nresol 36 trig 6 poly 1 pept 1 N pept 1 CA pept 1 CB pept 1 OG1
umbrella dihe nresol 36 trig 6 poly 1 pept 1 CY pept 1 N pept 1 CA pept 1 C

umbrella init nsim 100 update 10000 equi 1000 thresh 10 temp 300 -
              ucun 10 wuni 11

! perform adaptive umbrella sampling md simulation
dynamics nose tref 300 qref 20 start -
         nstep 20000 timestep 0.001 -
         ihbfrq 0 inbfrq 10 ilbfrq 5 -
         iseed 12 -
         nprint 1000 iprfreq 1000 -
         isvfrq 1000 iunwrite -1 iunread -1 -
         wmin 1.2

Example (2)
-----------
Set up and run an adaptive umbrella sampling simulation using the potential energy as umbrella coordinate (=energy sampling, multicanonical simulation, entropic sampling).

! set up umbrella; the range of relevant potential energies is assumed to
! extend from -50 kcal/mol to 100 kcal/mol.
umbrella ener nresol 200 trig 20 poly 5 mine -50 maxe 100.0 mint 280 maxt 2000

open write formatted unit 9 name @9enum.umb
open write formatted unit 10 name @9enum.uco
open write unformatted unit 11 name @9enum.cor
umbrella init nsim 100 update 10000 equi 1000 temp 1000 thres 100 -
              wuni 9 ucun 10

! energy sampling simulation
dynamics langevin start -
         nstep 50000 timestep 0.001 -
         inbfrq 10 ilbfrq 10 rbuffer 0.0 tbath 1000 -
         iseed 12 -
         nprint 1000 iprfreq 1000 -
         isvfrq 1000 iunwrite -1 iunread -1 -
         nsavc 100 iuncrd 11 -
         wmin 1.2

Example (3)
-----------
Determine the weighting factors to calculate properties of the unbiased system.

! define the umbrella coordinates
umbrella ener nresol 200 trig 20 poly 5 mine -50 maxe 100.0 mint 280 maxt 2000

open read formatted unit 10 name ../scr/@n.umb
umbrella init nsim 100 update 10000 equi 1000 runi 10 temp 1000 thres 200

! translate umbrella coordinates into probability factors at 300K
open read formatted unit 11 name ../scr/@n.uco
open write formatted unit 12 name ../scr/@nT300K.pfa
umbrella prob ucun 11 puni 12 temp 300

! translate umbrella coordinates into probability factors at 1000K
open read formatted unit 11 name ../scr/@n.uco
open write formatted unit 12 name ../scr/@nT1000K.pfa
umbrella prob ucun 11 puni 12 temp 1000

CHARMM Element doc/analys.doc 1.1


File: analys, Node: Top, Up: (chmdoc/commands.doc), Next: Description

                      Analysis Commands

The ANALysis command is an energy and structure analysis facility that has been developed to examine both static and dynamic properties. The current code allows energy partition analysis and energy contribution analysis from free energy simulations.
It can also produce a detailed printout of structural and energy term contributions for selected atoms.

* Menu:

* Description::     Description of analysis facility
* Energy::          Energy partitioning


File: analys, Node: Description, Up: Top, Previous: Top, Next: Energy

              Description of the ANALysis Command

Syntax:

    ANALys  { ON                                                       }
            { TERM { [ALL] } { NONBond     } [UNIT int] atom-selection }
            {      { ANY   } { [NONOnbond] }                           }
            { OFF                                                      }

    ON                   Enable energy partition analysis and disable FAST routines.
    OFF                  Disable analysis and restore FAST option defaults.
    TERM                 Set up energy term print data and disable FAST routines.
    ALL (default)        Print energy terms involving only selected atoms.
    ANY                  Print energy terms when any of the atoms is selected.
    NONBond              In addition to internal terms, also print nonbond terms.
    NONOnbond (default)  Do not print electrostatic and vdw energy data.
    UNIT integer         Write the energy term printout data to a formatted file.
                         Otherwise, write data to the output file.


File: analys, Node: Energy, Up: Top, Previous: Description, Next: Top

          Energy term option of the ANALysis Command

The ANALysis ON command enables energy partition analysis and disables the FAST routines. This will slow the calculation (especially on vector machines), but allows a detailed, atom-by-atom energy analysis. Every time the energy routine is invoked, the energy for each atom is stored in the ECONT array. During PERT dynamics, the EPCONT array is filled with the time-averaged energy difference on an atom-by-atom basis, including every step of dynamics. This allows the free energy differences to be analyzed based on atom contributions.

The energy partition array can be accessed with the SCALar ECONt commands. *note Econt:(chmdoc/scalar.doc). The sum of all of the elements of the ECONT array is usually the total energy, but some energy terms, such as extended electrostatics, will not be included. The command: SCALar ECONT STATistics can be used to check the total energy and the command SCALar EPCONT ....
can be used to examine atom contributions to energy differences for PERT.

The ANALysis TERM command will cause all selected energy terms to be printed to the specified output unit (default: standard CHARMM output). The ALL keyword (default) will list elements where all atoms are selected. The ANY keyword will cause terms including any selected atom to be listed. The NONOnbond keyword (default) suppresses the listing of vdw and electrostatic energy terms. The NONBond keyword additionally allows the analysis of vdw and electrostatic interactions.

The ANALys OFF command enables the FAST routines and disables the resetting of the ECONT array (i.e. the ECONT array will not change, but may still be accessed). This command also disables the energy term analysis.

CHARMM Element doc/aspenr.doc $Revision: 1.1 $


File: ASPENR, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

            Atomic Solvation Parameter Based Energy

Purpose: calculate solvation free energy and forces based on the exposed surface area of each atom using Atomic Solvation Parameters.

Please report problems to brbrooks@helix.nih.gov

REFERENCES:
M. Wesson and D. Eisenberg, 19??.

* Menu:

* Syntax::      Syntax of ASP input
* Structure::   Structure of the .surf file containing ASP data
* Examples::    Usage examples of the ASP module


File: ASPENR, Node: Syntax, Up: Top, Previous: Top, Next: Structure

                          Syntax

[SYNTAX ASP functions]

Syntax:

The ASP specifications can be given any time prior to an energy calculation and can be input either by reading a file or parsed directly off the command line, although the file route is more usual. Once turned on, the ASP energy term is in place during the course of the CHARMM run, i.e., it cannot be turned off except using the skipe command, see *note Skipe (chmdoc/energy.doc).
Reading surf file:

    open unit 1 read form name vap_to_wat_kd.surf
    read surf unit 1
    close unit 1


File: ASPENR, Node: Structure, Up: Top, Next: Examples, Previous: Syntax

This module computes solvation energies and forces based on the surface area model proposed by Wesson and Eisenberg, i.e.,

    E_solv = Sum (Gamma_i * ASA_i + Eref_i),

where Gamma_i is a parameter describing the free energy cost of burying atom i (in units of cal/mol/A^2), ASA_i is the surface area of atom i with radius RvdW_i and probe radius Rprobe, and Eref_i is a reference solvation energy. The analytic expressions for atomic surface areas and corresponding cartesian derivatives are used in these calculations. The values of the required parameters are read from a "surf" file which has the following syntax:

* file: vap_to_wat_kd.surf
*
! Note: These are asp's from Wolfendon water to vapor numbers,
!       adjusted for standard state by Kyte and Doolittle.
!       They are in units of cal/(mol*A**2).
! Table of ASP's
!
1.400000 ! the probe radius
!
! residue-type atom-name asp-value radius reference-area swap-pairs
ANY   H*      00.0   -1.0   0.0     ! ignore hydrogens
ANY   C       04.0   1.90   0.00
ANY   OT1   -112.0   1.40   0.00   OT2
ANY   OT2   -112.0   1.40   0.00   OT1
ANY   N     -112.0   1.70   0.00
 .    .        .      .      .
TRP   CZ2     04.0   1.90   0.00
TRP   CZ3     04.0   1.90   0.00
TRP   CH2     04.0   1.90   0.00
ASN   OD1   -112.0   1.40   0.00
ASP   OD1   -112.0   1.40   0.00   OD2
ASP   OD2   -166.0   1.40   0.00   OD1
END

Notes:
 - ANY refers to any residue type.
 - A negative radius causes the atom to be ignored (such as hydrogens,...).
 - Atom names can use CHARMM wildcard rules (residue names cannot).
 - These commands are processed sequentially. If an atom is matched by more than one line, the LAST line is used.
 - This file is free field format.

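The energy expression and the "last matching line wins" rule above can be illustrated with a short Python sketch. It is not the CHARMM implementation. The rows are copied from the excerpt above (elided rows are not reproduced), while the exposed areas (ASA_i) and reference energies (Eref_i) passed in are hypothetical inputs. Wildcard matching uses Python's `fnmatch`, which covers only the `*` case shown in the excerpt and stands in for the full CHARMM wildcard rules. Skipping atoms that match no line is an assumption.

```python
from fnmatch import fnmatchcase

# (residue-type, atom-name pattern, asp-value in cal/(mol*A**2), radius in A)
SURF_ROWS = [
    ("ANY", "H*", 0.0, -1.0),
    ("ANY", "C", 4.0, 1.90),
    ("ANY", "OT1", -112.0, 1.40),
    ("ANY", "OT2", -112.0, 1.40),
    ("ANY", "N", -112.0, 1.70),
    ("ASP", "OD1", -112.0, 1.40),
    ("ASP", "OD2", -166.0, 1.40),
]


def lookup(resname, atomname, rows=SURF_ROWS):
    """Return (asp, radius) from the LAST row matching the atom, else None."""
    hit = None
    for res, pattern, asp, radius in rows:
        if res in ("ANY", resname) and fnmatchcase(atomname, pattern):
            hit = (asp, radius)  # later rows override earlier ones
    return hit


def solvation_energy(atoms):
    """E_solv = Sum(Gamma_i * ASA_i + Eref_i) in cal/mol.

    atoms: iterable of (resname, atomname, asa, eref). Atoms with a negative
    radius are ignored; unmatched atoms are skipped (an assumption).
    """
    total = 0.0
    for resname, atomname, asa, eref in atoms:
        hit = lookup(resname, atomname)
        if hit is None or hit[1] < 0.0:
            continue
        total += hit[0] * asa + eref
    return total


# Hypothetical exposed areas (A**2) and zero reference energies.
atoms = [("ASP", "OD2", 10.0, 0.0), ("ASP", "C", 5.0, 0.0), ("ASP", "HN", 3.0, 0.0)]
print(solvation_energy(atoms))  # -166*10 + 4*5, hydrogen ignored
```

Because later rows override earlier ones, residue-specific lines belong after the generic ANY lines, as in the excerpt.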
File: ASPENR, Node: Examples, Up: Top, Previous: Structure, Next: Top

                          Examples

To set up energy calculations/simulations/minimizations with the ASP potential, the following call is expected to be adequate in most situations:

    open unit 1 read form name vap_to_wat_kd.surf
    read surf unit 1
    close unit 1

When you do an energy calculation, dynamics or minimization with ASP, you get columns in the log file printout with energy terms for ASP, e.g.,

ENER ENR:  Eval#     ENERgy      Delta-E         GRMS
ENER INTERN:          BONDs       ANGLes       UREY-b    DIHEdrals    IMPRopers
ENER EXTERN:        VDWaals         ELEC       HBONds          ASP         USER
 ----------       ---------    ---------    ---------    ---------    ---------
ENER>        0    -44.02560      7.74091      6.01738
ENER INTERN>        0.00000      0.04160      0.00000      0.00000      0.04556
ENER EXTERN>        5.95140    -42.32325      0.00000     -7.74091      0.00000
 ----------       ---------    ---------    ---------    ---------    ---------

and the same during minimization and dynamics.

See also: test cases c27test/aspenr.inp

CHARMM Element doc/block.doc 1.1


File: BLOCK, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

The commands described in this section are used to partition a molecular system into "blocks" and allow for the use of coefficients that scale the interaction energies (and forces) between these blocks. This has a number of applications, and specific commands to carry out free energy simulations with a component analysis scheme have been implemented. Lambda-dynamics, an alternative way of performing free energy calculations and screening binding molecules, has also been implemented. Subcommands related to BLOCK will be described here. To see how to output the results of a dynamics run, please see the DYNAMICS documentation (keywords are IUNLDM, NSAVL, and LDTITLE). Please refer to PDETAIL.DOC for a detailed description of lambda dynamics and its implementation.

BLOCK was recently modified so that it works with the IMAGE module of CHARMM. As some changes to the documentation were necessary anyway, it was decided to also improve the existing documentation.
The Syntax and Function sections below are relatively unchanged; the added documentation is in the Hints section (READ IT if you are using BLOCK for the first time!). Comments/suggestions to boresch@tammy.harvard.edu.

* Menu:

* Syntax::        Syntax of the block commands
* Function::      Purpose of each of the commands
* Hints::         Some further explanations/hints
* Limitations::   Some warnings...


File: BLOCK, Node: Syntax, Up: Top, Next: Function

                  Syntax of BLOCK commands

BLOCk [int]

Subcommands:

    miscellaneous-command-spec     ! see *note miscom:(chmdoc/miscom.doc).

    CALL int atom-selection
    LAMBda real
    COEFficient int int real -
        [BOND real] [ANGL real] [DIHEdral real] [ELEC real] [VDW real]
    NOFOrce
    FORCe
    FREE_energy_evaluation [OLDLambda real] [NEWLambda real] -
        FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int] -
        [TEMPerature real] [CONTinuous int] [IHBF int] [INBF int] [IMGF int]
    INITialize
    CLEAr
    Energy_AVeraGe [OLDLambda real] [NEWLambda real] -
        FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int] -
        [CONTinuous int] [IHBF int] [INBF int] [IMGF int]
    COMPonent_analysis DELL real NDEL int [TEMPerature real] -
        FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int] [IHBF int] [INBF int] [IMGF int]
    AVERage {DISTance int int} {STRUcture} [PERT] [TEMPerature real] [OLDLambda real] [NEWLambda real] -
        FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int]
    LDINitialize int real real real real
    RMBOnd
    RMANgle
    LDMAtrix
    LDBI int
    LDBV int int int int real real int
    LDRStart
    LDWRite IUNL int NSAVL int
    END


File: BLOCK, Node: Function, Up: Top, Previous: Syntax, Next: Hints

1) BLOCk [int] enters the block facility. The optional integer is only read when the block structure is initialized (usually the first call to block of a run) to specify the number of blocks for space allocation. If not specified, the default of three is assumed.

2) END exits the block facility. The assignment of blocks, the coefficient weighting of the energy function, the force/noforce option, etc. remain in place.
For the terms of the energy function that are supported, each call to ENERGY (either directly or through MINIMIZE, DYNAMICS, etc. commands) results in an energy and force weighted as specified. The matrix of interaction coefficients is printed upon exiting.

3) CALL removes the atoms specified by "atom-selection" from their current block and assigns them to the block number specified by the integer. Initially all atoms are assigned to block 1. If atoms are removed from any block other than block 1, a warning message is issued. If blocks are assigned such that some energy terms (theta, phi, or imphi) are interactions between more than two blocks, a warning is issued when the END command is encountered. (Take such warnings seriously; this is a severe error and indicates that something is wrong. However, the problem might not be the CALL statement (or the atom selection) itself; quite possibly your hybrid molecule was generated improperly.)

4) LAMBda sets the value of lambda to "real". This command is only valid when there are three blocks active. Otherwise multiple COEF commands may be used to set the interaction coefficients manually. LAMBda x is equivalent to (let y=1.0-x)

    COEF 1 1 1.0
    COEF 1 2 y
    COEF 1 3 x
    COEF 2 2 y
    COEF 2 3 0.0
    COEF 3 3 x

5) COEF sets the interaction coefficient between two blocks (represented by the integers) to a value (the real number). When the block facility is invoked, all of the atoms are initially assigned to block 1 and all interaction coefficients are set to one. The required real value (first specified) scales all energy terms except those specific terms which are named with alternative corresponding scale factors.

6) NOFOrce specifies that in subsequent energy calculations, the forces are not required. This is economical when using the post-processing commands (FREE,EAVG,COMP). Forces may be turned back on with the FORCe command; this is necessary before running minimizations and dynamics if there was a prior NOFO command.
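The equivalence between LAMBda and the six COEF commands stated in item 4 can be written out as a small Python sketch. It only tabulates the coefficients listed in that item; the function name `lambda_coefficients` and the dictionary layout keyed by block pairs are illustrative choices, not part of CHARMM.

```python
def lambda_coefficients(x):
    """Interaction coefficients set by 'LAMBda x' for three active blocks.

    Keys are (block_i, block_j) with i <= j; values follow the COEF list
    in item 4, with y = 1.0 - x.
    """
    y = 1.0 - x
    return {
        (1, 1): 1.0,  # environment with itself, always fully on
        (1, 2): y,    # environment with block 2
        (1, 3): x,    # environment with block 3
        (2, 2): y,    # block 2 with itself
        (2, 3): 0.0,  # blocks 2 and 3 never interact
        (3, 3): x,    # block 3 with itself
    }


# Print the equivalent COEF commands for lambda = 0.25.
for (i, j), value in sorted(lambda_coefficients(0.25).items()):
    print(f"COEF {i} {j} {value}")
```

At x = 0.0 only block 2 interacts with the environment and at x = 1.0 only block 3 does, which is the usual reading of the two end states.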
7) FREE calculates a free energy change using simple exponential averaging, i.e. the "exponential formula". If the old and new lambdas (OLDL,NEWL) are specified (can only be done when three blocks are active), the perturbation energy is calculated for these values (i.e. FREE gives you the free energy difference between NEWLambda and OLDLambda via perturbation from the lambda value at which your trajectory was calculated). If not, the current coefficient matrix is used (FREE should be used with three blocks, and the use of OLDL and NEWL is recommended). FIRSt_unit, NUNIt, BEGIn, STOP, and SKIP specify the trajectory/ies that is/are to be read (for a further description see the TRAJ command elsewhere in the CHARMM documentation). TEMPerature defaults to 300 K and gives the temperature value to be used in k_B*T. CONTinuous specifies the interval for writing cumulative free energies. A negative value causes binned (rather than cumulative average) values to be written. Be careful to make sure that you use correct non-bonded lists (see the hints section!).

8) INITialize is called automatically when the BLOCK facility is first entered and may also be called manually at some other point. All atoms are assigned to block one and all interaction coefficients are set to their initial value.

9) CLEAr removes all traces of the use of the BLOCK facility. The next command should generally be END, and then CHARMM will operate as if BLOCK had never been called.

10) EAVG
The average value of the potential energy during a simulation can be calculated with the EAVG (Energy_AVeraGe) command. The parsing is very much like the FREE command above. The most frequent use of this command is to calculate the average value of dV/dlambda during the course of a simulation for use in thermodynamic integration. CONTinuous specifies the interval for writing cumulative free energies. A negative value causes binned (rather than cumulative average) values to be written.
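For orientation, the exponential formula named under FREE and the notion of cumulative averages written at a fixed interval can be sketched in Python. This is a generic textbook illustration, not the CHARMM code. It assumes per-frame perturbation energies `delta_u` in kcal/mol, taken as the energy at the target coefficients minus the energy at the reference coefficients; the value used for the Boltzmann constant and the function names are assumptions made for this sketch.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K), assumed value


def free_energy_exp(delta_u, temperature=300.0):
    """Exponential formula: -kT * ln( < exp(-dU / kT) > ) over all frames."""
    kt = K_B * temperature
    # Shift by the minimum so the exponentials cannot overflow.
    shift = min(delta_u)
    total = sum(math.exp(-(du - shift) / kt) for du in delta_u)
    return shift - kt * math.log(total / len(delta_u))


def cumulative_free_energies(delta_u, interval, temperature=300.0):
    """Estimates from frames 1..k for k = interval, 2*interval, ..."""
    return [
        free_energy_exp(delta_u[:k], temperature)
        for k in range(interval, len(delta_u) + 1, interval)
    ]


# Hypothetical perturbation energies (kcal/mol) for eight trajectory frames.
delta_u = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.9]
print(free_energy_exp(delta_u))
print(cumulative_free_energies(delta_u, interval=4))
```

Watching how the cumulative estimate drifts with the number of frames is a simple convergence check; the estimate can never exceed the plain average of the perturbation energies.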
Be careful to make sure that you use correct non-bonded lists (see the hints section!). The command accepts the OLDL / NEWL option, similarly to FREE, but for EAVG it is recommended to set up the interaction matrix (using COEF commands) yourself -- see the hints section.

11) [COMP] The COMP module is essentially a modified version of the EAVG module which, aside from calculating the ensemble average <dU/dl> at a given value of lambda l(i), will also give you expectation values of this quantity at l(i+-1), l(i+-2) etc. based on perturbation theory. COMP requires 4 blocks. Put the usual WT (reactant) in block 2 and MUT (product) in block 3. Put the portion of the environment whose contribution to the free energy change is desired into block 4 (this can be everything else, or just a subset). (Note that the same can be achieved easily with the EAVG command.) You have to set up your own coefficient matrix. Much of the parsing is like the EAVG command. CONT is not supported. Two special subcommands (required) are DELL and NDEL. The normal output of COMP is evaluated at the lambda of the simulation. However, COMP also evaluates the same ensemble averages perturbed to lambda = lambda +/- {0,1,2,...NDEL}*DELL. This (sometimes) helps the quadrature in thermodynamic integration. Note that NDEL must be at least 1, and DELL should not be zero. (You have to specify these values; the default values will lead to an invalid input, i.e. you bomb...) Be careful to make sure that you use correct non-bonded lists (see the hints section!).

A word of warning: If your initial ensemble average (at the lambda of the simulation) is not well converged, then your perturbed values are most likely random numbers. The approach taken by COMP is theoretically sound, but it should only be applied if convergence has been established!

The output format of COMP is somewhat messy: COMP first prints <dU/dl> at
     lambda = lambda - NDEL*DELL
              lambda - (NDEL-1)*DELL
              ...
              lambda
              lambda + DELL
              ...
              lambda + NDEL*DELL;
then it prints an average (integral) value over these results. The meaning of this last value is unclear to me. In earlier versions of this documentation, COMP has been recommended over EAVG. In my experience the opposite is true. There is little COMP can do which you can't do with EAVG (aside from obtaining expectation values for dU/dl). (Maybe the unclear output of the COMP module is the main reason why I don't like it.)

12) [AVER] The AVERage command is used to extract ensemble average structural properties from a dynamics simulation. Features in this implementation allow averages taken over ensembles that are perturbed from that which the simulation corresponds to. This is particularly useful for calculating the average structure expected at lambda=0.0 from a simulation run at lambda=0.1, for example. Currently, one may calculate average structures [STRUcture] and average distances [DISTance int int; where the two integers are the atom numbers between which the average distance is requested]. The PERT keyword indicates that a perturbed ensemble from the dynamics trajectory is desired, with TEMPerature giving the temperature to use in the exponential for the perturbation (defaults to 300 K). OLDLambda and NEWLambda are the lambdas for which the simulation was run and for which the ensemble is requested, respectively (only valid if three blocks are active; if these are not specified, the perturbation energy is calculated with the current coefficient matrix), and the remaining keywords are used to specify the trajectory. NOTE: TO THE BEST OF MY KNOWLEDGE THIS COMMAND HAS NOT BEEN MAINTAINED (so you are on your own if you use it!)

13) LDINitialize specifies input parameters for running lambda dynamics. It sets up the value of lambda**2, the velocity of the lambda, its mass and reference free energy (or biasing potential).
E.g., the following input line sets up parameters for the third lambda with [lambda(3)]**2 = 0.4, lambdaV(3) = 0.0, lambdaM(3) = 20.0, and lambdaF(3) = 5.0 (note that lambdaF(1) should always be set to zero).

     LDIN 3 0.4 0.0 20.0 5.0

For more details, see Node Hints, section "lambda-dynamics simulations".

14) LDMAtrix will automatically map the input lambda**2 values onto the coefficient matrix of the interaction energies (and forces) between blocks.

15) LDBI provides an option for applying biasing potentials on lambda variables. The integer value specifies the total number of biasing potentials to be used. E.g.,

     LDBI 3

will include a total of 3 biasing potentials in the simulation.

16) LDBV sets up the specific form of the biasing potentials. At the moment, the functional form is a power law and three different classes are allowed (for details see the Hints node, section "Lambda-dynamics simulations"). The input format is

     LDBV INDEX I J CLASS REF CFORCE NPOWER

e.g.

     LDBV 2 2 3 3 0.0 50.0 4

will assign the second biasing potential to act between lambda(2) and lambda(3). The potential form belongs to the third class with a reference value of zero, a force constant of 50 kcal/mol and a power of four.

17) LDRStart is used to restart the lambda dynamics runs.

18) LDWRite specifies the FORTRAN output unit number and the frequency for writing the lambda histogram by assigning an integer to IUNL and an integer to NSAVL. (IUNL and NSAVL can be reset in the DYNAmics command, see *note dynamc:(chmdoc/dynamc.doc) )

19) RMBOnd and RMANgle are used when no scaling of bond and angle energy terms is desired.

END

File: BLOCK, Node: Hints, Up: Top, Previous: Function, Next: Limitations

A warning is in order: the BLOCK module is quite user-unfriendly, AND the user (=you) has to know what he/she is doing, otherwise you won't get anywhere!
(Of course, this could be a blessing in disguise.) There are two applications for BLOCK: (i) mere use as an energy partitioning facility, which may, e.g., be very helpful as an alternative to the INTEraction energy command, and (ii) use in free energy simulations. The focus here is on free energy applications. The following paragraphs assume that you are familiar with the theory of free energy difference simulations (e.g. Brooks et al. Advances in Chem. Physics, Vol. LXXI, 1988, chapter V); the emphasis here is to show how a rough tool such as BLOCK can be used to implement the theory in a program and (of course) how to use it.

Using BLOCK in order to calculate a free energy difference consists of two rather dissimilar parts (as far as practical problems are concerned):

(i) Run your system at various values of lambda and save trajectories.
(ii) Postprocess the trajectories with the FREE or the EAVG command (possibly COMP), and use the quantities which these modules give you to calculate the free energy difference.

(i) The actual simulations
==========================

It's probably easiest to use a concrete example, and the free energy difference between ethane and methanol in aqueous solution is used for that purpose. BLOCK is a so-called dual topology method (D. Pearlman, JPC 1994, 98, 1487), i.e. one has to duplicate any atom that is different with respect to any of its parameters. In the ethane/methanol case this means that you have to run with a solute which looks something like

     H1 \                  /H4
         \ C1E ---- C2-H5
     H2 = {       }        \H6
         / C1M --- OG
        /            \HG1
     H3

(and there is water.) Conceptually, this system is divided into three regions:

     environment: water, H1, H2, H3      (the region where nothing changes)
     reactant:    C1E, C2, H4, H5, H6    (ethane half)
     product:     C1M, OG, HG1           (methanol half),

where of course the role of reactant and product is interchangeable. The steps involved to start running dynamics are as follows:

(1) set up the hybrid (generate psf).
In principle straightforward, but there is a practical pitfall: The autogenerate angles and dihedrals option(s) may produce artificial dihedrals/angles between the two/three parts of the system, e.g. you don't want angles H1-C1E-OG etc. or dihedrals H3-C1M-C2-H4 etc. Also, make sure to specify nonbonded exclusions between the reactant and product part, otherwise you'll get endless distance warnings and may even bomb if two atom positions coincide.

(2) Place the hybrid into water (stochastic or periodic boundary conditions -- yes, IMAGE is now supported) as usual.

(3) Partition the system, i.e. enter BLOCK. The following script fragment will do the trick:

     block 3
     call 2 sele <reactant> end
     call 3 sele <product> end
     end

(the <reactant> and <product> selections have to be defined according to your system). BLOCK 3 initializes the block module with 3 blocks; all atoms are in block 1. The two CALL commands bring the reactant and product part of the system into block 2 and 3 respectively.

(4) Run the necessary MD simulations. Let's assume that you decide to use the following values of lambda, lambda = 0.1, 0.3, 0.5, 0.7, 0.9. You want to start your simulation at lambda = 0.1 and you have already partitioned your system as shown in (3). (This information is kept within the same script between calls to block, but it is not saved in restart files or the psf, i.e. you have to repeat this step (as well as step (3)) in every input file.) Enter block again, e.g.

     block
     lamb 0.1
     end

From now on interactions between the 3 blocks will be scaled according to the following matrix (lambda = l = 0.1 ==> 1-l = 0.9):

     block |   1      2      3
     ------|--------------------
       1   |  1.0    1-l     l
       2   |  1-l    1-l     0.
       3   |   l      0.     l

Please note that BLOCK will first calculate an interaction, then check to which block the two atoms belong and scale the energy (and forces) appropriately. Therefore, if the distance between 2 atoms is zero (e.g. in the ethane/methanol example I would define C1M and C1E on top of each other!)
then you need non-bonded exclusions, otherwise you encounter a division by 0 error! The LAMB command is a shortcut for a sequence of COEF commands; the following code fragment should be self-explanatory:

     block
     coef 1 1 1.0
     coef 1 2 0.9
     coef 1 3 0.1
     coef 2 2 0.9
     coef 2 3 0.0
     coef 3 3 0.1
     end

BLOCK only accepts and uses symmetric matrices, i.e. it doesn't matter whether you specify COEF 1 2 or COEF 2 1. Whenever you now call the energy routines, the energies/forces returned from them will be scaled according to the matrix you have set up. Minimizers and Dynamics can be used as always. So you are ready to run dynamics, and for argument's sake say that at every value of lambda you run 10,000 steps of equilibration and 20,000 steps of production (i.e. you save coordinates to trajectories). You don't need to save every step; every 5th to 20th step is probably more than enough. (If you saved every step you'd obtain highly correlated data, i.e. you have larger trajectories, but you won't gain anything in terms of convergence.)

(ii) Post-processing -- how to obtain a free energy difference
==============================================================

At this point in our example, you would have five trajectories corresponding to lambda = 0.1, 0.3, ..., 0.9. The BLOCK module now has to be used to obtain the average quantities you need for either the exponential formula (FREE) or thermodynamic integration (EAVG,COMP) from the trajectories you generated in step (i).

(1) At this point, some issues regarding the non-bonded list have to be considered. No special considerations were necessary while running dynamics (aside from having some non-bonded exclusions where necessary); you just set up list updates as usual. During post-processing there are two considerations: (a) efficiency -- you just want to calculate the necessary subset of interactions (otherwise your post-processing run will take about as much time as the simulation itself), and (b) proper list-updating.
(a) Efficiency: In none of the post-processing routines do you need the interactions between particles that belong to the environment; therefore you should avoid calculating them. This can be done easily by specifying

     cons fix sele <environment> end

Note that this is not necessary, but it will reduce the CPU time necessary from hours to minutes (and results are identical!). However, if you had atoms belonging to reactant or product or both FIXed during the simulations in step (i), you MUST NOT FIX them now; otherwise you'll omit contributions.

(b) List updating: While the efficiency considerations in principle are optional, you have to follow one of the two strategies below, otherwise you'll get erroneous results. If you used IMAGE, you have to use the second protocol!

Originally, the BLOCK post-processing commands would not do any list updating. This meant that you had to have a nonbonded list which would include all possible interactions before starting post-processing -- don't forget that you post-process over, e.g., 20 ps and particles will move quite far. You can easily create such a nonbonded list by specifying a CUTNB value of, e.g., 99. or 999. Ang (surely, all possible interactions will be included). A CHARMM script looks approximately as follows:

     !set up system (psf, initial coordinates)
     block
     !partition system
     end
     cons fix sele <environment> end
     ==> energy cutnb 99.
     !open trajectories
     block
     !postprocessing
     end

In this case, do not use the inbf, ihbf and imgf options of the post-processing commands; they will default to 0, i.e. no update.

This approach, however, CANNOT work with IMAGES! Proper use of IMAGEs requires that the minimum image convention is checked periodically, i.e. particles have to be repartitioned between primary and image region. As the BLOCK post-processing commands now understand INBF, IHBF and IMGF, this doesn't pose a problem.
However, the automated update is not supported (if you specify a negative value, you'll get a mild warning and the system will default to +1), and I recommend that you use 1 for all frequencies (don't forget, the frames in your trajectory are several steps apart, i.e. in general an update may be necessary). The above scheme now looks like:

     !set up system (psf, initial coordinates)
     block
     !partition system
     end
     cons fix sele <environment> end
     ! set up images if needed
     ==> energy
     !open trajectories
     block
     eavg inbf 1 ihbf ? (imgf 1)
     end

Unless you have explicit hbond terms, ihbf can of course be 0! (Please note that there may or may not be problems with CRYSTAL, see Limitations section.)

(2) The actual post-processing commands. In the following I'll explain how to set things up for FREE, EAVG and COMP (as well as why). To speed up things further, you'll also want to specify the NOFOrce option at some point.

FREE: This module allows you to calculate the necessary ensemble average for the exponential formula. Using our example, you can for example estimate the free energy difference between l=0.1 (a value at which you ran a trajectory) and l=0.0, or, based on your l=0.1 trajectory, the free energy difference between l=0.0 and 0.2 (double wide sampling), i.e.

     A(0.0)-A(0.1) = -k_B*T*ln <exp[-(V(0.0)-V(0.1))/k_B*T]>_(l=0.1)

or

     A(0.2)-A(0.0) = -k_B*T*ln <exp[-(V(0.2)-V(0.0))/k_B*T]>_(l=0.1)

You should set up your system with 3 blocks and the usual environment, reactant and product partitions. Before entering block to issue the free command, you have to open the trajectory/ies.

     ! all the stuff shown above for non-bond lists
     open file unit 10 read name dat01.trj
     block
     free oldl 0.1 newl 0.0 first 10 nunit 1 [temp 300. -
          inbf 1 imgf 1]
     end

or, for double wide sampling, the free line would be replaced by

     free oldl 0.0 newl 0.2 first 10 nunit 1 [temp 300. -
          inbf 1 imgf 1]

Here dat01.trj is the trajectory which contains your 20 ps of dynamics at lambda = 0.1.
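As an aside, the arithmetic behind the exponential formula can be sketched outside of CHARMM. The following is only an illustration in Python, not what FREE does internally: the function name, the value used for k_B, and the per-frame energy differences are assumptions made up for this sketch. It evaluates -k_B*T*ln<exp(-dV/k_B*T)> over a list of per-frame values of dV = V(newl) - V(oldl).

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); value assumed for this sketch


def exponential_average(delta_v, temperature=300.0):
    """Return -k_B*T*ln<exp(-dV/k_B*T)> for per-frame energy differences dV (kcal/mol)."""
    kt = K_B * temperature
    # Subtract the smallest dV before exponentiating to avoid overflow;
    # it is added back analytically in the return statement.
    shift = min(delta_v)
    mean = sum(math.exp(-(dv - shift) / kt) for dv in delta_v) / len(delta_v)
    return shift - kt * math.log(mean)


# Hypothetical per-frame values of V(newl) - V(oldl), in kcal/mol.
frames = [0.5, 0.7, 0.4, 0.6]
print(exponential_average(frames))
```

Because low-energy frames dominate the exponential average, the result can never exceed the plain arithmetic mean of the dV values; this is one reason poorly sampled windows give unreliable estimates.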
Based on the oldl/newl values (which correspond to A(newl) - A(oldl)), FREE generates the appropriate interaction matrix, which it prints; I recommend that you try to understand why it generates this matrix! FIRST is the unit number of the first trajectory file (10 in our example), NUNIT is the number of trajectories (1 in our example). These (and the other options regarding the trajectories) work as in any other post-processing command in CHARMM, see e.g. the TRAJ command. The update frequencies are optional depending on how you decided to handle your non-bonded updates. temp defaults to T=300 K, cf. equations above. If you specify CONT +n, you'll get a cumulative average every n steps; in this case the last value equals the final result. If you specify CONT -n, you'll get the average over every n frames, plus of course the final result at the end. Note that trajectories are not rewound after use; i.e. before any subsequent FREE (or EAVG,COMP) command you have to rewind (or reopen) them! Once you have all the free energy pieces you need, you simply add them up to obtain the free energy difference (beware of sign errors depending on how you defined oldl/newl).

EAVG: The main use of this module lies in obtaining the required ensemble averages for thermodynamic integration. The most significant difference to FREE is that you have to specify your own interaction matrix. BLOCK uses linear coupling in lambda in the potential energy function, i.e.

     V(l) = V0 + (1-l)*V_reac + l*V_prod,

where V0 contains all the intra-environment terms, V_reac are the intra-reactant and reactant-env. interactions, and V_prod are the intra-product and product-env. interactions, respectively. The quantity of interest in TI is dV/dl; for the above potential energy function we have

     dV/dl = V_prod - V_reac

It's very easy to obtain this quantity from EAVG. Use 3 blocks, partition the system as before.
!
all the stuff shown above for non-bond lists
     open file unit 10 read name dat01.trj
     block
     coef 1 1 0.
     coef 1 2 -1.
     coef 2 2 -1.
     coef 1 3 1.
     coef 2 3 0.
     coef 3 3 1.
     eavg first 10 nunit 1 [inbf 1 imgf 1 cont +-n]
     end

You will calculate the average interaction energy over all the frames in the trajectory according to the following (symmetric) matrix

      0.0
     -1.0  -1.0
      1.0   0.0   1.0;

i.e. it's easy to see that the above script will give you <dV/dl>_(l=0.1). If you post-process the other trajectories (l=0.3, 0.5, ..,0.9) in an analogous fashion, you just have to approximate the TI integral by a simple quadrature formula (for basic Newton Cotes formulae (open and closed) see, e.g., Numerical Recipes). With equally spaced lambda values that avoid the end points, the formula below is the midpoint rule (an open formula) rather than the trapezoidal rule; in this case you would have

     dA = 0.2 * (dV(0.1)+dV(0.3)+...+dV(0.9)),

where dV(0.1) = <dV/dl>_(l=0.1), etc.

The above is an example of the basic use of EAVG. You automatically get the formal components according to interaction type. Cont +-n works similarly to the FREE case. If you wanted to exclude the intramolecular contributions from ethane and methanol you could set up a slightly different coefficient matrix, i.e.

     coef 1 1 0.
     coef 1 2 -1.
     coef 2 2 0.
     coef 1 3 1.
     coef 2 3 0.
     coef 3 3 0.

and you'll get only the solute-solvent contributions. You can use more blocks (m > 3) to extract only a subset of interactions, e.g.

     block 1: environment - x
     block 2: reactant
     block 3: product
     block 4: x,

where x is the region of interest, e.g. a specific sidechain in a protein (but not the one that is mutated!). Using EAVG with an appropriate coefficient matrix, e.g.

     coef 1 1 0.
     coef 1 2 0.
     coef 1 3 0.
     coef 1 4 0.
     coef 2 2 0.
     coef 2 3 0.
     coef 2 4 -1.
     coef 3 3 0.
     coef 3 4 1.
     coef 4 4 0.

will give you (after integration over lambda) the free energy contribution of the interaction of sidechain x with the mutation site. Note that such formal free energy components may be (strongly) path-dependent. These last two examples have hopefully provided a flavor of what can be done with the EAVG module.
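The integration step for the basic EAVG example (dA = 0.2 times the sum of the five window averages) can be sketched as follows. This is a minimal Python illustration, assuming equally spaced windows centered at 0.1, 0.3, ..., 0.9; the function name and the numerical <dV/dl> values are hypothetical and are not output of any real run.

```python
def ti_free_energy(lambdas, dv_dl):
    """Estimate dA from window averages <dV/dl>, one per equally spaced lambda window.

    Each window is assumed to sit at the center of an interval of width
    1/len(lambdas), so that the windows together cover lambda = 0 ... 1.
    """
    if len(lambdas) != len(dv_dl):
        raise ValueError("need one <dV/dl> value per lambda")
    width = 1.0 / len(lambdas)
    for i, lam in enumerate(lambdas):
        # Check that every lambda really is the center of its interval.
        if abs(lam - (i + 0.5) * width) > 1e-9:
            raise ValueError("lambda values are not centered, equally spaced windows")
    return width * sum(dv_dl)


# Hypothetical window averages in kcal/mol, for illustration only.
lambdas = [0.1, 0.3, 0.5, 0.7, 0.9]
averages = [4.2, 3.1, 2.4, 1.6, 0.9]
print(ti_free_energy(lambdas, averages))
```

The explicit check on the lambda values is a design choice of this sketch: the 0.2 weight is only correct for this particular spacing, so a different choice of windows should use different quadrature weights.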
COMP: This module is also used for thermodynamic integration. It always operates with four (and only four) blocks, just as the advanced example last given for EAVG, so it facilitates COMPonent analysis. Here I want to focus on the second unique aspect of COMP, its capability to extrapolate additional datapoints, and so I consider in the framework of our ethane/methanol example the "special" case where I want the total free energy difference (as before in EAVG). In order to do this, the system needs to be partitioned as follows

     block 1: --
     block 2: reactant
     block 3: product
     block 4: environment

Whereas EAVG gave us <dV/dl>_l only for those lambda values at which we had actually done the simulations, COMP gives us additional values via perturbation (see Bruce Tidor's thesis). Using

     ! all the stuff shown above for non-bond lists
     open file unit 10 read name dat01.trj
     block
     coef 1 1 0.
     coef 1 2 0.
     coef 1 3 0.
     coef 1 4 0.
     coef 2 2 -1.
     coef 2 3 0.
     coef 2 4 -1.
     coef 3 3 1.
     coef 3 4 1.
     coef 4 4 0.
     comp first 10 nunit 1 [inbf 1 imgf 1] dell 0.06667 ndel 1
     end

will now give us <dV/dl>_l at l=0.03333, l=0.1 and l=0.16667. If we use the same script on the other trajectories, we have 15 instead of 5 datapoints for the integration, i.e. we can obtain dA as

     dA = 0.06667 * (dV'(0.03333)+dV(0.1)+...+dV'(0.96667)),

where dV(0.1) = <dV/dl>_(l=0.1), etc. and the ' indicates that this is a perturbed quantity. In principle, this should give a better numerical integration; however, in practice everything depends on how well your actual data (l=0.1, 0.3, ...,0.9) are converged. There is no check whether your ndel/dell combination is meaningful, and you cannot run COMP without using the perturbation feature, i.e. NDEL should be set to at least 1 (valid values are 1 through 5). The defaults (if you don't specify ndel/dell) lead to an invalid input. (This should be fixed...)
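To see which lambda values a given DELL/NDEL combination produces, the grid lambda +/- {0,1,...,NDEL}*DELL can be tabulated beforehand. The short Python sketch below does only this bookkeeping; the function name is hypothetical, and the range check simply mirrors the statement above that valid NDEL values are 1 through 5.

```python
def comp_lambdas(lam, dell, ndel):
    """List the lambda values lam + k*dell for k = -ndel ... +ndel."""
    if not 1 <= ndel <= 5:
        raise ValueError("NDEL must be between 1 and 5")
    if dell == 0.0:
        raise ValueError("DELL must not be zero")
    return [lam + k * dell for k in range(-ndel, ndel + 1)]


# The five simulation windows of the ethane/methanol example.
windows = [0.1, 0.3, 0.5, 0.7, 0.9]
grid = [point for lam in windows for point in comp_lambdas(lam, 0.06667, 1)]

print(len(grid))
print([round(point, 5) for point in grid[:3]])
```

With DELL = 0.06667 and NDEL = 1 the three points per window are spaced evenly across all five windows, which is what makes the single weight 0.06667 in the dA formula appropriate.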
(iii) Lambda-dynamics simulations
=================================

In an effort to make the transition from using the previous subcommands to running lambda dynamics as smooth as possible, we purposely modeled the new syntax on the COEF subcommand. There are a total of eight new keywords for setting up the new dynamics. They are classified according to their functionality.

(a) LDINitialize and LDMAtrix

These two keywords are the basic commands for starting a lambda dynamics run. Their correct use is tied together with the BLOCK and CALL commands. Using the same example as the one given in "the actual simulations", the input script fragment will be as follows:

     block 3
     call 2 sele <reactant> end
     call 3 sele <product> end
     LDIN 1 1.0 0.0 20.0 0.0
     LDIN 2 0.9 0.0 20.0 0.0
     LDIN 3 0.1 0.0 20.0 0.0
     LDMA
     end

Here, the LDINitialize command is modeled after the COEF command with the format

     LDIN INDEX LAMBDA**2 LAMBDAV LAMBDAM LAMBDAF

Several comments are in order. First, notice that [lambda(1)**2] = 1.0 and [lambda(2)]**2 + [lambda(3)]**2 = 1.0. They are quite similar to the inputs of the COEF subcommand. However, since one index instead of a pair is required here, only diagonal elements of the interaction coefficient matrix are specified. To fill up the matrix, LDMA is provided to finish the job automatically. In general, if there is a total of N blocks, the first one is by default assumed to be the region where nothing changes. Therefore, [lambda(1)**2] = 1.0 is always true. The condition

       N
      ____
      \
      /     [lambda(i)**2] = 1.0          (1)
      ----
      i = 2

has to be satisfied for the partition of the system Hamiltonian. Due to some technical reasons in our implementation (for details see PDETAIL.DOC), we have used [lambda(i)**2] instead of lambda(i) in our partition of the system Hamiltonian.
Next, to make sure the above condition is met at any given simulation step, we have also enforced a condition containing the velocities of the lambda variables

       N
      ____
      \
      /     lambda(i)*lambdaV(i) = 0.0    (2)
      ----
      i = 2

We used lambdaV(i) = 0.0 in the above script just to simplify the input. As far as the mass parameter lambdaM is concerned, the minimum requirement is that the value of the mass has to be chosen such that the time step (or frequency) of the lambda variables is consistent with that used for the spatial coordinates x, y, z. Since the lambda variable is introduced into the system by using an extended Lagrangian, the considerations that go into similar quantities, such as the adjustable parameter Q in a Nose thermostat, are applicable to the choice of lambdaM. Some crude estimate can be made by examining the derivative of the system Hamiltonian with respect to lambda, the curvature (simple harmonic approximation) or the energy difference between the two end-point states (0 and 1). Our experience has indicated that a conservative choice of the mass, i.e. a slightly heavier mass than that of the crude estimate, has served us well so far.

The biasing potential LAMBDAF has two functions: (1) In screening calculations LAMBDAF corresponds to the free energy difference of the ligands in the unbound state. Such calculations can identify ligands with favorable binding free energy, and a ranking of the ligands can be obtained from the probability of each ligand in the lambda=1 state. (2) In precise free energy calculations, LAMBDAF corresponds to the best estimate of the free energy from previous calculations. Therefore the estimate of the free energy can be improved iteratively.

(b) LDBI and LDBV

In order to provide better control over simulation efficiency and sampling space, an option of applying biasing (or umbrella) potentials is furnished. LDBI specifies how many biasing potentials will be applied and LDBV supplies all the details.
The general input format is

     LDBV INDEX I J CLASS REF CFORCE NPOWER

Let us look at the following script

     block
     LDBI 3
     LDBV 1 2 2 1 0.2 40.0 2
     LDBV 2 3 3 2 0.6 50.0 2
     LDBV 3 2 3 3 0.0 20.0 2
     end

It states that there is a total of 3 biasing potentials. The first one (INDEX = 1) is acting on lambda(2) itself (I = J = 2), the second one on lambda(3), and the third one is coupling lambda(2) and lambda(3) together. At the moment, three different classes of functional forms are supported:

     CLASS 1:
               __
              |  CFORCE*(lambda - REF)**NPOWER    if lambda < REF
          V = |
              |  0                                otherwise
              |__

     CLASS 2:
               __
              |  CFORCE*(lambda - REF)**NPOWER    if lambda > REF
          V = |
              |  0                                otherwise
              |__

     CLASS 3:

          V = CFORCE*[lambda(I) - lambda(J)]**NPOWER

(c) LDRStart

LDRStart is used only if for some reason, e.g. execution of the EXIT command, the logical variable QLDM for the lambda dynamics has been set to false. In this case, to restart the dynamics, LDRStart can be used to reset QLDM = .TRUE.. However, if LDIN is also being used in restarting the dynamics, it will automatically reset QLDM. Therefore, LDRS does not need to be called in this case.

(d) LDWRite

LDWRite provides specifications for writing out lambda dynamics data, i.e. the histogram of the lambda variables, the biasing potential etc. The integer variable IUNLdm is the FORTRAN unit on which the output data (unformatted) are to be saved. The value of the integer NSAVL sets the step frequency for writing lambda histograms. IUNLdm defaults to -1 and NSAVL defaults to 0. Both IUNLdm and NSAVL can be reset in the DYNAmics command (please refer to *note dynamc:(chmdoc/dynamc.doc) for details). The following script will set IUNLdm to unit number 8 and NSAVL equal to 10:

     LDWRite IUNL 8 NSAVL 10

(e) RMBOnd and RMANgle

Since each energy term is scaled by lambda, RMBOnd and RMANgle can prevent bond breaking caused by such scaling during dynamics simulations. Alternatively one can fix bonds (and angles) using SHAKE, but this is not always possible.
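Returning to the three LDBV classes of section (b): the functional forms can be written out as a small function, which is handy for checking what a given REF/CFORCE/NPOWER choice does before a run. This Python sketch follows the formulas exactly as printed in this documentation; the function name and argument order are made up for the illustration, and whether CHARMM evaluates the potential on lambda or on lambda**2 internally is not addressed here.

```python
def bias_potential(cls, ref, cforce, npower, lam_i, lam_j=None):
    """Evaluate one LDBV-style biasing potential for the given class (1, 2 or 3)."""
    if cls == 1:
        # Class 1: active only while lambda is below the reference value.
        return cforce * (lam_i - ref) ** npower if lam_i < ref else 0.0
    if cls == 2:
        # Class 2: active only while lambda is above the reference value.
        return cforce * (lam_i - ref) ** npower if lam_i > ref else 0.0
    if cls == 3:
        # Class 3: couples two lambda variables; REF is not used.
        if lam_j is None:
            raise ValueError("class 3 needs a second lambda value")
        return cforce * (lam_i - lam_j) ** npower
    raise ValueError("CLASS must be 1, 2 or 3")


# Parameters taken from the three LDBV lines of the example script in (b);
# the lambda values themselves are arbitrary test inputs.
print(bias_potential(1, 0.2, 40.0, 2, 0.1))
print(bias_potential(2, 0.6, 50.0, 2, 0.5))
print(bias_potential(3, 0.0, 20.0, 2, 0.3, 0.1))
```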
END

File: BLOCK, Node: Limitations, Up: Top, Previous: Hints

(1) Please be advised (again) that the AVERage command is unsupported, and I would not be surprised if it does not work (anymore). Unless someone who understands this module better than I do maintains it, I recommend that we remove it.

(2) BLOCK now coexists with IMAGE "peacefully" and essentially transparently to the user. It works correctly for the case of a periodic water-box (cf. the block3.inp testcase). I would, however, check carefully whether things really work before I would use it on something fancier like infinite alpha helices. Similarly, it is not clear to me whether things work with the CRYSTAL facility. If one modifies block3 so as to use CRYSTAL instead of IMAGE, things (seem to) work. On the other hand, I know that I didn't support XTLFRQ in the post-processing routines as I don't understand its meaning. I'll fix things if someone is willing to help me with the bits and pieces I don't understand.

(3) Bond and bond angle terms (including Urey-Bradleys). Be advised that if you run a simulation at lambda = 0 or lambda = 1 you may effectively remove bond (and bond angle) terms as they get scaled by zero. In other words, you would have ghost particles that can move freely through your system, and this leads to all sorts of nasty side-effects. Furthermore, this approach is not sound theoretically (S. Boresch & M. Karplus, unpublished). So in general, avoid running at lambda = 0 and 1. If you have your bonds constrained you're safe as the constraint will keep things together (that won't take care of angles however!). In order to avoid artifacts from noisy, diverging bond and bond angle contributions, throw them out during post-processing, e.g. by using the SKIP BOND ANGL UREY command before starting block post-processing. If you want to see what can go wrong, look at the block2 test-case...
CHARMM Element doc/cadpac.doc $Revision: 1.2 $

File: Cadpac, Node: Top, Up: (chmdoc/commands.doc), Next: Description

Combined Quantum Mechanical and Molecular Mechanics Method Based on CADPAC in CHARMM

by Paul Lyne paul@tammy.harvard.edu

* Menu:
* Description:: Description of the CADPAC commands
* Using:: How to run CADPAC in CHARMM
* Installation:: How to install CADPAC in CHARMM environment
* Status:: Status of the interface code

File: Cadpac, Node: Description, Up: Top, Next: Usage, Previous: Top

The CADPAC QM potential is initialized with the CADPac command.

[SYNTAX CADPac]

CADPac [REMOve] [EXGRoup] (atom selection)

REMOve: Classical energies within QM atoms are removed.
EXGRoup: QM/MM Electrostatics for link host groups removed.

The syntax of the CADPAC command in CHARMM follows closely that of the GAMESS command.

File: Cadpac, Node: Usage, Up: Top, Next: Status, Previous: Description

For complete information about CADPAC input see Chapter 1 in the CADPAC distribution. A QM-MM job using CADPAC needs four input files. The first is the normal CHARMM input file containing the CADPac command. The second file is the CADPAC input file specifying the basis set to be used and the Hamiltonian that is needed. The third and fourth files are libfil.dat and modpot.dat respectively. These are the library and model potential files that are supplied with CADPAC.

Cadpac Input File
-----------------

For the CADPAC input file the following cards must be present: TITLE, BASIS, ATOMS, RUNTYP, START, FINISH.

TITLE: The keyword is always at the start of the input file and is followed by a one-line title on the next line of the input.

BASIS: This describes the basis set to be used for the QM region if a generic basis set is required. Examples include STO3G,321G,631G,321G*,631G*. These are the most common. Other basis sets are described in the CADPAC documentation. It is also possible to run a calculation using specific basis sets for individual atoms.
If this feature is required then the BASIS keyword should be omitted and the LIBRARY keyword is used for each atom in the QM region. For a more detailed description of the library command please refer to the official CADPAC documentation. All the basis sets that are supported by CADPAC are found in the files libfil.dat and modpot.dat.

ATOMS: This keyword is always required.

RUNTYP: For the purposes of QM-MM calculations this will either be ENERGY for a single point calculation or GRADIENT if the forces are also required. For any minimization or dynamics calculations the GRADIENT keyword should be used.

START: This keyword is always required.

FINISH: This keyword is always required.

Hamiltonians
------------

The Hamiltonian is HF unless otherwise specified. The Hamiltonian can be changed by inserting the appropriate keyword after the RUNTYP key. For example

     MP2    Performs an MP2 calculation
     MP3    Performs an MP3 calculation
     CI     Performs a Configuration Interaction calculation
            (please refer to the official CADPAC manual)

For DFT calculations use the KOHNSHAM keyword:

     KOHNSHAM LDA MEDIUM GRDWT
            Performs an LDA calculation with a medium sized grid for numerical quadrature.
     KOHNSHAM BLYP LARGE GRDWT
            Performs a non-local BLYP calculation with a large sized grid

For other functionals see the official CADPAC manual.

CADPAC I/O
----------

CADPAC has hard wired units 1, 2 and 3 for the libfil.dat, modpot.dat and CADPAC input files, so avoid using these elsewhere in the CHARMM stream. Other units that CADPAC commonly uses for the grid, integrals etc. are 13, 14, 18, 35, 53, and 54.

Examples
--------

An example of a CADPAC input file to run with CHARMM:

     TITLE                ! Required
     this is a test       ! Put whatever you like on one line
     BASIS STO3G          ! Generic basis set to be used
     ATOMS                ! Required
     GRADIENT             ! Run type. Use this for optimizations
     START                ! Required
     FINISH               ! Required

The above input file tells CADPAC to use an STO-3G basis for the atoms in the QM region.
CADPAC will perform a gradient evaluation each time that it is called by
CHARMM. If you require just a single point calculation without gradients,
use ENERGY instead of GRADIENT. The input file above will perform a HF
calculation. A DFT calculation is invoked as follows:

TITLE                       ! Required
this is a test              ! Put whatever you like on one line
BASIS STO3G                 ! Generic basis set to be used
ATOMS                       ! Required
GRADIENT                    ! Run type. Use this for optimizations
KOHNSHAM LDA MEDIUM GRDWT
START                       ! Required
FINISH                      ! Required

DFT jobs are invoked by the KOHNSHAM card, which takes the type of
functional and the grid to be used as arguments. In this case an LDA
functional is used. Alternatives include BLYP and B3LYP. For details see
the CADPAC distribution.

A sample shell script to run CHARMM with CADPAC is:

#!/bin/tcsh -f
# parameters:
# 1 data file name
#
echo starting
date
echo $1
set HOME= {where CADPAC data files are}
# data set and output in home directory
#
set data=$HOME/$1.inp
set output=$HOME/$1.out2
# make a temporary directory to hold the workfiles
cd /tmp
mkdir $1
cd $1
# basis set library file assigned to fort.1
# pseudopotential library on fort.2
# the CADPAC input file is copied to UNIT 3
cp $HOME/$1.str fort.3
cp $HOME/$1.par .
cp $HOME/libfil.dat fort.1
#cp $HOME/modpot.dat fort.2
#
# run the program
charmm.exe < $data
rm -r ../$1

An example file can be found in test/c25test/cwat.inp. This input file also
uses cwat.str and the sample run script runcwat.

File: Cadpac, Node: Status, Up: Top, Next: Top, Previous: Usage

CADPAC/CHARMM interface status (February 1997)

- CADPAC, GAMESS and QUANTUM keywords cannot coexist in pref.dat.

- CADPAC recognizes atoms by their masses as specified in the RTF file.

- The program runs on ALPHA, SGI, C90, IBMRS and HPUX platforms.

- There are references to a parallel version in the code. This has not
  been fully tested yet and so won't be included until a future release.
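Referring back to the card order given in the Usage node (TITLE, title line, BASIS, ATOMS, run type, optional Hamiltonian card, START, FINISH), the following Python sketch shows one way such an input file could be assembled from a script. It is not part of CADPAC or CHARMM: the function name build_cadpac_deck and its arguments are hypothetical, and it emits only the cards that appear in the examples of this document.

```python
def build_cadpac_deck(title, basis="STO3G", runtyp="GRADIENT", hamiltonian=None):
    """Return the text of a CADPAC input file using the cards shown above.

    Hypothetical helper for illustration; it knows nothing about CADPAC
    beyond the card order used in the examples of this document.
    """
    if runtyp not in ("ENERGY", "GRADIENT"):
        raise ValueError("run type must be ENERGY or GRADIENT for QM-MM jobs")
    if "\n" in title:
        raise ValueError("the title must fit on one line")

    cards = ["TITLE", title, f"BASIS {basis}", "ATOMS", runtyp]
    if hamiltonian:
        # Hamiltonian keyword goes after the run type card,
        # e.g. "MP2" or "KOHNSHAM LDA MEDIUM GRDWT"; HF is used if omitted.
        cards.append(hamiltonian)
    cards += ["START", "FINISH"]
    return "\n".join(cards) + "\n"


if __name__ == "__main__":
    # Reproduces the DFT example; the result would be copied to fort.3.
    print(build_cadpac_deck("this is a test",
                            hamiltonian="KOHNSHAM LDA MEDIUM GRDWT"), end="")
```

The sketch omits the trailing "!" comments of the examples and does not cover per-atom LIBRARY basis sets.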
CHARMM Element doc/cff.doc 1.1 ^_ File: CFF, Node: Top, Up: (chmdoc/commands.doc), Next: Usage Consistent Force Field (CFF) * Menu: * Usage:: How to use CFF with CHARMM standalone * Status:: Current status of CFF implementation in CHARMM * Theory:: Basis for, parameterization and performance of CFF * Funcform:: Functional form of the CFF energy expression * Refs:: References to papers describing CFF ^_ File: CFF, Node: Usage, Up: Top, Next: Status, Previous: Top In order to use CFF in CHARMM, the user has to issue the following commands: 1. use cff 2. read cff parameter file 3. (a) read rtf name , or (b) read psf name 4. read sequence ! if input is via the rtf route (step 3 (a)) 5. generate 6. read coord, or ic build ! if input is via the read rtf/sequence route. When using CFF95 or later Step 3a requires a CFF-capable rtf file. This means a file in which BOND records have been replaced by analogous DOUBLE records for cases in which the chemical structure has a double bond. Note that CFF-capable rtf files are *back compatible*. That is, such rtf files can equally well be used for calculations that utilize the CHARMM force field. Thus, it is *not* necessary to maintain two versions of the rtf files. NOTE: 1. no binary parameter files are supported for CFF. 2. CFF is an all hydrogen force field -- i.e., extended atoms are not supported Examples of CFF usage in CHARMM are given in the ccfftest directory. ^_ File: CFF, Node: Status, Up: Top, Next: Theory, Previous: Usage Status of CFF implementation into CHARMM (May 1999) ============================================================= This implementation of CFF in CHARMM is principally due to Rick Lapp (MSI) and William Young (MSI). Features currently supported in CHARMM/CFF (1) energy and first derivatives (2) minimization (3) dynamics (4) most ATOM based cutoff options Major features NOT currently implemented in CHARMM/CFF: (1) second derivatives (2) bonds between primary atoms and image atoms. 
(3) Cutoff options currently not supported are group-based cutoffs, distance shifting and force-based switching. (4) Fast multipoles. Other known limitations: (1) correlation analysis tools have not been implemented for CFF specific energy terms -- e.g. it is not possible to calculate the correlation function for an out-of-plane bending angle, etc ... (2) only all-atom models (no extended atoms) There are probably other problems/limitations/bugs. Your comments about limitations of the current CFF implementation in CHARMM (and bugs) will be very valuable. Please direct comments to: William Young, MSI e-mail: wyoung@msi.com phone: (619)799-5348 KNOWN BUGS: ^_ File: CFF, Node: Theory, Up: Top, Next: Refs, Previous: Status The aim of the CFF development is a force field that is: * broad, covering a relatively large number of differing functional groups, * accurate, achieved via accurate reproduction of the quantum mechanical energy surfaces, * consistent between differing phases and molecular environments, * applicable to a wide range of molecular properties, * consistent between differing types of molecules, such as interaction of protein active sites with ligands, or assemblies of proteins with nucleic acids or with solvent. Quantum mechanical forcefields The intramolecular parameters constituting the current generation of forcefields are based on the energies and energy derivatives computed by ab initio quantum mechanical procedures for a series of model compounds. CFF uses quantum computations in the Hartree-Fock approximation with the 6-31G* basis set to expand the wavefunctions [1][2]. The quantum mechanical energies and the energy first derivatives (gradients) and second derivatives (Hessians) were computed for the equilibrium molecular structures, at conformational energy barriers, and for a set of distorted structures. 
The distorted molecular structures were generated by randomly deforming all
the internal coordinates, as well as by systematically rotating about
individual bonds. These quantum observables were fit to the energy
expression to obtain the Class II parameters [3][4].

Many of the atomic partial charges were also determined quantum
mechanically. The intermolecular parameters of the forcefield may also, in
principle, be computed quantum mechanically [5]. The remaining CFF
forcefield intermolecular or nonbond parameters were computed by fitting to
experimental crystal lattice constants and sublimation energies of
crystals [6][7][8].

Internal energy terms

The energy of the molecule or assembly is expressed in terms of internal
coordinates such as bond lengths, bond angles, and dihedral angles. For
Class II forcefields this set of descriptors is greatly expanded by
including cross terms, that is, the interactions between bond lengths and
angles, between pairs of angles, etc. CFF contains, in all, twelve types of
energy terms: bond stretching, valence angle bending, valence dihedral
angles, out-of-plane deformation, and eight cross terms. The cross terms
extend the accuracy and range of application of the forcefield by including
the effect of neighboring atomic positions on each of the bond lengths,
valence angles, and dihedral angles.

^_
File: CFF, Node: Funcform, Up: Top, Next: Refs, Previous: Theory

Energy functional forms

The energy expression may be decomposed into diagonal terms that depend on
a single molecular internal coordinate such as a bond length, coupling
terms between internal coordinates, and nonbond terms that depend on
internuclear distances. This energy is fit to the quantum mechanical
energy.

1. Bond stretching.

   Ebond = K2 * (b - b0)^2 + K3 * (b - b0)^3 + K4 * (b - b0)^4          (1)

   where K2, K3 and K4 are the quadratic, cubic and quartic forcefield
   parameters or force constants, b is the bond length, and b0 is the
   reference value of the bond length.

2. Angle bending.
Eangle = K2 * Delta^2 + K3 * Delta^3 + K4 * Delta^4 (2) where Delta = Theta - Theta0 is the difference between the actual and reference bond angles. 3. Out-of-plane bending. Eoop = K * (Chi - Chi0)^2 (3) where chi is an out-of-plane coordinate as defined by Wilson et al.[9] 4. Torsion energy, in order to reflect differing hybridizations about the bonded atoms, must contain one-, two-, and threefold periodic terms: Etorsion = SUM(n=1,3) { V(n) * [ 1 - cos(n*Phi - Phi0(n)) ] } (4) where phi is a dihedral angle. 5. Stretch-Stretch interaction between two bonds in a valence angle. Ebond-bond = K(b,b') * (b - b0) * (b' - b0') (5) 6. Stretch-Bend interaction between an angle and its bonds. Ebond-angle = K * (b - b0) * (Theta - Theta0) (6) 7. Bend-Bend-Twist interaction between a dihedral angle and its two valence angles. Eangle-angle-torsion = K * (Theta - Theta0) * (Theta' - Theta0') * (Phi - Phi1(0)) (7) 8. Stretch-Twist interaction between a dihedral angle and its end bonds. Eend_bond-torsion = (b - b0) * SUM { V(n) * cos[n*phi] } (8) 9. Stretch-Twist interaction between a dihedral angle and its middle bond. Emiddle_bond-torsion = (b - b0) * { F(1) * cos(phi) + F(2) * cos(2 * phi) + F(3) * cos(3 * phi) } (9) 10. Bend-Twist interaction between a dihedral angle and its valence angles. Eangle-torsion = (Theta - Theta0) * { F(1) * cos(phi) + F(2) * cos(2 * phi) + F(3) * cos(3 * phi) } (10) 11. Bend-Bend interaction between two valence angles with a common vertex atom. Eangle-angle = K * (Theta - Theta0) * (Theta' - Theta0') (11) 12. Stretch-Stretch interaction between the two end bonds in a dihedral angle. Ebond-bond_1_3 = K(b,b') * (b - b0) * (b' - b0') (12) Finally, the nonbond energy between atoms in different molecules or between atoms separated by three or more bonded atoms is given by the sum of the Coulombic electrostatic interaction and a van der Waals energy of the 9-6 form: 13. Coulombic electrostatic interaction. 
    Ecoul = 332.0716 * qi * qj / (D * Rij)                              (13)

    where qi and qj are the atomic partial charges on atoms i and j, Rij
    is the distance between them and D is the dielectric constant.

14. Van der Waals interaction.

    Evdw = eps(ij) * [ 2 * (r*(ij)/r(ij))^9 - 3 * (r*(ij)/r(ij))^6 ]    (14)

    where

    r*(ij)  = [ (r*(i)^6 + r*(j)^6) / 2 ]^(1/6)                         (15)

    eps(ij) = 2 * sqrt(eps(i) * eps(j)) * r*(i)^3 * r*(j)^3
              / [ r*(i)^6 + r*(j)^6 ]                                   (16)

    Here r(ij) is the distance between atoms i and j, eps(ij) is the
    negative of the minimum van der Waals energy, and r*(ij) is the
    distance at which that minimum occurs. eps(ij) and r*(ij) are computed
    from the individual atomic parameters eps(i), eps(j), r*(i) and r*(j)
    by the Waldman-Hagler combination rules [10].

The Hartree-Fock method, and to a lesser extent other quantum mechanical
methods, results in systematic deviations from experiment. For example,
bond lengths tend to be too short and bond-stretching vibrational
frequencies too high [11]. However, by comparison with experimental
gas-phase molecular structures and vibrational frequencies, these
deviations may be compensated for. In general, the energy expression may be
scaled using five constant factors, one for each of the classes of energy
terms: bonds, angles, torsion angles, out-of-planes and all coupling
terms [12]. The scaled energy is then:

    Ediagonal = Sb * SUM{Ebond} + Stheta * SUM{Eangle}
              + Sphi * SUM{Etorsion} + Schi * SUM{Eoop}                 (17)

    Ecross    = Sc * SUM{eight cross terms}                             (18)

The reference values b0 and Theta0 are also adjusted to fit experimental
data. All these values may differ among different types of bonds, bond
angles, and torsion angles. For the special case of hydrocarbons, the
corrections are especially well determined by gas-phase measurements. For
hydrocarbons, the best values of the scale factors are:

    Sb(C-C)                     0.88
    Sb(C-H)                     0.83
    Stheta (all angles)         0.81
    Sphi   (all torsions)       0.84
    Schi   (all out-of-planes)  1.00
    Sc     (all cross terms)    0.87

The reference bond lengths for hydrocarbons were also adjusted.
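As a numerical illustration of the nonbond terms in Eqs. 13-16 above, the following Python sketch evaluates the Waldman-Hagler combination rules and the 9-6 potential. It reads Eq. 14 as eps(ij) * [2*(r*(ij)/r)^9 - 3*(r*(ij)/r)^6], so that the energy at r = r*(ij) equals -eps(ij), as the text states. The function names and the sample parameter values are hypothetical and are not taken from any CFF parameter file; the energy units (kcal/mol, with charges in units of e and distances in Angstrom) are the usual reading of the 332.0716 prefactor.

```python
import math


def waldman_hagler(eps_i, rstar_i, eps_j, rstar_j):
    """Combine atomic 9-6 parameters into pair parameters (Eqs. 15 and 16)."""
    r6_i, r6_j = rstar_i ** 6, rstar_j ** 6
    rstar_ij = ((r6_i + r6_j) / 2.0) ** (1.0 / 6.0)
    eps_ij = (2.0 * math.sqrt(eps_i * eps_j)
              * rstar_i ** 3 * rstar_j ** 3 / (r6_i + r6_j))
    return eps_ij, rstar_ij


def evdw_96(r, eps_ij, rstar_ij):
    """9-6 van der Waals energy (Eq. 14) at interatomic distance r."""
    x = rstar_ij / r
    return eps_ij * (2.0 * x ** 9 - 3.0 * x ** 6)


def ecoul(qi, qj, r, dielectric=1.0):
    """Coulomb energy (Eq. 13) for charges qi, qj at distance r."""
    return 332.0716 * qi * qj / (dielectric * r)


if __name__ == "__main__":
    # Illustrative (made-up) atomic parameters: (eps, r*) for two atom types.
    eps_ij, rstar_ij = waldman_hagler(0.05, 4.0, 0.10, 3.5)
    for r in (0.9 * rstar_ij, rstar_ij, 1.5 * rstar_ij):
        print(f"r = {r:6.3f}  Evdw = {evdw_96(r, eps_ij, rstar_ij):9.5f}")
```

For two identical atoms the rules reduce to eps(ij) = eps(i) and r*(ij) = r*(i), which is a convenient check on any implementation.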
Although the use of the quantum calculation greatly amplifies the available
data so that only a few such corrections are necessary for the complete
Class II forcefield, for the majority of functional groups (molecular
types) no accurate gas-phase data are available. However, the Sb, Stheta,
Sphi, and Sc constants are transferable among different types of bonds,
bond angles, and torsion angles. Therefore, the same scale factors are used
in Eq. 17 and Eq. 18 in the final empirically scaled forcefield. In
general, the reference values b0 and theta0 are determined from high-level
quantum mechanical calculations on the model compounds.

Validation of the CFF forcefield

Table 1 shows the accuracy of the CFF forcefield for several common classes
of molecules, compared with experimental gas-phase results.

Table 1. Summary of rms deviations between experimental and CFF-calculated
structural parameters, vibrational frequencies, and energy differences.

                        bond     valence   torsion   freq.    energy
                        length   angle     angle              diff.
                        (Ang)    (deg)     (deg)     (cm-1)   (kcal mol-1)

  hydrocarbons          0.02     0.9       1.2       40       0.93
  alcohols              0.02     1.7       1.7       37       0.71
  aldehydes & ketones   0.01     1.1       2.3       32       0.62
  amines                0.00     0.9       --        18       0.62
  carboxylic acids      0.02     1.6       1.0       34       0.78
  esters                0.02     1.7       0.5       42       1.88
  ethers                0.01     0.9       1.1       41       0.40
  heterocycles          0.01     1.0       0.0       35       ---
  sulfides              0.01     1.4       2.5       45       ---
  disulfides            0.01     0.9       2.0       43       ---
  thiols                0.01     1.6       1.0       --       0.21

  average               0.01     1.2       1.3       37       0.77

The frequencies are harmonic vibrational frequencies and the energy
differences include conformational energy differences and energy barriers
to internal rotation between stable conformers.

^_
File: CFF, Node: Refs, Up: Top, Previous: Funcform, Next: Top

References

[1] Hariharan, P. C.; Pople, J. A. Theor. Chim. Acta 28, 213-222 (1973).
[2] Francl, M. M.; Pietro, W. J.; Hehre, W. J.; Binkley, J. S.;
    Gordon, M. S.; DeFrees, D. J.; Pople, J. A. J. Chem. Phys. 77,
    3654-3665 (1982).
[3] Dinur, U.; Hagler, A. T. In Reviews in Computational Chemistry, Vol. 2, K. B.
    Lipkowitz; D. B. Boyd, Eds., VCH Publishers: New York, 99-164 (1991).
[4] Maple, J. R.; Hwang, M.-J.; Stockfisch, T. P.; Dinur, U.; Waldman, M.;
    Ewig, C. S.; Hagler, A. T. J. Comp. Chem. 15, 162-182 (1994).
[5] Dinur, U.; Hagler, A. T. J. Amer. Chem. Soc. 111, 5149-5151 (1989).
[6] Hagler, A. T.; Huler, E.; Lifson, S. J. Amer. Chem. Soc. 96, 5319-5327
    (1974).
[7] Hagler, A. T.; Lifson, S.; Dauber, P. J. Amer. Chem. Soc. 101,
    5122-5130 (1979a).
[8] Hagler, A. T.; Dauber, P.; Lifson, S. J. Amer. Chem. Soc. 101,
    5131-5141 (1979b).
[9] Wilson, E. B., Jr; Decius, J. C.; Cross, P. C., Molecular Vibrations;
    Dover: New York, 1955, Chapter 4.
[10] Waldman, M.; Hagler, A. T. J. Comp. Chem. 14, 1077-1084 (1993).
[11] Michalska, D.; Schaad, L. J.; Carsky, P.; Hess, Jr., B. A.;
     Ewig, C. S. J. Comp. Chem. 9, 495 (1988).
[12] Hwang, M.-J.; Stockfisch, T. P.; Hagler, A. T. J. Amer. Chem. Soc.
     116, 2515-2525 (1994).

CHARMM Element doc/cfti.doc $Revision: 1.1 $

File: CFTI, Node: Top, Up: (chmdoc/perturb.doc), Next: Constraints

        CFTI: conformational energy/free energy calculations

* Menu:

* Constraints::  Note on constrained optimization implementation
* CFTINT::       Description and syntax of standard conformational
                 free energy thermodynamic integration
* CFTIM::        Description and syntax of multidimensional conformational
                 free energy thermodynamic integration

File: CFTI, Node: Constraints, Up: Top, Previous: Top, Next: CFTINT

Constraints:

Energy minimization with holonomic constraints has been implemented.
There are no special commands for this option.

Charlie Brooks' TSM module allows for MD simulations with constrained
values of selected conformational coordinates - distances, atoms,
dihedrals. This has been expanded to also allow energy minimization using
several algorithms. The method is an alternative to using harmonic
restraints in generating structures of flexible molecules with desired
properties, or generating adiabatic profiles.
To use this option, simply enter the 'TSM' module and give a set of 'FIX'
commands to define the set of fixed internal coordinates (see perturb.doc
for details). Next specify an energy minimization (see minimiz.doc).

Algorithms that work: SD, CONJ, POWE
(ABNR works also, for reasons unclear to me, KK)

File: CFTI, Node: CFTINT, Up: Top, Previous: Constraints, Next: CFTIM

CFTI: standard (one-dimensional) conformational thermodynamic integration

Description of method

The method expands the capabilities of the TSM module. The TSM module
employs the Thermodynamic Perturbation (TP) approach to conformational free
energy simulations. The basis of the calculation is an MD simulation with a
constrained value of a conformational coordinate. With minimal
modifications, the alternative Thermodynamic Integration (TI) method is
added on. In the modified code the user has the option of using TP only (as
previously) or activating TI, in which case the same simulation and data
files are used to give both TP and TI results.

[SYNTAX CFTI]

All commands are parsed by the TSM command parser, so they should be within
a 'TSM ... END' block.

CFTI
     This command activates the thermodynamic integration calculation.
     The context of use should be the same as for a thermodynamic
     perturbation run, i.e. some coordinates should be fixed by 'FIX ...',
     saving data to a disk file should be specified by 'SAVI ...', and one
     perturbation should be defined by 'MOVE ...'.
     The derivative dA/dx is calculated for the coordinate defined in the
     'MOVE ...' statement. This coordinate also has to be fixed with
     'FIX ...'; other coordinates may also be fixed if desired.
     See test case cftigas.inp for details.

     Notes:
     1) The formatted data file generated by 'SAVI ...' may be read by
     both the TI postprocessing command (CFTJ) and the TP postprocessing
     command (POST). The SAVI 'NWIN' keyword has meaning only for TP; it
     can be set to an arbitrary value if TI only is to be used.
     2) For consistency with TP, the 'BY ' part of the 'MOVE' command was retained.
     The value has meaning for TP only; it can be set to an arbitrary
     number if TI only is to be used.

CFTJ [TEMP ] [UICP ] [CONT ]

     Command to calculate the conformational free energy derivative
     dA/dx = <dU/dx> as well as the energy-entropy components:
     d<U>/dx, -TdS/dx.
     Data is read in from the formatted file generated by the 'SAVI ...'
     command.
     TEMP - specifies temperature, needed for energy-entropy components
     UICP - specifies unit with data
     CONT - defines length of data block for error analysis,
            e.g. if the data file has 1000 entries, 'CONT 100' will divide
            the data into 10 blocks and calculate the standard deviation of
            the mean of the block averages

CFTA [FIRSt ] [NUNIt ] [BEGIn ] [STOP ] [SKIP ] [CONT ] [TEMP ]

     Command activates analysis of a CFTI-generated trajectory.
     Trajectory coordinate file(s) should be in consecutive units.
     FIRST, NUNI, BEGIN, STOP, SKIP - define trajectory reading
     CONT, TEMP - as in CFTJ

Examples of usage: see test case cftigas.inp

File: CFTI, Node: CFTIM, Up: Top, Previous: CFTINT, Next: Top

CFTM: multidimensional conformational thermodynamic integration

Description of method

This is a new approach. MD simulations are performed with several
conformational coordinates simultaneously constrained to fixed values. The
partial derivatives of the conformational free energy with respect to all
the coordinates in the fixed set are calculated from this one simulation.
The free energy gradient may be used in different ways to explore
conformational free energy surfaces of flexible molecules.

The method expands the capabilities of the TSM module. Only TI calculations
are possible; there is no corresponding TP analysis.

[SYNTAX CFTM]

All commands are parsed by the TSM command parser, so they should be within
a 'TSM ... END' block.

CFTM
     This command activates the multidimensional TI method.
     The context of use should be the same as for a thermodynamic
     perturbation run, i.e. several coordinates should be fixed by
     'FIX ...', saving data to a disk file should be specified by
     'SAVI ...',
     and a perturbation should be defined by a 'MOVE ...' statement for
     each of the fixed coordinates.
     See test case cftmgas.inp for details.

     Notes:
     1) The formatted data file generated by 'SAVI ...' may be read by
     both the TI postprocessing command (CFTJ) and the TP postprocessing
     command (POST). The SAVI 'NWIN' keyword has no meaning in CFTM.
     The CFTM data file format is different from that for CFTI.
     2) For consistency with TP, the 'BY ' part of the 'MOVE' command was
     retained. The value has no meaning in CFTM. The 'INTE' keyword has to
     be specified within the 'MOVE' command.

CFTC [TEMP ] [UICP ] [CONT ]

     Command to calculate the conformational free energy derivatives
     dA/dx_i = <dU/dx_i> as well as the energy-entropy components:
     d<U>/dx_i, -TdS/dx_i.
     Data is read in from the formatted file generated by the 'SAVI ...'
     command.
     TEMP - specifies temperature, needed for energy-entropy components
     UICP - specifies unit with data
     CONT - defines length of data block for error analysis,
            e.g. if the data file has 1000 entries, 'CONT 100' will divide
            the data into 10 blocks and calculate the standard deviation of
            the mean of the block averages
     Output includes all individual partial derivatives, and optionally
     their analysis into groups. The derivative with respect to a path
     direction is also calculated.

CFTB [FIRSt ] [NUNIt ] [BEGIn ] [STOP ] [SKIP ] [CONT ] [TEMP ]

     Command activates analysis of a CFTM-generated trajectory.
     Trajectory coordinate file(s) should be in consecutive units.
     FIRST, NUNI, BEGIN, STOP, SKIP - define trajectory reading
     CONT, TEMP - as in CFTJ
     Output is the free energy gradient with respect to the set of fixed
     coordinates, the derivative along a specified direction (see DIRE)
     and optionally a group contribution analysis.

CFTS [FIRSt ] [NUNIt ] [BEGIn ] [STOP ] [SKIP ] [CONT ] [TEMP ] [DUNI ]

     Analogous to CFTB; additionally writes out the potential energy and
     dU/dx_i to a disk file specified by DUNI.

NCOR NUMB

     NUMB specifies the number of internal coordinates involved (=NICP).
     Used in calculating the path derivative.

DIRE LAMB

     The LAMB value specifies the step number (progress along the reaction
     path). The following line(s) contain NICP real numbers defining a
     path vector. The vector will be normalized inside the program. The
     unit vector will be used to calculate the derivatives dA/dl, d<U>/dl
     and -TdS/dl along the path from the gradients.
     Note: the vector components are read in free format.

CFTG NGRUp

     Define groups for group contribution analysis of the free energy.
     NGRUP is the number of groups.
     The following line(s) contain the integer group numbers of the
     coordinates (LGRUP(J),J=1,NICP) in free format.
     After that follow line(s) with group symbols (i.e. tags that will be
     used to denote the groups) in (20A4) format (GSYM(J),J=1,NGRUP).

     Example of usage:
     The system is a decapeptide; we calculate derivatives with respect to
     all phi and psi backbone dihedrals (NICP=18). In the 18 'MOVE ...'
     commands we specify the 9 phi first and the 9 psi at the end. The
     following will calculate and print out an aggregate of all phi and all
     psi contributions labelled by the tags 'PHI' and 'PSI':

     cftg ngrup 2
     1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2
     PHI PSI
     cfts

Note on sign of derivatives:

In both CFTI and CFTM it is possible to obtain a derivative value with the
incorrect sign by cleverly manipulating the atom selections in the
'MOVE ...' command. A simple way of checking the sign is to run a 1-D test
case using both TI and TP postprocessing (see test case cftigas.inp).
A general rule is to think about how the coordinate is defined and how
motions of fragments influence it. E.g. for a distance between atoms A and
B, the coordinate is the length of the vector from A to B. Perturbations
(TP) involve actual displacements of A and B along the =vector= from A to
B; derivative calculations (TI) do not involve actual motions of atoms, but
rather predictions of how atomic positions will vary with infinitesimal
coordinate changes.
Moving B along this coordinate by delta > 0 will increase the coordinate,
while moving A by delta will decrease the coordinate. I.e. to get the
correct sign of the derivative you have to use the following scheme:

either
     FIX DIST
     MOVE DIST BY 1.0 INTE -
          sele end
or
     FIX DIST
     MOVE DIST BY 1.0 INTE -
          sele end sele end

See test cases cftigas.inp and cftmgas.inp for examples.

CHARMM Element doc/changelog.doc 1.1

File: ChangeLog, Node: Top, Up: (chmdoc/charmm.doc), Previous: (chmdoc/developer.doc), Next: (chmdoc/parallel.doc)

                   CHARMM Developer's Change Log

Entries in each node are recorded by CHARMM developers to indicate new and
modified features of CHARMM during the development cycle, i.e., the alpha
version period.

------------------------------------------------------
CHARMM22.0.b     Release          April 22, 1991
CHARMM22.0.b1    Release          September 30, 1991
CHARMM22         Release          January 1, 1992
   c22g1         Release          February 15, 1992
   c22g2         Release          July 7, 1992
   c22g3         Release          November 3, 1992
   c22g4         Release          March 1, 1993
   c22g5         Release          August 1, 1993
CHARMM23.0
   c23a1         Developmental    August 15, 1992
   c23a2         Developmental    October 25, 1992
   c23f          Developmental    March 1, 1993
   c23f1         Developmental    March 15, 1993
   c23f2         Developmental    August 15, 1993
   c23f3         Release          February 1, 1994
   c23f4         Release          August 15, 1994
   c23f5         Release          March 15, 1995
CHARMM24.0
   c24a1         Developmental    February 15, 1994
   c24x1         Evaluation       February 15, 1994
   c24a2         Developmental    August 15, 1994
   c24a3         Developmental    March 15, 1995
   c24b1         Release          August 15, 1995
   c24b2         Release          February 15, 1996
   c24g1         Release          August 15, 1996
   c24g2         Release          February 15, 1997
CHARMM 25.0
   c25a0         Developmental    August 15, 1995
   c25a1         Developmental    February 15, 1996
   c25a2         Developmental    August 15, 1996
   c25a3         Developmental    February 15, 1997
   c25b1         Release          August 15, 1997
------------------------------------------------------

* Menu:

* C21-C22::  Modifications of Developmental CHARMM21 to CHARMM22
* C20-C22::  Major enhancements and developments in CHARMM22
* C22-C23::  Major enhancements and
developments in CHARMM23
* C23-C24::  Major enhancements and developments in CHARMM24
* C24-C25::  Major enhancements and developments in CHARMM25

File: ChangeLog, Node: C21-C22, Up: Top, Previous: Top, Next: C20-C22

      Summary of Modifications of Developmental CHARMM21 to CHARMM22

------------------------------------------------------------------------------
Linear pressure ramping added to CPT code (see pressure.doc)
------------------------------------------------------------------------------
Frequency based crystal update is now supported.
Relevant new keyword is IXTFrq (see image.doc)
------------------------------------------------------------------------------
Constant Pressure and Temperature (CPT) dynamics (See PRESSURE.DOC)
TRICLINIC unit cell is now supported.
------------------------------------------------------------------------------
Miscellaneous commands:
UPPEr and LOWEr keywords added (see miscom.doc)
------------------------------------------------------------------------------
Minimization:
new keyword (FMEM) for ABNR minimizer (see minimiz.doc)
------------------------------------------------------------------------------
Internal coordinates (see INTCOR.DOC)
   New commands:
      IC SAVE
      IC RESTore
      IC RANDom [iseed]
   Internal coordinates converted to double precision.
------------------------------------------------------------------------------
Coordinate Manipulation (See MISCOM.DOC and CORMAN.DOC)
   New inline command variables added:
      ?THETa , ?XMOVe , ?YMOVe , ?ZMOVe , ?RMS
   New CORMAN commands added:
      COOR HELIx
      COOR PUCKer
      COOR COVAriance
      COOR SEARch ... RBUFF ...
------------------------------------------------------------------------------
Energy, Angles
   Urey-Bradley 1-3 terms have been added as an option.
   Format of parameter file affected. (See IO.DOC)
   Energy analysis code added (ANALysis ON command).
   (See ANALYS.DOC)
------------------------------------------------------------------------------
NOE distance restraints (See CONS.DOC)
   Overhauled to become a general distance restraint term.
   Command syntax overhauled as well.
------------------------------------------------------------------------------
PSF common structure modified
   Unused PSF arrays removed.
   All size limits increased.
   Binary file format changed to INTEGER*4 and REAL*8.
   PSF numbers added to ?variable list (See MISCOM.DOC).
------------------------------------------------------------------------------
Output redirecting implemented. (See MISCOM.DOC)
   OUTU replaces all writes to unit 6.
------------------------------------------------------------------------------
ATLIM modified to allow a limit of several days.
   PASMID has been changed to an integer which points to the current day.
   See MISCOM.DOC
------------------------------------------------------------------------------
Free energy perturbation commands added. (See PERT.DOC)
   Several new commands and features have been modified to allow
   free energy perturbation simulations to be performed.
------------------------------------------------------------------------------
Partition function and classical free energy code added to the
vibrational analysis code. (See VIBRAN.DOC)
   Atom selection added for EDIT commands.
   Atom selection added for WRITE SECOnd-derivatives CARD command.
------------------------------------------------------------------------------
New time series commands and options (See CORREL.DOC)
      ENTER PUCKer
      ENTER HELIx
      ENTER RMS
      ENTER ENERgy
      ENTER RMS [MASS] atom-selection
      ENTER ATOM CROSsproduct
      ENTER FLUC CROSsproduct
      ENTER VECT CROSsproduct
      ENTER HBOND
      ENTER MODE
      ENTER RMS [MASS] [ORIEnt] ...
      TRAJ ... atom-selection
      MANTIME SQUARE   (vectors now allowed)
      MANTIME ABS      (vectors now allowed)
      MANTIME ACOS
   Off-by-one error removed in time series data (time series now do not
   start at time zero, but at time DELTA*SKIP).
------------------------------------------------------------------------------
Langevin dynamics modified.
   An improved algorithm has been incorporated which gives a more accurate
   integration at low gamma values as well as the proper Brownian dynamics
   limiting values in the large gamma limit (and is more efficient). The
   Gaussian random generator has been replaced to give a much more accurate
   distribution and uses only one random number call per atom by using an
   error function lookup table.
------------------------------------------------------------------------------
Miscellaneous commands added. (See MISCOM.DOC)
      DIVIde, EXPOnent, RANDom, and SHOW
   New miscellaneous variables added.
      ?RAND,
------------------------------------------------------------------------------
Precision and index limits improved.
   The entire program (except for the graphics section) has been converted
   to REAL*8 and INTEGER*4 from REAL*4 and INTEGER*2.
------------------------------------------------------------------------------
Constant Pressure and Temperature (CPT) dynamics added. (See PRESSURE.DOC)
   Pressure analysis code added.
   NTRFRQ usage modified so that it works for IMAGES and CRYSTAL.
------------------------------------------------------------------------------
Heuristic nonbond update feature added. (See NBONDS.DOC)
------------------------------------------------------------------------------
New (consistent) energy print format with search line indicators.
------------------------------------------------------------------------------
Graphics subsection added for workstations.
------------------------------------------------------------------------------
New GRADient option added for most minimization methods for searching
for saddle points.
------------------------------------------------------------------------------
FAST option is now the default.
   It is no longer necessary to have the command "FAST 1" in order to use
   the efficient energy routines.
------------------------------------------------------------------------------
Constrained reference now only set for selected atoms for the
CONS HARMonic command (the old method limited versatility). (See CONS.DOC)
------------------------------------------------------------------------------
Parallelization for shared memory multi-processor machines has been
implemented.
   Functionality for the fast energy routines has been increased. The
   vector/parallel routines will now do no electrostatics and novdw as
   well as simple cut-offs.
------------------------------------------------------------------------------
SPECIfy command.
   Controls various options such as I/O buffer flushing, the maximum
   number of processors to be used and whether to use the fast nonbond
   list generator.
------------------------------------------------------------------------------
SYSTem "unix bourne shell commands"
   This command permits the user to issue Unix shell commands from the
   program. The command string must be enclosed in double quotes to
   prevent the CHARMm parser from converting the string to uppercase.
------------------------------------------------------------------------------
SHAKE FAST
   This command specifies the use of the new vector/parallel SHAKE.
------------------------------------------------------------------------------
Deleted Features:
   The old VAX analysis facility has been removed.
   Sigma van der Waals switching and shifting options have been removed.
   BARRI command removed.
------------------------------------------------------------------------------

File: ChangeLog, Node: C20-C22, Up: Top, Previous: C21-C22, Next: C22-C23

          Major Enhancements and Developments in CHARMM22

As CHARMM20 is not clearly defined, it is not straightforward to sort out
major differences between the current version of CHARMM (CHARMM22.0) and a
previous version (CHARMM20 or CHARMm21).
The VAX version of CHARMM on HUCHE1 turns out to be a "developmental" version towards CHARMM21 and contains the crystal facility, BLOCK, etc. The following was prepared by comparing the developmental VAX version CHARMM21 source code with that of CHARMM22.0. Obsolete Modules Deleted from CHARMM20 -------------------------------------- [1] GRAMPS It is supported only in the VAX version CHARMM20. TH:[MK.PROT.SOURCE.VAX]GRAMPS.FLX contains an interactive routine that writes several files for the command language interpreter for producing computer graphics on the Evans & Sutherland Multi-Picture-System called GRAMPS. This obsolete feature is no longer supported in CHARMM22. [2] PARAmeter Optimization PARMOP is not incorporated in the VAX version CHARMM20 either, except at the point of command parsing. It seems that the feature has never been included in the central version. New Features in CHARMM22 ------------------------ [1] BLOCK The developmental CHARMM21 VAX version supports some BLOCK commands. The BLOCK commands are used to partition the molecular system into blocks and allow the use of coefficients that scale the interaction energies between the blocks. Specific commands to carry out free energy simulations with a component analysis scheme have been implemented. [2] CRYStal The CRYStal commands are used to build a crystal with any space group symmetry, to optimize its lattice parameters and molecular coordinates and to carry out a vibrational analysis. The CRYSTAL program is incorporated into the IMAGE module. The VAX developmental version has a separate CRYSTL module. [3] COOR COVAri The new COORdinate subcommand COVAriance is added. It computes covariances of the spatial atom displacements of a dynamics trajectory for selected pairs of atoms. [4] CORR HELIx / CORR PUCKer The new CORRelation commands HELIx and PUCKer are introduced.
The HELIx command computes time series of the helical axis orientation and PUCKer computes that of the sugar pucker phase and amplitude. [5] DRAW, GRAP The new module GRAPHICS provides CHARMM with the capability of displaying molecular structures when run on a graphics workstation. (Currently works only on Apollo machines.) [6] HBTRim The HBTRim command deletes hydrogen bonds that have an energy of interaction that is higher than the specified cutoff. This command is used to reduce a list of hydrogen bonds to the important hydrogen bonds. [7] MOLVIB MOLVIB is a general purpose vibrational analysis program, suitable for small to medium sized molecules (less than 50 atoms). It performs canonic force field calculations (KANO), crystal normal mode analysis for k=0 (CRYS) and other vibrational analyses in internal coordinates or in Cartesian coordinates. Details are documented in molvib.doc. [8] PERT The PERTurbe command allows the scaling between PSFs for use in energy analysis, comparisons, slow growth free energy simulations, and windowing free energy simulations. This is a rather flexible implementation of free energy perturbation that allows connectivity to change. Also, three energy restraint terms (harmonic, dihedral and NOE) are subject to change, which allows a flexible way to compute free energy differences between different conformations. [9] QUANTUM A combined quantum mechanical and molecular mechanical force field method is implemented by employing the semi-empirical SCF method of the MOPAC program. This module has been neither tested nor documented. The code does not conform to CHARMM coding standards. The future of the code is not certain at the time of the current release. [10] RMSD The new RMSDyn routine is a modified CORMAN routine by William D. Laidig, which computes the RMS difference between two trajectory files and makes a matrix of results.
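The matrix of results described for RMSDyn can be pictured with a small sketch. The Python below is an illustration only, not CHARMM code: the function names are hypothetical, frames are plain lists of (x, y, z) tuples, and no superposition or mass weighting is applied, which may differ from what RMSDyn actually does.

```python
import math

def rms_difference(frame_a, frame_b):
    """RMS coordinate difference between two frames of equal length.

    Each frame is a list of (x, y, z) tuples; no fitting is performed
    (an assumption made to keep the sketch short).
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must contain the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(frame_a, frame_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(frame_a))

def rms_matrix(traj_a, traj_b):
    """Matrix M[i][j] = RMS difference between frame i of A and frame j of B."""
    return [[rms_difference(fa, fb) for fb in traj_b] for fa in traj_a]

# Two tiny hypothetical trajectories with two atoms per frame.
traj_a = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
traj_b = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
          [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]]
matrix = rms_matrix(traj_a, traj_b)
```

Here the single frame of the first trajectory is compared with both frames of the second, giving a 1 x 2 matrix.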
[11] RXNCOR The RXNCor command is used to define a reaction coordinate for any molecule based on its structure and to impose an umbrella potential along that reaction coordinate (i.e., to run activated dynamics along this coordinate) in order to trace out the free energy profile during the structural change along the coordinate. [12] SOLANA The solvent analysis facility computes solvent averaged properties, e.g., the solvent velocity autocorrelation function, mean-square displacement function, solvent-solvent radial distribution functions, solvent-reference site radial distribution function, and the solvent - reference site deformable boundary force. [13] TRAJ The new TRAJectory command is used to merge or to break up a dynamics coordinate or velocity trajectory into different numbers of units. [14] TSM The Thermodynamics Simulation Method module performs free energy simulations. [15] Urey-Bradley Energy Term Urey-Bradley 1-3 terms have been added. The developmental CHARMM21 also includes U-B terms. [16] Update Two new non-bonded neighbour list updating schemes are introduced; one automates the decision of when to update and the other changes the list generation algorithm. When INBFRQ is set to -1 (which is the default), heuristic testing is performed every time ENERGY is called and a list update is done if necessary. A new routine NBNDGC (nbndgc.src), a modification of NBONDG, is introduced. NBNDGC is based on a cubical grid searching algorithm and generates the nonbonded list in linear time, as opposed to quadratic. On the Convex C220, which is a vector machine, it is faster than NBONDG for any system larger than a few hundred atoms. [17] Integrator The leap-frog integrator has been implemented. While the "old" Verlet integrator is still available via the DYNA VERLet command (and is the default), the new integrator can be accessed by DYNA LEAP. The velocity Verlet integrator is also added in CHARMM.
This new velocity Verlet integrator can be called by DYNA VVER. [18] Constant Pressure & Temperature Dynamics (DYNCPT) The constant pressure/temperature dynamics algorithm is implemented following the paper by Berendsen et al. (J. Chem. Phys. (1984) 81(8) p.3684). Modification of CHARMM20 to CHARMM22 ------------------------------------ [1] ANALysis The VAX version analysis facility is replaced by an energy contribution array (ECONT). All evaluated energy terms are partitioned into atomic contributions and collected in the array, which is accessible through the SCALAR command. [2] XRAY The XRAY command of CHARMM20 is replaced by the READ XRAY command in CHARMM22. In CHARMM22, all I/O functions are parsed in mainio.src. The subroutine XRAY is changed to RDXRAY, which generates a card file compatible with Richard Feldmann's XRAY display program. [3] NOE The NOE constraint code has been overhauled. It now handles general distance restraint terms. [4] MISCOM The miscellaneous command parser (miscom.src in CHARMM22) is modified. (1) The SKIPE command is parsed in MISCOM. (2) New command parameter (@x) handling commands are added: DIVIde, EXPOnentiate, GET, MULTiply and SHOW. (3) The RANDOM command is added to set random number specifications. (4) The STOP command is parsed in MISCOM. (5) The QUICk (or Q) command is added to carry out a quick coordinate analysis. [5] HANDLE The subroutine HANDLE is improved to accept command line arguments given with the CHARMM command issued to an operating system. It works on most UNIX, UNICOS and VAX/VMS versions. [6] Command Parameters In CHARMM20, there were ten command parameters @n, where n is a single digit, 0 through 9. This has been expanded to support any single alpha-numeric character so that one can use up to 36 command parameters (0-9, a-z). [7] Dynamic Memory Allocation Most UNIX versions now support VEHEAP. VEHEAP was originally implemented by employing VAX/VMS system calls.
It expands the HEAP common block when more HEAP space is needed. In UNIX versions, we use the UNIX system library routine malloc(), if available (the availability depends on the machine), to perform the same function. [8] File Format / Compatibility All binary files except dynamics trajectories are written in double precision format and are not compatible with old versions. For PSF, topology, parameter, etc. one should use CARD format to transfer previous version files to CHARMM22. Trajectory files are written in single precision and are compatible with all CHARMM versions and QUANTA. Old version dynamics restart files are not compatible with CHARMM22. [9] Random Number Generator All random number routines are implemented in double precision (64-bit words). The Box-Muller algorithm is used for generating a Gaussian random deviate. A machine specific random number routine (RANV of CONVEX VECLIB) is used in a CMU version.  File: ChangeLog, Node: C22-C23, Up: Top, Previous: C20-C22, Next: C23-C24 Major Enhancements and Developments in CHARMM23 As an on-going project, CHARMM development has been carried out with the CHARMM version 23 series. CHARMM development entails two objectives. First, we maintain an integrated macromolecular science package running on a wide range of computing devices. Second, we incorporate and exploit molecular simulation methodologies at the frontier of current research. In order to meet the first objective, we maintain all source and support files under CVS (Concurrent Versions System) control. The ROOT repository is tammy.harvard.edu:/prog/chmgr/CVS. CHARMM23 is stored in /prog/chmgr/CVS/c23a. A particular version is retrieved with the version name as the revision tag (e.g., c23f3). Since we branched out from the CHARMM22 release version c22g2, we have made two alpha versions and four FORTRAN versions.
     c23a1     Developmental     August 15, 1992
     c23a2     Developmental     October 25, 1992
     c23f      Developmental     March 1, 1993
     c23f1     Developmental     March 15, 1993
     c23f2     Developmental     August 15, 1993
     c23f3     Release           February 1, 1994

c23f3 is the current release version. As the "f" in c23f stands for FORTRAN version, we converted the FLECS source into FORTRAN. The conversion task had been completed as of c23f2. CHARMM is now written entirely in FORTRAN except for several machine-dependent routines written in C. The universal languages (C and FORTRAN) make it easier to port to new machines in a broad range of architectural designs and to incorporate new methodologies into a research version of CHARMM. During the c23 development cycle, we have added and tested several new features as described below. We have also ported c23 to new machines and supported c23f versions on the following platforms.

Platforms Supported
-------------------

     PREFX key     Platforms
     -----------   -------------------------------------
     ALLIANT       Alliant
     ALPHA         DEC alpha workstation
     APOLLO        HP-Apollo, both AEGIS and UNIX
     ARDENT        Stardent
     CONVEX        Convex Computer
     CRAY          Cray Research Inc.
     DEC           DEC ULTRIX
     HPUX          Hewlett-Packard series 700
     IBM           IBM-3090 running AIX
     IBMMVS        IBM's MVS platform
     IBMRS         IBM RS/6000
     IBMVM         IBM's VM platform
     IRIS          Silicon Graphics
     MACINTOSH     Apple Macintosh computers (system 7)
     SUN           Sun Microsystems
     VAX           Digital Equipment Corp. VAX VMS

New Features in CHARMM23
------------------------

[1] Cray Fast Code - Douglas J. Tobias Vector/parallel code for energy calculation, shake, and nonbonded list generation on the Cray was implemented. Dynamic heap and stack allocation on the Cray was added.

[2] PARALLEL - Bernard R. Brooks General code for support of CHARMM on MIMD machines is completed. This includes control of the I/O levels for all file I/O. For parallel machines or workstation clusters, only node zero performs I/O, and it broadcasts the data to the other nodes.
All computationally intensive code exercised in MD is now fully parallel, which includes: DYNAMC, ENERGY (and most subsections), SHAKE, PRSSRE, DYNLNG, IMAGES,... Almost all computationally intensive code in the first order minimizers is fully parallel. Other uses of the energy routines are parallel (such as the energy time series in CORREL). [3] Dynamics Integrator 3.1 Leap-Frog Integrator - Bernard R. Brooks Berendsen's method was modified so that it would work for very small systems and for very weak coupling constants. Now it is possible to use SHAKE with CPT and get correct pressures and temperatures. Another change is to calculate the change in potential energy due to the constant pressure algorithm. The energy lost due to the changes in box size is now added to the kinetic energy during the constant temperature procedure. This allows the constant pressure code to nearly conserve energy and allows the constant temperature code to be used with weak coupling times. This correction was made when we found that water box simulations with Berendsen's method were running about 10 degrees too cold when temperature and pressure coupling times of 1ps were both used. Now the correct target temperature is achieved, even in the limit of very weak couplings. 3.2 EULER Dynamics Integrator - Bernard R. Brooks The Langevin/Implicit Euler dynamics integrator has been incorporated. The effect is to remove the energy in the high frequency degrees of freedom, which eliminates the noise in free energy studies where bonds are being modified. To support the Implicit Euler integration, a Truncated Newton Minimizer has been added. This minimizer may be used directly using the MINI TN command. The minimizer is not yet fully implemented (it works, but is not as efficient as it will be), but it is already very competitive relative to existing minimization methods. MINI TN does not work with SHAKE. This code has been developed by Tamar Schlick at NYU.
It has been integrated within CHARMM with some modifications. 3.3 EHFC: High Frequency Correction - Bernard R. Brooks The leap-frog dynamics integrator has been modified to have an improved high frequency correction (HFC) term. With the old term, energy was conserved within a harmonic degree of freedom, but total energy would drift as energy exchanged between high and low frequency degrees of freedom. The new code avoids this problem. The total energy and kinetic energy that are printed in the first line of the dynamics energy printout have reverted to the standard Verlet energies, and these match the output of the old integrator. The HFC terms (total energy, and kinetic energy) are now printed on the second line. The fluctuation of the HFC total energy is usually an order of magnitude smaller than that of the total energy. The HFC total energy is a good indicator of problems with NVE dynamics because small changes in total energy are not lost in the noise of high frequency oscillations. 3.4 Velocity Verlet Integrator - Masa Watanabe The velocity Verlet method has been implemented. The two integrators already present in CHARMM (Verlet and leap-frog) each have their own character: the Verlet method handles velocities rather awkwardly and may introduce some numerical imprecision. The leap-frog integrator, on the other hand, minimizes loss of precision on a computer, but it does not handle the velocities in a satisfactory manner. The velocity Verlet integrator can store positions, velocities, and accelerations all at the same time and minimizes round-off error. 3.5 Nose-Hoover Constant Temperature Method - Masa Watanabe The constant temperature method has been implemented based on S. Nose, JCP 81, 511 (1984) and W.G. Hoover, Phy. Rev. A 31, 1695 (1985). This is another type of constant temperature method, but its equilibration time in the vicinity of the desired temperature is shorter than that of the other routines available in CHARMM.
Multi-temperature controls have also been developed in order to equilibrate the system faster and to hold the system at the desired temperature. This method works with the Verlet and velocity Verlet integrators. 3.6 Multiple Time-Scaled Method - Masa Watanabe Tuckerman et al. recently proposed a reversible RESPA algorithm (Tuckerman, Berne, Martyna, JCP 97, 1990 (1992)). Previous MTS methods have the disadvantage of losing accuracy due to the approximation of holding the slow variables fixed while integrating the equations for the fast variables. In reversible RESPA, the equations of motion are instead derived from Liouville operators and the Trotter theorem. The method gives more accurate dynamics than previous methods. In this implementation, one can specify up to three different time steps in a dynamics simulation run. [4] RISM (Reference Interaction Site Model) - Georgios Archontis The RISM module allows the user to calculate the site-site radial distribution functions g(r) and pair correlation functions c(r) for a multi-component molecular liquid. These functions can then be used to determine quantities such as the potential of mean force or the cavity interaction term between two solute molecules in a solvent, and the excess chemical potential of solvation of a solute in a solvent. The change in the solvent g(r) upon solvation can be determined and this allows for the decomposition of the excess chemical potential into the energy and entropy of solvation. [5] MMFP (Miscellaneous Mean Field Potential) - Benoit Roux The MMFP commands are primarily used for setting up special restraining potentials on some or all of the atoms. The key word MMFP is used to enter the MMFP environment. In the MMFP environment, all miscellaneous commands (label, goto, if, etc...), and string substitutions (with @1, @2, etc...) are supported. The key word END returns to the main parser. The restraining potentials are used in all energy calculations, unless SKIP is used.
The subcommand RESET clears the potential. This module is still under development and only the subcommand GEO is released. The subcommand GEO (standing for geometrical) is used to set up various restraining potentials (spherical, planar or cylindrical restraints) on some or all atoms. The selection specification should be at the end of the command. The default atom selection includes all atoms. Future subcommands will include continuum electrostatic reaction field and solvent mean field potentials. Expected date of release is Spring 1994. [6] NMR Analysis - Benoit Roux The NMR commands may be used to obtain a set of time series for a number of NMR properties from a trajectory. Among the possible properties are relaxation rates due to dipole-dipole fluctuations (T1, T2, NOE, ROE), chemical shift anisotropy and Deuterium order parameters for oriented samples. [7] REPLICA - Leo Caves Tool to support LES and MCSS calculations. Performs replication of arbitrary regions of the PSF. The data structure interfaces to the non-bond list generation routines to perform appropriate exclusions. In association with BLOCK, it can provide appropriate energy/force normalizations for various classes of methods employing replicas. Introduced REPLICA and REPDEB preprocessor directives. Code for the Cray multi-tasking list generation routine used inference and has not been tested. The Convex parallel code works fine. Added miscellaneous parameters to report the number of atom/group pairs from non-bonded routines: ?NNBA, ?NNBG, ?NNBI for atom/group/images respectively. For replica-based exclusions from the list there are ?NRXA and ?NRXG for atom and group exclusions. [8] Clustr code integrated into CORREL - Charles L. Brooks III The CLUSTER command clusters time series data obtained within the CORREL facility. The data are grouped into sets with similar time series values, using Euclidean distance as the dissimilarity measure between different time frames of a set of time series.
It is useful, for example, for grouping together similar conformations or energy levels. [9] GRAPHICS - Richard M. Venable Graphics code converted to FORTRAN and overhauled. Versions that work with Xwindows and GL are in progress. A new preflx keyword, NODISPLAY, builds a version which produces HPGL, PLUTO FDAT, and LIGHT.atm files without requiring any screen display capabilities. The SG (IRIS) code incorporation is relatively untested. Postscript file output similar to HPGL (but much nicer looking, hopefully) is also implemented. Major Modifications ------------------- [1] Command Line Handling 1.1 Extension of Command Line Parameter Handling - Leo Caves A command line parameter token can now be a string rather than just one of the single characters 0-9 and A(a)-Z(z). For substitution, a token is indicated by the use of the @ character as before. The token is end-delimited by any non-alphanumeric character. In the case that the token is not found in the parameter table, a check is made to see if the first character of the token is itself a token in the parameter table. If this single character token is in the table, the corresponding value is substituted -- this is the necessary scheme to allow backwards compatibility with the old parameter substitution, which allowed parameters embedded in strings. For unambiguous token detection, "protect" the token with brackets {} --- this allows for the use of non-alphanumerics such as -, _ in tokens. 1.2 New Parsing Options - Bernard R. Brooks The IF command will be expanded to allow commands such as: IF ?ENER .GT. ?VDW THEN GOTO label or IF ?NSEL .LT. 8 THEN GOTO label 1.3 MSCNUM - Bernard R. Brooks New code for flexible miscellaneous command substitutions has been fully incorporated. Additional types were needed to make this code more flexible. Three types are supported, REAL(*8), INTEGER, CHARACTER.
There are three subroutines which can be called to specify a command substitution variable: integer (SETMSI), character (SETMSC), and real (SETMSR). Now it is possible for ?NATOM to return an integer, ?RSM to return a real number, and ?SEGID to return the segment identifier of the first selected atom. [2] QUANTUM A combined quantum mechanical and molecular mechanical force field method was implemented in CHARMM version 22 by employing the semi-empirical SCF method of the MOPAC program. The QUANTUM code has been modified extensively to meet CHARMM standards. There were several problems with the quantum code that have been fixed. The van der Waals group nonbond list was missing due to an improper interpretation of the group-group exclusion list in CHARMM (it is a two state list, not a three state list as in the atom-atom exclusion list). All vdw interactions between QM and MM groups where any QM atom had an exclusion or a 1-4 interaction with any MM atom were not computed. This caused major problems in certain situations where there was a strong electrostatic attraction with no compensating vdw interaction. New code to add and place link atoms has been written. [3] Frequency Based Crystal Update - Ryszard Czerminski The modification allows for automated, frequency based, crystal update. A new variable (IXTFRQ) is introduced which controls the frequency of the crystal update. [4] Ability to Linearly Increase/Decrease Pressure - Ryszard Czerminski The goal was to allow for a linear increase (decrease) of the pressure during a single dynamics run. New variables/keywords were introduced (PIXX - initial value of XX component of pressure tensor, PFXX - final value etc... for other components). [5] Atom Selection 5.1 Atom Parse - Bernard R. Brooks A new atom name parsing subroutine has been developed. This makes the code simpler and facilitates further advancements in atom parsing. One new feature allows an atom selection to be used to select a series of atoms.
This is very useful in CORREL for specifying clusters of atoms for analysis. When the atom selection feature is used to specify the 4 atoms of a dihedral, the first 4 selected atoms will be chosen.

5.2 New Tokens - Bernard R. Brooks A new operator, .BYGROUP., and a new token, IGROup, have been added to allow the selection of atoms based on electrostatic groupings. Several keynames have been added to allow the query of the characteristics of selected atoms;

     ?SELATOM - number of first atom selected
     ?SELIRES - number of first residue selected
     ?SELISEG - number of first segment selected
     ?SELTYPE - name of first atom selected
     ?SELRESI - resid of first residue selected
     ?SELSEGI - segid of first residue selected
     ?SELRESN - residue type of first atom selected
     ?SELCHEM - chemical type of first atom selected

These new keywords are in addition to the existing keyword;

     ?NSEL - Number of atoms selected

[6] Correlation 6.1 New MANTim Options in CORREL - Bernard R. Brooks A histogram option for time series manipulation has been developed. This is executed by the command;

     MANTime time-series-name HISTogram min-value max-value num-steps

The selected time series is replaced with a histogram which contains the probability of finding the time series within a given value range. Also, new options (RATIo and KMULt) have been added to the CORREL MANTIME command.

6.2 Dihedral Time Series in CORREL. - Bernard R. Brooks Fixed problems with the dihedral code in CORREL to account for torsional time series. The correct fluctuation is now determined. The extra processing has been removed from the SHOW command because the data may no longer be valid for this processing when MANTIME commands are present in a script. A new command option "MANTime CONTinuous-dihedral" has been added to allow a dihedral time series to be unfolded to a continuous function.

6.3 Extension of Solanal ANALysis command - Arnaud Blondel A command -CROSs- was added to allow a cross analysis on two selected subsets of atoms.
For the moment the exclusion of pairs of atoms belonging to the same SEGId is not implemented. The keyword CROSs cannot be selected with the following options: WATer, SITE, IKIRkg, ISDIst, IFDBf. IVAC, IMSD and IFMIn have not been tested with CROSs.

[7] SCALAR Command Enhancement - Bernard R. Brooks The ASP arrays (IGNOre, ASPV and VDWS) are now accessible. There is a sort option for the SHOW command. There is a new MASS keyword for the STATistics and AVERage commands. A new SCALAR READ option has been added. It allows values to be entered from a file. The use is:

     OPEN READ CARD UNIT 12 NAME file.dat
     SCALar WMAIn READ 12 SELE ... END

which will read selected entries to the weighting array.

[8] SURFACE - Bernard R. Brooks New analytic surface area code and energy terms for ASP (Atomic Solvation Parameters) energy and forces have been fully integrated (and parallelized for multi-machines). This has been achieved by the incorporation and adaptation of the code from Wesson and Eisenberg. The default for the COOR SURFace command is now the analytic surface area. The analytic answer is less expensive and more accurate. The older Lee and Richards algorithm may still be invoked by specifying a nonzero RPRObe value. The maximum number of contacts that a sphere may have has been increased from 15 to 35.

[9] QAUGMENT - Bernard R. Brooks It is desirable for a patch to be able to augment the charge of an atom. The previous code could only set a charge. The new code can add or subtract a value from the charge. This is done by using a patch charge value near 100.0. For example, a charge of 100.15 will add 0.15 to the current charge. A charge value of -101.0 will subtract 1.0 from the current charge. Charge values less than -90.0 or larger than 90.0 are no longer allowed for generate or patch without charge augment. This allows more flexible patches to be developed where the prior charge on modified atoms need not be known.
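The QAUGMENT arithmetic can be made concrete with a short sketch. This is an illustration in Python, not the CHARMM source: the function name is hypothetical, and the use of +/-90.0 as the switch between "set" and "augment" is an assumption inferred from the statement that values beyond those limits are reserved for charge augmentation.

```python
def apply_patch_charge(current_charge, patch_value):
    """Return the atom charge after a patch supplies patch_value.

    Assumed rule (inferred from the text above):
      patch_value >  90.0  -> add (patch_value - 100.0) to the current charge
      patch_value < -90.0  -> add (patch_value + 100.0) to the current charge
      otherwise            -> replace the current charge with patch_value
    """
    if patch_value > 90.0:
        return current_charge + (patch_value - 100.0)
    if patch_value < -90.0:
        return current_charge + (patch_value + 100.0)
    return patch_value

# The two examples from the text, starting from a hypothetical charge of 0.25.
added = apply_patch_charge(0.25, 100.15)      # adds 0.15
subtracted = apply_patch_charge(0.25, -101.0)  # subtracts 1.0
replaced = apply_patch_charge(0.25, -0.40)     # ordinary patch: sets the charge
```

Encoding the increment as an offset from 100.0 lets a patch modify an atom without knowing its prior charge, which is the flexibility the item describes.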
[10] COORdinate Commands 10.1 VACUUM_OP: COOR SEARCH Subcommand - Bernard R. Brooks The ability to manipulate pixel bitmaps generated from the COOR SEARCH command has been developed. The new syntax for the COOR SEARCH command is; COOR SEARch {PRINt [UNIT int]} { } {[VACUum]} {[RESEt]} [SAVE] {[NOPRint] } {[RCUT real]} { FILLed } { AND } {[RBUFf real]} { HOLES } { OR } { XOR } The new keywords are; SAVE - save the resultant bitmap for subsequent operations AND - logical AND the new bitmap with the previously saved map OR - logical OR the new bitmap with the previously saved map XOR - logical XOR the new bitmap with the previously saved map HOLES - search for holes (vacuum points surrounded by filled points) 10.2 New COOR DIST command - Bernard R. Brooks The COOR DISTance command has been overhauled and has additional features. One such feature is the ability to get g(r) plots from trajectory files using atom selections. It has several other features. The new syntax is: COOR DISTance { WEIGhting vector-spec atom-selection } { } { [UNIT int] [CUT real] [ENERGy [CLOSe]] 2X(atom-selection) - } { [Nonbonds] } { [NO14exclusions] } { [NOEXclusions] } - { NONOnbonds } { 14EXclusions } { EXCLusions } [TRIAngle] [ HISTogram HMIN real HMAX real HNUM integer - [HSAVe] [HPRInt] [HNORm real] [HDENsity real] ] [11] JOIN/RENUMBER Command - Bernard R. Brooks A "JOIN segid RENUMBER" feature is added in the JOIN command. This allows resid's to be made sequential within a single segment. [12] PREFX.SRC overhauled. - Bernard R. Brooks The PREFX program has been overhauled. The new code has the following features: - It allows "!" comments at the end of valid FORTRAN statements. - Conversion to single precision is performed ONLY if the SINGLE keyword is present. - It allows the use of identifier comments in ## statements. For example: ##IF PERT (pertprint) ... ##ELSE (pertprint) ... 
##ENDIF (pertprint) This makes the code easier to read and allows ##ENDIF statements to be uniquely identified. A fatal error is flagged if the identifiers do not match.  File: ChangeLog, Node: C23-C24, Up: Top, Previous: C22-C23, Next: C24-C25 Major Enhancements and Developments in CHARMM24 During the C24 development cycle, February 15, 1994 to February 15, 1996, we made two bugfix-updates in the c23 releases and three alpha versions and one beta version in the c24 development line. c24x1 is the MMFF implementation in CHARMM developed at Molecular Simulations Inc.

     CHARMM23.0
     c23f4     Release           August 15, 1994
     c23f5     Release           March 15, 1995

     CHARMM24.0
     c24a1     Developmental     February 15, 1994
     c24x1     Evaluation        February 15, 1994
     c24a2     Developmental     August 15, 1994
     c24a3     Developmental     March 15, 1995
     c24b1     Release           August 15, 1995

Only bugfixes are incorporated into CHARMM23 and all new developments and enhancements have been carried out with the CHARMM24 developmental versions. All modifications are thoroughly recorded in the ChangeLog.c24 file and the following is the summary of new features and major enhancements in CHARMM 24.

New Features in CHARMM24
------------------------

[1] New Ports and Parallel Versions 1.1 Enhancement to Parallel Code - Bernard R. Brooks and Milan Hodoscek There has been continued development of the parallel code for CHARMM. This includes new features run in parallel, new machine types supported, new parallelization methods, and code made to run more efficiently. Due to a conflict in routine names with library routines, the subroutines WRITEC and READC had to be renamed. Initial code to allow the use of the Terra parallel computer has been added. Added preflx keyword SGIMP for multiprocessor SG machines using the PVM message passing library. The difference between PVM and (SGIMP, PVM) is that all the processes are spawned on one host and some communication parameters are not supported on MP machines.
It can be used on a single processor SG for testing purposes. Use PVM only on a cluster of any type of workstation. 1.2 Convex Exemplar SPP-100 and generic PVM Ports - Charles L. Brooks, III and Stephen H. Fleischman A port of CHARMM version 24a2 to general PVM based parallelism using existing parallel code as well as a port to the Convex parallel machine are included. 1.3 Cray T3D Port - Charles L. Brooks, III and Barry C. Bolding A port of CHARMM version 24a2 to the Cray T3D parallel computer using existing parallel code is included. 1.4 Port of parallel CHARMM to Convex Exemplar SPP-1000 and generic MPI - Charles L. Brooks, III and Stephen H. Fleischman A port of CHARMM version 24a3 to general MPI based parallelism using existing parallel code as well as a port to the Convex parallel machine are included. 1.5 Thinking Machine's CM5 Port - Robert Nagle The previous communication scheme was based on a simple send and receive model. By using TMC's active message layer, communication bandwidth can be increased by anywhere from 50% to 5X. 1.6 OS/2 Port - Stefan Boresch CHARMM (c23f4 and c24a3) has been ported to the OS/2 operating system, version 2.x and higher. The Watcom Fortran compiler (v. 9.5, patch-level (c)) has been used. A new pre-processor keyword, OS2, has been introduced, and all OS/2 related changes hide behind the OS2 keyword. There is currently no install script. Please contact me if you want to build an OS/2 version of CHARMM (boresch@tammy.harvard.edu). [2] Fast Multipole Code for Electrostatic interactions - Robert Nagle This is an initial implementation of a fast multipole method, based on John Board's work. A new non-bond option (FMA) has been added. This replaces cut-off parameters with a no cut-off hierarchical technique. The advantages of this method are that you can control the error and that it is amenable to parallelization.
FMA is an O(N) technique, but the constant is large, so FMA will in general be slower for systems of fewer than 5000 atoms at the same accuracy. Two options, LEVEL and TERMS, govern how many hierarchical levels are used and how many terms are retained in the expansion, respectively. In the method, each box at every level is subdivided into 8 sub-boxes. You should select LEVEL so that the boxes at the lowest (i.e. finest) level contain 10-20 atoms on average: 3 or 4 will be typical choices. You then select TERMS to control the accuracy that you require: 4 will often suffice, but I would generally recommend 6 or even 8. See the references in fma.doc for a detailed description of the error bounds.

NOFMA is the nonbond option which turns off the multipole method. Compilation of FMA is controlled by the flag, FMA, in pref.dat. FAST ON is required for this initial implementation. This implementation is not yet parallelized.

[3] Energy Embedding by the Addition of a Higher Spatial Dimension - Elan Z. Eisenmesser / Carol Post

The energy embedding technique entails placing a molecule into a higher spatial dimension [Crippen, G. M. & Havel, T. F. (1990) J. Chem. Inf. Comput. Sci. Vol 30, 222-227]. The possibility of surmounting energy barriers with these added degrees of freedom may lead to lower energy minima. With the recent success of using four dimensions in the GROMOS force field [Van Schaik, R. C., Berendsen, H. J. C., Torda, A. E., & van Gunsteren, W. F. (1993) J. Mol. Biol. Vol 234, 751-762], creating a similar option in CHARMM should also prove advantageous. Specifically, another Cartesian coordinate was added to the usual X, Y, and Z coordinates and was appropriately named FDIM for Fourth DIMension. This implementation has led to alterations in some existing code along with the addition of several algorithms.
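The LEVEL guidance given for the FMA option in [2] above (boxes at the finest level should hold 10-20 atoms on average) can be turned into a quick estimate. The sketch below is only illustrative: it assumes that the whole system is the single box at level 0, so that LEVEL subdivisions give 8**LEVEL boxes at the finest level, and that atoms are spread evenly. The function names are hypothetical and are not part of CHARMM.

```python
import math


def atoms_per_finest_box(n_atoms, level):
    """Average atoms per finest-level box, assuming 8**level boxes (assumption)."""
    return n_atoms / 8 ** level


def levels_in_range(n_atoms, lo=10.0, hi=20.0):
    """All LEVEL values (1..10) whose finest boxes average between lo and hi atoms."""
    return [lvl for lvl in range(1, 11)
            if lo <= atoms_per_finest_box(n_atoms, lvl) <= hi]


def nearest_level(n_atoms, target=15.0):
    """LEVEL whose average occupancy is closest to target on a log scale.

    Useful when no LEVEL falls inside the 10-20 window, which happens often
    because each extra level changes the occupancy by a factor of 8.
    """
    return max(1, round(math.log(n_atoms / target, 8)))


if __name__ == "__main__":
    for n in (5000, 8000, 60000):
        print(n, levels_in_range(n), nearest_level(n))
```

For a few thousand to a few tens of thousands of atoms this estimate gives 3 or 4, in line with the "typical choices" quoted in the text; a strongly non-uniform system may need a different value.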
[4] DIMB (Diagonalization In a Mixed Basis) Method - David Perahia, Liliane Mouawad, Herman van Vlijmen

The DIMB (Diagonalization In a Mixed Basis) method (see L. Mouawad and D. Perahia (1993), Biopolymers, 33, 599) is an iterative method to calculate the N lowest normal modes of molecules. It is especially targeted at large molecules, since it does not require the full Hessian to be stored in memory or on disk. In short, the method does repetitive reduced-basis diagonalizations in bases that consist partially of the approximate eigenvectors, and partially of Cartesian coordinates. Eigenvectors are saved to file during the process. Before that is done, a new basis is again created, which consists of the approximate eigenvectors at that point plus the residual vectors (Lanczos vectors). This accelerates the convergence.

A very good property of this method is that the final eigenvectors are as accurate as the user wants them to be, so the results are no different from a full-blown diagonalization. Because the method is iterative, it takes longer to converge than a regular diagonalization. Sizewise it can handle almost anything on a moderately sized computer. David Perahia calculated a few dozen modes of Hemoglobin (~600 residues = ~6000 atoms = ~18000 d.o.f.) on an SGI workstation with 90 Mb memory. I have done several calculations on 900 residue systems. The actual time to reach convergence depends on the available memory, the desired accuracy, and the number of requested normal modes.

One other area where the method saves memory is in the storage of the original Hessian. Since this matrix is usually sparse for large systems, a compressed Hessian is set up, which contains all non-zero elements. In addition, I added the option to use this compressed Hessian in the reduced-basis diagonalization option of VIBRAN. Before, the same size limits applied to full diagonalizations and reduced-basis diagonalizations.
This should not be: people usually want to do reduced-basis calculations because the molecule is too big for the Hessian to be stored in memory. The option VIBRAn REDUce CMPAct will fill the compact Hessian and form the reduced-basis Hessian from this compact Hessian. Overall, this is a big saving on memory space.

[5] Arithmetic Expression Interpreter - Benoit Roux

An interpreter of arithmetic expressions has been added to the CHARMM command parser. It is called at the level of the miscellaneous command handling, simply by the word CALC (for calculator). It can be used to evaluate algebraic numerical expressions. The command supports all mathematical numerical expressions with an arbitrary number of nested parentheses, e.g.,

    exp[1.0-cos(2*(log(2*pi))**2)/0.5]

The parsing is actually very crude since the expression is translated back and forth between a character string and a real variable to handle the logic (there is no real subroutine recursion).

[6] TNPACK Update - Tamar Schlick, Phillipe Derreumaux and Eric Barth

The truncated-Newton minimization package TNPACK, developed by T. Schlick and A. Fogelson, has been incorporated into CHARMM and adopted for biomolecular energy minimization. TNPACK is based on the preconditioned linear conjugate-gradient technique for solving the Newton equations. The structure of the problem --- sparsity of the Hessian --- is exploited for preconditioning.

Thorough experience with the new version of TNPACK in CHARMM has been described in a paper now in press in the Journal of Computational Chemistry. Applications are reported for a series of molecular systems including Alanine Dipeptide (N-Methyl-Alanyl-Acetamide), a dimer of N-Methyl-Acetamide, Deca-Alanine, Melittin (26 residues), Avian Pancreatic Polypeptide (36 residues), Rubredoxin (52 residues), Bovine Pancreatic Trypsin Inhibitor (58 residues), a dimer of Insulin (99 residues), and Lysozyme (130 residues).
Through comparisons among the minimization algorithms available in CHARMM, we find that TNPACK performs significantly better than ABNR in terms of CPU time when curvature information is calculated by a finite-difference of gradients (the "numeric" option of TNPACK). The CPU gain is 50% or more (speedup factors of 1.5 to 2.5) for the largest molecular systems tested and even greater for smaller systems (CPU factors of 1 to 4 for small systems and 1 to 5 for medium systems). With the analytic option, TNPACK converges more rapidly than ABNR for small and medium systems (up to 400 atoms) as well as large molecules that have reasonably good starting conformations; for large systems that are poorly relaxed (i.e., the initial Brookhaven Protein Data Bank structures are poor approximations to the minimum), TNPACK performs similarly to ABNR. TNPACK uses curvature information to escape from undesired configurational regions and to ensure the identification of true local minima. It converges rapidly once a convex region is reached and achieves very low final gradient norms, such as of order 10E-8, with little additional work.

Even greater overall CPU gains are expected for large-scale minimization problems by making the architectures of CHARMM and TNPACK more compatible with respect to the second-derivative calculations. This work should be the focus of future developments. Such work involves sparse storage of the Hessian, efficient sparse Hessian/vector multiplications, and separation of the gradient and Hessian calculations.

[7] X-window graphics extensively modified. - Richard M. Venable

Several new features have been added to the X-window version of CHARMM graphics. This code has also been tested on a wider variety of hardware platforms (for example: SGI). Changes include: double-buffering, clipping, StaticColor, symbol fonts, window title, modified colormap calls, and miscellaneous bug fixes in the labeling of the X axis.
A NODISPLAY compile option has been added to the X windows version of CHARMM graphics in which only derivative files are produced. The GRAPhics NOWIndow option can be used to generate the same effect at run time.

[8] Minimum Image Periodic Boundary Code - Charles L. Brooks, III, William A. Shirley and Stephen H. Fleischman

Simple minimum periodic boundary conditions are added for cubic, truncated octahedral and rhomboidal (dodecahedral) periodicities. They augment the image facility, enhance parallel scaling on scalar parallel machines, and significantly reduce the memory requirements. This code is developed and fully tested for the simulation cells described above when the cell edge length is the same in all dimensions. The (trivial) extension to non-identical cell sides will be added. However, to obtain reasonable performance on all scalar parallel platforms where simulations using images are currently employed, it is critical that this enhancement be added now.

[9] GAMESS Code - Bernard R. Brooks and Milan Hodoscek

The CHARMM-GAMESS interface is under development. The interface part is completed and testing is in progress.

Major Enhancements in CHARMM24
------------------------------

[1] New Dihedral / Improper Dihedral Energy Routines - Arnaud Blondel

The previous energy routines used the derivatives d(cos(phi))/dr to calculate the forces and the second derivatives. This choice introduced an artificial singularity at sin(phi)=0. The new routines use the derivative d(phi)/dr and thus have no singularities. This removes the tests to avoid numerical overflow and the switch functions in the vector improper routines.

The new dihedral routines now support cases where the planar conformation is not an extremum. Thus a value other than 0 or 180 can be specified in the dihedral parameters.

The dihedral constraints can also use the dihedral functional form by using the keyword PERIod and giving a non-zero number.
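The singularity mentioned for the old dihedral routines can be seen with a few lines of arithmetic. The sketch below assumes the usual cosine form E(phi) = k*(1 + cos(n*phi - delta)) and compares the two chain-rule factors: dE/dphi, which is always bounded, and dE/d(cos phi) = (dE/dphi)/(-sin phi), which is what a d(cos(phi))/dr formulation needs. The function names and parameter values are illustrative assumptions, not CHARMM routines.

```python
import math


def dE_dphi(phi, k, n, delta):
    """Derivative of E = k*(1 + cos(n*phi - delta)) with respect to phi."""
    return -k * n * math.sin(n * phi - delta)


def dE_dcosphi(phi, k, n, delta):
    """Same derivative taken with respect to cos(phi): divides by -sin(phi)."""
    return dE_dphi(phi, k, n, delta) / (-math.sin(phi))


if __name__ == "__main__":
    k, n = 1.0, 2
    for delta_deg in (0.0, 30.0):
        delta = math.radians(delta_deg)
        print("phase (degrees):", delta_deg)
        for phi_deg in (10.0, 1.0, 0.1, 0.01):
            phi = math.radians(phi_deg)
            print("  phi = %6.2f   dE/dphi = %10.4f   dE/dcos(phi) = %14.4f"
                  % (phi_deg, dE_dphi(phi, k, n, delta),
                     dE_dcosphi(phi, k, n, delta)))
```

With a phase of 0 the planar geometry is an extremum, dE/dphi vanishes there, and the quotient stays finite (it tends to k*n*n). With a phase of 30 degrees dE/dphi does not vanish at phi = 0, so the cos(phi)-based factor grows without bound as phi approaches 0, while the phi-based factor remains well behaved. This is consistent with the statement that phases other than 0 or 180 became usable with the new routines.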
[2] Extended Pressure System, Langevin Piston Code - Bernard R. Brooks, Scott E. Feller and Yuhong Zhang The constant pressure code has been overhauled. The old method based on Berendsen's method has been replaced with a Langevin Piston Method. When no friction is applied, this method becomes the standard method based on Nose and Klein (adapted from Andersen). At the limit of infinite friction with no random force, this reverts to the Berendsen method. The unit cell information has been added to the trajectory file format. This implementation required an update to the image and crystal code which cleaned up some ancient problems. Options for including the surface tension (gamma-Area) term is also completed and tested. This has been developed for the accurate simulation of interfacial systems. [3] Anisotropic Harmonic Restraints - Bernard R. Brooks The global scale factors: "XSCAle", "YSCAle", and "ZSCAle" have been added to the "CONS HARM" command. This allows using the CONS HARM to enforce a planar or linear restraint. This feature is also useful for use in conjunction with our COORPLAS program (for generating 3-D coordinates from plastic models). [4] New RESDistance Facility - Bernard R. Brooks A new facility, RESD, has been created to allow general distance restraints based on a linear combination of distances. This is useful for searching reaction pathways. [5] New READ PARAm APPEnd Option - Bernard R. Brooks An append option has been added to the READ PARAM CARD command. This allows just a few parameters to be modified without editing an entire parameter file. A modification to the binary parameter file format was necessary. Old binary files may not be appended, but they are still supported. [6] New READ PSF APPEnd Option - Bernard R. Brooks An append option has been added to the READ PSF command. This allows PSFs to be easily merged to make a larger PSF. No modification to the binary parameter file format was necessary. 
This option works with both FILE and CARD options. [7] Best Fit Option to CORREL TRAJectory Command - Bernard R. Brooks The TRAJectory command in correl now accepts an ORIENt keyword with an optional [MASS] qualifier in conjunction with a second atom selection that will best fit selected atoms with respect to the rms deviation from the reference structure (in the comparison coordinate set). This operation is done prior to the determination of any time series value. This operation will not affect any time series value that is based only on relative distances and angles. [8] QM/MM Exclude Group Option - Bernard R. Brooks An option EXGRoup has been added which causes all atoms in the group of the link atom host to be excluded from the QM/MM electrostatic interaction terms. Code for specifying the charge of link atoms and their placement has also been added. [9] Enhancements to the Ewald Code - Bernard R. Brooks, Scott E. Feller and Steve Bogusz The EWALD electrostatic option now runs efficiently for parallel architectures. Also, the maximum K-space values can be specified independently for each direction. Several bugs were fixed. Additional ways to compute ERFC() were added, including a lookup table. [10] MMFP/SSBP Upgrade - Benoit Roux and Dmitrii Beglov The Miscellaneous Mean-Field Potentials (MMFP) has been upgraded. The spherical solvent boundary potential (SSBP) has also been incorporated into EPERT. A new "membrane-like" planar potential has been introduced using Gaussians to provide a smooth free energy function based on hydropathy profile of individual amino acids and solvent exposure. This is useful to orient membrane proteins. A new primary shell of hydration has been added to the MMFP facility to provide one layer of solvent around a flexible polypeptide. For more information, see Beglov & Roux, Biopolymers 35: 171-178 (1995). 
A solvent boundary potential for the simulation of water at constant pressure is also added to the Miscellaneous Mean Field Potential module. The boundary potential is an approximation but follows from a rigorous statistical mechanical treatment of the boundary. In light of the difficulties raised by the previous treatments, a different route was chosen to formulate and develop the solvent boundary potential for computer simulations of a finite representation of an infinite bulk system. The present theoretical formulation is based on a separation of the multidimensional solute-solvent configurational integral in terms of n "inner" solvent molecules nearest to an arbitrary solute, and the remaining "outer" bulk solvent molecules. This formulation, which differs significantly from previous treatments, provides further insight into the statistical mechanical basis of the solvent boundary potential and is helpful in constructing useful approximations for computer simulations in dense liquids. An approximation to the solvent boundary potential is constructed for simulations of bulk water at constant pressure, including the influence of van der Waals (done with RISM) and electrostatic interactions (done with a Kirkwood-like multipole expansion). The approach has been tested with success on several typical systems (water, ions, n-butane and alanine dipeptide). [11] Upgrade of the NMR module - Benoit Roux The NMR module is upgraded to have better output style. The old version used the value of PRNLEV to choose the printed quantities. Since this was a non-standard style in CHARMM, a series of logical flags have been included in the command calls to print some chosen quantities. In addition, the chemical shift anisotropy (CSA, used in solid state NMR of membrane proteins in oriented samples) has been redefined in term of a zmatrix to prevent confusion. The deuterium quadrupolar splittings (DQS) command is also upgraded. A bug in a call to NORMAL was fixed. 
[12] New Options to CORREL - Lennart Nilsson

Two new MANTime options have been added to CORREL: CROS and DOTP.

CROSsprod name    Q(T) = Q(T) x Q2(T)
                  produces the 3D crossproduct of the two 3D vectors formed by the selected and named timeseries
DOTProd name      gives x-comp of Q(T) = Q(T) . Q2(T)
                  and x-comp of Q2(T) = angle in degrees between the two vectors

[13] The COOR HBONd Command - Lennart Nilsson

An option for the analysis of H-bond patterns from trajectories has been added to corman.

COORdinates HBONd 2X(atom-selection) [CUT ] [CUTA ] [IUNIt ] [BRIDge ]
                  [FIRSt int] [NUNIts int] [NSKIp int] [BEGIn int] [STOP int]

The HBONd command analyses a trajectory for hydrogen bonding patterns. For each acceptor/donor in the first selection, the average number and average lifetime of hydrogen bonds to any atom in the second selection are calculated. A hydrogen bond is assumed to exist when two candidate atoms are closer than the value specified by CUT (default 2.4 A, a reasonable criterion; DeLoof et al. (1992) JACS 114, 4028) and, if a value for CUTAngle is given, the angle formed by D-H..A is greater than this CUTAngle (in degrees; 180 is a linear H-bond). The default is to allow all angles. The current implementation assumes that hbonding hydrogens are present in the PSF and also uses ACCEptor and DONOr information from the PSF to determine what pairs are possible.

If output is wanted to a separate file, the IUNIt option can be used.

If the BRIDge option is used, the routine calculates the average number and lifetime of bridges formed between all pairs of atoms in the two selections; a bridge is counted when a residue of the type specified with BRIDge hydrogen bonds (using the same criteria as for direct hbonding) to at least one atom in each selection. The typical use of this would be to find water bridges. Here again, results are presented for each atom in the first selection.
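The distance and angle test described for COOR HBONd can be sketched as a small geometric predicate. This is not the CHARMM implementation. It assumes that the CUT distance is measured between the hydrogen and the acceptor (which matches a 2.4 A default) and that the angle is taken at the hydrogen between donor and acceptor. The function and argument names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v):
    return math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])


def dha_angle_deg(donor, hydrogen, acceptor):
    """Angle D-H..A in degrees, measured at the hydrogen (180 = linear)."""
    v1 = _sub(donor, hydrogen)
    v2 = _sub(acceptor, hydrogen)
    cosang = (v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2]) / (_norm(v1) * _norm(v2))
    cosang = max(-1.0, min(1.0, cosang))  # guard against round-off
    return math.degrees(math.acos(cosang))


def is_hbond(donor, hydrogen, acceptor, cut=2.4, cuta=None):
    """True if H..A is shorter than cut and, when cuta is given, D-H..A exceeds it.

    cuta=None mirrors the documented default of allowing all angles.
    """
    if _norm(_sub(acceptor, hydrogen)) >= cut:
        return False
    if cuta is None:
        return True
    return dha_angle_deg(donor, hydrogen, acceptor) > cuta


if __name__ == "__main__":
    d, h = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print(is_hbond(d, h, (3.0, 0.0, 0.0), cuta=120.0))  # linear, 2.0 A apart
    print(is_hbond(d, h, (1.0, 2.0, 0.0), cuta=120.0))  # 90 degree angle
    print(is_hbond(d, h, (4.0, 0.0, 0.0)))              # 3.0 A apart
```

Counting how many frames in a row the predicate stays true for a given pair would then give the lifetime statistics, subject to the same assumptions.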
In order not to find hbonds between bonded atoms, UPDATE is called, which requires coordinates to be present when invoking this module. Since this is done just to get the non-bond exclusion lists, the cut-offs are set to very small values, and could influence subsequent energy evaluations if the non-bond cutoffs are not then respecified.

[14] NORESET Option for SHAKE - Lennart Nilsson

The NORESET option is added to allow multiple shake commands. It is useful to be able to define shake on bonds, bonh and so on for several different sets of atoms, with different shake options. The NORESET keyword to the shake command allows this by not zeroing the counter.

[15] Trajectory Reading - Lennart Nilsson

READCV is modified to read coordinates at multiples of skip FROM the actual first coordinate set in a trajectory file.

[16] Make BLOCK work with IMAGE/CRYSTAL and vice versa - Stefan Boresch

In order to make BLOCK work / coexist with the IMAGE module, two things had to be changed: (1) a memory allocation problem in the BLOCK data structure and (2) the post-processing modules needed to be overhauled to allow for nonbonded list updates while reading frames from the trajectory.

Ad (1), memory allocation: BLOCK uses two data structures, one containing the interaction matrix between blocks, and one containing the block number for each atom (IBLCKP). This array was allocated so far as INTEG4(NATOM) on the heap. However, when IMAGE atoms are present, the energy routines attempt to find out to which block an IMAGE atom belongs. This at one point or another causes a memory access violation. The solution consists of two parts. (i) The IBLCKP data structure is now allocated as INTEG4(MAXAIM) on the heap; therefore there is always enough space provided. (ii) The entries for the IMAGE atoms have to be initialized, and this has to be done at EVERY image update. However, similar things are already done for a number of other quantities like masses, vdW params, charges etc.
All this is done among a number of other things in subroutine MKIMAT in upimag.src, where I have added an appropriate statement. Ad (2), changes to post-processing routines: Real/Image atoms leave/enter the simulation box/system dynamically. Therefore, the nonbonded/image interaction lists have to be updated during post-processing. The hooks were already in the program, subroutine BLUPLST. The real changes hide in this routine, most changes in BLFREE, BLEAVG and BLCOMP are either cosmetic or ensure proper printout. Post-processing routines FREE, EAVG and COMP will actually print IMAGE terms if present. The routine BLUPLST is a sibling of routine updeci in heurist.src. The heuristic update scheme itself is removed, as I feel that one should update the lists at every frame. Also, the CRYSTAL specific section of UPDECI is not present in BLUPLST as I don't understand it. Therefore, care should be exercised when using BLOCK with CRYSTAL! Negative values of INBFRQ/IMGFRQ are trapped, in this case they are set to 1; Printout from the update / list generation routines is suppressed by temporarily raising the PRNLEV to 1. The BLOCK documentation (block.doc) has been revised and reflects these modifications. A new testcase block3.inp has been added to test/c24test. [17] Constraint correction for PERT - Stefan Boresch The current version of PERT cannot handle situations where SHAKE is applied to bonds which change in length due to an alchemical mutation as SHAKE and PERT do not "communicate". Furthermore, in such cases a constraint correction has to be computed and added to the free energy difference. Two steps are required to fix this problem: (1) The constraint list needs to be updated as a function of the coupling parameter lambda. (2) The constraint correction has to be calculated. Only thermodynamic integration (both for slow-growth and windowing) is supported; the exponential formula will give nonsense results. 
(If someone wants to fix this, please look at Pearlman/Kollman, JCP 1991, 94, 4532 and Severance et al. J. Comput. Chem. 1995, 16, 311.) The method to calculate the constraint corrections is based on extracting the respective Lagrangian multipliers from the SHAKe routine; this approach is briefly described in van Gunsteren et al. Computer Simulation of Biomolecular Systems: Theoretical and Experimental Applications; ESCOM: Leiden 1994; Vol. 2, pp 315-348. The approach fully includes inertial contributions, it is left to the user to account for those correctly in the context of the problem. The new code is mostly transparent and does not really require additional documentation. However, some information is added to pert.doc. A new testcase pert2.inp is also added to test/c24test. [18] Non-Cubic Crystal Building Problem Fix - Wonpil Im and Ryszard Czerminski The crystal build facility uses the symmetrized rotated shape matrix XTLABC obtained from lattice parameters. However, it does not apply the same rotation to the unit cell moiety, which may result in bad contacts in non-cubic crystals. The problem is fixed by calling the subroutine ROTXTL. Some tests for the rotation are added by Ryszard.  File: ChangeLog, Node: C24-C25, Up: Top, Previous: C23-C24, Next: Top Major Enhancements and Developments in CHARMM25 During the C25 development cycle, August 15, 1995 to August 15, 1997, we made three bugfix-updates in the c24 releases and three alpha versions and one beta version in the c25 development line. CHARMM24.0 c24b2 Release February 15, 1996 c24g1 Release August 15, 1996 c24g2 Release February 15, 1997 CHARMM25.0 c25a0 Developmental August 15, 1995 c25a1 Developmental February 15, 1996 c25a2 Developmental August 15, 1996 c25a3 Developmental February 15, 1997 c25b1 Release August 15, 1997 Only bugfixes are incorporated into CHARMM24 and all new developments and enhancements have been carried out with the CHARMM25 developmental versions. 
All modifications are thoroughly recorded in the ChangeLog.c25 file; the following is a summary of new features and major enhancements in CHARMM 25.

New Features in CHARMM25
------------------------

[1] Merck Molecular Force Field (MMFF) - Thomas A. Halgren, Ryszard Czerminski, Jay L. Banks, Bernard R. Brooks, and Youngdo Won

The Merck Molecular Force Field (MMFF) developed by Tom Halgren at Merck has been implemented in CHARMM. Ryszard introduced MMFF into c23f2, which made the c24x1 (February 15, 1994) version for evaluation. As CHARMM evolved through the c24 development project, Jay incorporated MMFF into c24b1 in a less intrusive manner. Bernie and other developers reviewed c24b1/MMFF and suggested some corrections. Youngdo took Jay's code and Bernie's suggestions and made the checkin code of MMFF. MMFF is documented in doc/mmff.doc.

[2] CADPAC - Paul Lyne

An interface is added to allow CHARMM to run with CADPAC6.0 when performing QM-MM calculations. CADPAC6.0 can perform HF, MP2, MP3 and DF calculations.

[3] Particle Mesh Ewald Code - Bernard R. Brooks

The Particle Mesh Ewald (PME) method has been implemented. This code is based on code sent by Tom Darden at NIEHS/NIH. It has been modified so as to conform with CHARMM coding standards. This version is much faster than the standard Ewald code, and accuracy does not appear to be a problem when reasonable options are used. This code uses the new "smooth" algorithm. See ewald.doc for more details.

The code is now running in parallel and the following features are supported:
- PERT (free energy calculation) with PME (including pressures)
- Asymmetric units with CRYSTAL (when NOPEr>0 in CRYStal BUILd command) are now supported with PME.
- Total charge (Qtot<>0) energy and pressure correction term has been added.
- Accurate pressures for the triclinic (and all other) cases (and for PERT)
- Ewald energy components have been separated and can be turned off with the SKIP command ('EWKS','EWSE','EWEX','EWQC','EWUT'). (k-space,self term,exclusion,total Q correction,utility)

[4] External Force to Selected Atoms - Lennart Nilsson

A new command has been added which calculates a new energy term corresponding to a static or periodically varying external force on an atom selection.

[5] Distance matrix and radius of gyration restraints - Charles L. Brooks, III, Felix B. Sheinerman and Erik Boczko

New restraint energy terms have been added to permit restraint of the system based on its radius of gyration and/or the value of a reaction coordinate that describes the degree of nativeness based on the number of native side chain contacts. The new keywords are RGYCONS and DMCONS.

[6] HTML Doc Files - Rick Venable and Charles L. Brooks, III

Added html documentation files and developed/modified doc2html.com, originally developed at NIH. All relevant files were added to support/htmldoc.

Major Enhancements in CHARMM25
------------------------------

[1] PARALLEL CODE reorganized and extended - Milan Hodoscek, Charles L. Brooks

The parallel code has been updated and organized into three parts: paral1.src, paral2.src and paral3.src. The new code is faster and there have been significant additions to support other platforms. We now support about 15 platforms including ALPHAMP, T3D, T3E, Terra, Global-Works-Server and others.

[2] Linux Port - Milan Hodoscek

The Linux port is done with the GNU Fortran compiler, version 0.5.18. For now, all Linux related changes are under the GNU keyword.

[3] Nonbond Energy Code Overhaul with Semi-Automatic Code Expansion - Bernard R. Brooks

The program PREFLX (PREFX) has been overhauled to allow semi-automatic code expansion by moving inner-loop if-tests to the outside of do-loops. The nonbond energy routines have been cleaned and organized to utilize the semi-automatic expansion.
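The idea of moving an inner-loop if-test outside the loop can be shown generically. The sketch below is in Python rather than the Fortran that PREFLX processes, and it is not derived from the CHARMM nonbond routines: the energy expression, the shift factor and all names are illustrative assumptions. The first function tests a loop-invariant flag on every pass; the second tests it once and then runs one of two specialized loops, which is the kind of expansion a preprocessor can generate from a single source loop.

```python
def energy_test_inside(distances, qq, use_shift, cutoff):
    """Sum of qq/r terms; the invariant flag is tested on every iteration."""
    total = 0.0
    for r in distances:
        e = qq / r
        if use_shift:  # loop-invariant test inside the loop
            e *= (1.0 - (r / cutoff) ** 2) ** 2
        total += e
    return total


def energy_test_outside(distances, qq, use_shift, cutoff):
    """Same sum; the flag is tested once and a specialized loop is chosen."""
    total = 0.0
    if use_shift:
        for r in distances:
            e = qq / r
            e *= (1.0 - (r / cutoff) ** 2) ** 2
            total += e
    else:
        for r in distances:
            total += qq / r
    return total


if __name__ == "__main__":
    rs = [1.5, 2.0, 3.7, 6.2, 9.9]
    for flag in (False, True):
        print(flag,
              energy_test_inside(rs, 0.25, flag, 12.0),
              energy_test_outside(rs, 0.25, flag, 12.0))
```

Both versions perform the same floating-point operations in the same order, so they return identical results; the gain in a compiled language comes from removing the branch from the inner loop, at the cost of duplicated loop bodies.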
Obsolete ZTBL code is removed.

[4] Ewald code - Bernard R. Brooks

Memory needs of the EWALD electrostatic option have been reduced, and multiple parallel options are now supported. Pressure code has been fixed as well.

The calling sequence to ENBOND was modified so that a flag (QEWEX) can be sent indicating whether the nonbond exclusion correction should be performed for the Ewald calculation. This corrects several problems (such as Ewald with MTS and Ewald with PERT) and simplifies some code relative to the handling of the exclusion lists. Also there were several changes to EPERT so that the Ewald method will report a correct internal virial (for pressure).

The Ewald method was enabled for the GROUP option so that group lists can be used. This reduces the amount of memory and the time needed to handle the nonbond lists (good for limited memory parallel machines).

A version of EWALD was developed for MMFF. The usual MMFF electrostatic term:

    qq/(r+d)

is split into two terms:

    qq/r - qq*d/(r*(r+d))

The first term is handled by the Ewald method in the usual manner (real-space and k-space parts) and the second term is truncated at the cutoff distance using a switching function (from CTONNB to CTOFNB). Since the second term is quite small at the cutoff distance, the use of a switching function should not introduce significant artificial forces.

[5] Restrained Distance Code Enhancement - Bernard R. Brooks

The restrained distance method has been extended to allow the use of a one sided function (positive or negative). It also allows a non-unit exponent for the individual distance terms. The code is now much more general in its ability to define distance based restraints based on multiple distances.

[6] COOR DIPOLE - Bernard R. Brooks

A COOR DIPOle command has been added. This command computes the charge and dipole (multipoles) for selected atoms.

[7] READ PSF APPEnd - Bernard R. Brooks

The READ PSF APPEnd command option has been modified so that it does not initialize the coordinates of existing atoms. Only the new appended atoms will have undefined coordinates.

[8] Replica within Images - Bernard R. Brooks

The replica code has been enhanced so that it works with images and the crystal facility.

[9] The crystal facility has been extended - Bernard R. Brooks

The following new features have been added:
- The "DODE" has been renamed "OCTA" (for truncated OCTAhedron). (pressure bug fixed for OCTA)
- A new type "RHDO" has been added (for RHombic DOdecahedron).
- The CRYSTAL BUILD command is now much faster and more accurate. The use of the double atom search has been limited.
- The documentation has been updated to give detailed information regarding crystal types.
- The WRITe/PRINt IMAGE command is no longer iterative (in accord with the existing documentation).

[10] Overhaul of Harmonic restraints - Bernard R. Brooks

The CONS HARM command has been overhauled and extended. The new syntax has three different types of harmonic restraints:

CONStraint HARMonic { [ABSOlute] absolute-specs  } force-const-spec
                    { BESTfit coordinate-spec    }
                    { RELAtive 2nd-atom-selection }
                    { CLEAr                      }

The ABSOlute is the old method. The BESTfit causes the reference set to be logically bestfit rotated/translated before computing the restraint energy. The RELAtive allows two portions of one PSF to be restrained to the same internal geometry by the bestfit least squares rotation (no reference coordinates used).

Some features and changes:
- Multiple restraints (same or different types) are allowed.
- HARMonic restraint I/O is no longer supported.
- The old command syntax still functions (no rewrite of scripts required).
- The READ/PRINt/WRITe CONS commands now have a "PSF 0" option for PERT.
- PERT supports all of these restraint types.

Restriction:
- Each atom may participate in AT MOST one harmonic restraint term.

[11] Enhancement to REPLICA/PATH - Bernard R.
Brooks

The REPLICA/PATH method has been extended to allow for bestfit translation and/or rotations between adjacent replicas before computing the restraint energies. Getting the forces right was the hard part. This allows entire molecules to be replicated (or sections with significant freedom).

RPATh [ KRMS real ] [ KANGle real ] [ COSMax real ] [MASS] [WEIGht]
      [ KMAXrms real ] [RMAXrms real ] [ ROTAtions ] [ TRANslations ]

[12] CHARMM/GAMESS enhanced - Milan Hodoscek

The version of GAMESS has been updated to the March-97 version from Ameslab. Also, QM/MM gaussian blur of MM charges has been implemented as an option.

[13] A few small changes to MMFP and NMR - Benoit Roux

A few small changes to some MMFP subroutines have been made. The main thing is a second atom selection for the SSBP command that allows the presence of atoms outside the boundary radius. This could be useful when the boundary is used only for an active site. The relaxation time due to the chemical shift anisotropy addition has been added to NMR.

[14] COOR DMAT - Charles L. Brooks, III

The dist keyword has been removed from the covariance command and a new analysis command has been added under the coor subsyntax. This command is accessed with the command COOR DMAT and provides some general tools for the calculation, manipulation and storage/extraction of distance matrix based properties. This routine has some overlap with the new distance command introduced by Bernie Brooks but also provides significant complementarity in extending the range of properties computed.

CHARMM Element doc/charmm.doc 1.1


File: CHARMM, Node: Top

Chemistry at HARvard Macromolecular Mechanics
-            ---     -              -

Version 24b1 - August 15, 1995

Copyright(c) 1984,1987,1991,1994,1995
President and Fellows of Harvard College
All rights reserved

You are now using the INFO facility to view CHARMM 24 documentation. The paper "CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations", J. Comp. Chem., Vol. 4, p187 (1983), is considered to be an integral part of this documentation. In places, this documentation and the paper will conflict. In all such cases, the documentation presented here should take precedence.

* Menu:

* Commands: (chmdoc/commands.doc). Description and syntax of CHARMM commands
* Install: (chmdoc/install.doc). Release notes How to install CHARMM on a user site
* Usage: (chmdoc/usage.doc). How to use CHARMM
* Support: (chmdoc/support.doc). Supporting data files and utilities
* Testcase: (chmdoc/testcase.doc). CHARMM testcases
* Develop: (chmdoc/developer.doc). Notes for CHARMM developers
* News: (chmdoc/changelog.doc). New features and Modifications
* Parallel: (chmdoc/parallel.doc). CHARMM on parallel platforms
* Info: (Info). A description of the INFO facility.

CHARMM Element doc/charmm_gen.doc $Revision: 1.1 $


File: charmm_gen, Node: Top, Up: (chmdoc/commands.doc), Previous: (chmdoc/install.doc), Next: Configuration

The script charmm_gen.com was designed at NIH for easy maintenance of multiple executables in an active research environment. Multiple versions can be derived from the same source code, incorporating different features and maximum atom limits. It is assumed that install.com has already been run, and any porting or compiling issues resolved, before charmm_gen.com is used. In fact, charmm_gen.com simply calls install.com after doing a little creative copying and renaming.

The script is interactive; it asks a few questions, does a lot of checking, and then proceeds to make up to nine different versions in one operation with no further human intervention required. A "test" or development version can also be prepared, and is in fact the "path of least resistance", i.e. the accepting of all the defaults to each prompt.

Since simply starting up a LARGE version of CHARMM with most of the available feature sets can easily require 100 Mbyte of memory, we recognized the need to have multiple executables available.
Our choice was to create 3 principal versions: "full", with most major modules included; "lite", a version without most of the high memory usage or rarely used modules; and "am1", which adds the QUANTUM QM/MM code and a few other features to the "lite" feature set. Each is available in 3 sizes: small, medium, and large.

We also use a "cover" script in /usr/local/bin to run CHARMM, after parsing feature set and size keywords, and stripping them from the command line. An example is included at the end of this description.

Currently, ten different sets of object libraries are maintained as well; this does require a bit of disk space, but allows rapid re-building of all versions when bugfixes are made.

File: charmm_gen, Node: Configuration, Previous: Top, Next: Cover script

To use charmm_gen.com, the following additional files are *required* in build/mach, where mach = hpux in this case:

   Makefile_hpux.gamess
   Makefile_hpux.nogamess
   Makefile_hpux.test
   Makefile_hpux.test.list
   pref.dat.large.am1
   pref.dat.large.full
   pref.dat.large.lite
   pref.dat.medium.am1
   pref.dat.medium.full
   pref.dat.medium.lite
   pref.dat.small.am1
   pref.dat.small.full
   pref.dat.small.lite
   pref.dat.test

Makefile_hpux.gamess defines additional tools, directories, and libraries needed to compile the GAMESS code for QM/MM calculations, while the .nogamess version is a typical generic version of Makefile_hpux. Makefile_hpux.test is configured for rapid re-compiling of CHARMM during development, while Makefile_hpux.test.list produces cross-referenced source listings by changing the definition of the variable FC at the top of the makefile.

The remaining files represent the ten different possible versions; at NIH, the keywords in pref.dat.test are usually the same as pref.dat.medium.full, but that doesn't have to be the case. The following listing shows the pref.dat keywords we chose for the three different feature sets at NIH:

Feature set "am1"

   HPUX      ! machine type
   UNIX
   PARALLEL  ! multiple processors/workstations
   PARAFULL  ! req'd for parallel
   SYNCHRON  ! req'd for parallel
   SOCKET    ! req'd for parallel
   MEDIUM    ! size directive = 25120 atom limit
   SCALAR    ! machine characteristics = default for scalar machines
   VECTOR    ! feature directive * = Vectorized routines
   PARVECT   ! Parallel vector code (multi processor vector machines)
   CRAYVEC   ! Fast vector code (standard vector code)
   SAVEFCM   ! Include all SAVE statements
   PUTFCM
   FCMDIR=fcm
   XDISPLAY
   ASPENER   ! feature directive * = Atomic Solvation Parameter energy term
   MOLVIB    ! feature directive = MOLVIB vibrational analysis code
   NIH       ! feature directive * = NIH default specs code
   OLDDYN    ! feature directive = Old dynamics integrator
   PERT      ! feature directive * = NIH free energy code
   QUANTUM   ! feature directive = include AM1 semi-empirical code
   REPLICA   ! feature directive = Replica code
   RISM      ! feature directive = RISM solvation code
   RXNCOR    ! feature directive * = RXNCOR code
   TRAVEL    ! feature directive * = PATH and TRAVEL code
   DIMB      ! feature directive
   FMA       ! feature directive
   ZTBL      ! feature directive
   END       ! end

Feature set "full"

   HPUX      ! machine type
   UNIX
   PARALLEL  ! multiple processors/workstations
   PARAFULL  ! req'd for parallel
   SYNCHRON  ! req'd for parallel
   SOCKET    ! req'd for parallel
   MEDIUM    ! size directive = 25120 atom limit
   SCALAR    ! machine characteristics = default for scalar machines
   VECTOR    ! feature directive * = Vectorized routines
   PARVECT   ! Parallel vector code (multi processor vector machines)
   CRAYVEC   ! Fast vector code (standard vector code)
   SAVEFCM   ! Include all SAVE statements
   PUTFCM
   FCMDIR=fcm
   XDISPLAY  ! X11 graphics display
   ASPENER   ! feature directive * = Atomic Solvation Parameter energy term
   BLOCK     ! feature directive * = Energy partition and free energy code
   MOLVIB    ! feature directive = MOLVIB vibrational analysis code
   MTS       ! feature directive = Multiple time step code
   NIH       ! feature directive * = NIH default specs code
   OLDDYN    ! feature directive = Old dynamics integrator
   PERT      ! feature directive * = NIH free energy code
   GAMESS    ! GAMESS ab initio interface for QM/MM
   REPLICA   ! feature directive = Replica code
   RISM      ! feature directive = RISM solvation code
   RXNCOR    ! feature directive * = RXNCOR code
   TNPACK    ! Truncated Newton
   TRAVEL    ! feature directive * = PATH and TRAVEL code
   TSM       ! feature directive = TSM and ICPERT code
   DIMB      ! feature directive
   FMA       ! feature directive
   FOURD     ! feature directive
   PRIMSH    ! feature directive
   PBOUND    ! simple Periodic BOUNDary (min image)
   SHAPES    ! feature directive = SHAPE descriptor code
   ZTBL      ! feature directive
   END       ! end

Feature set "lite"

   HPUX      ! machine type
   UNIX
   PARALLEL  ! multiple processors/workstations
   PARAFULL  ! req'd for parallel
   SYNCHRON  ! req'd for parallel
   SOCKET    ! req'd for parallel
   MEDIUM    ! size directive = 25120 atom limit
   SCALAR    ! machine characteristics = default for scalar machines
   VECTOR    ! feature directive * = Vectorized routines
   PARVECT   ! Parallel vector code (multi processor vector machines)
   CRAYVEC   ! Fast vector code (standard vector code)
   SAVEFCM   ! Include all SAVE statements
   PUTFCM
   FCMDIR=fcm
   XDISPLAY
   ASPENER   ! feature directive * = Atomic Solvation Parameter energy term
   MOLVIB    ! feature directive = MOLVIB vibrational analysis code
   NIH       ! feature directive * = NIH default specs code
   PERT      ! feature directive * = NIH free energy code
   REPLICA   ! feature directive = Replica code
   RXNCOR    ! feature directive * = RXNCOR code
   TRAVEL    ! feature directive * = PATH and TRAVEL code
   END       ! end

File: charmm_gen, Node: Cover script, Up: Top, Previous: Configuration

Finally, to make life easy for the end users, we use the following script to run CHARMM on a routine basis:

#! /bin/csh
# INITIALIZE VARIABLES
set n = $#argv
setenv HOST `hostname | cut -d. -f1`
set chmsiz = small
set i = 1
set cleanup = date
set chmopt = lite
# CHECK FOR OPTIONAL KEYWORDS
while ( $i <= $n )
  switch ( $argv[$i] )
    case small:
      set chmsiz = small
      breaksw
    case large:
      set chmsiz = large
      breaksw
    case medium:
      set chmsiz = medium
      breaksw
    case test:
      set chmopt = test
      breaksw
    case lite:
      set chmopt = lite
      breaksw
    case full:
      set chmopt = full
      breaksw
    # am1 is stripped from the argument string below, so it is
    # recognized here as well (this case was absent from the listing)
    case am1:
      set chmopt = am1
      breaksw
  endsw
  @ i = $i + 1
end
# STRIP KEYWORDS FROM ARGUMENT STRING
set t = `echo $* | sed -e 's/small//' -e 's/medium//' -e 's/test//' \
         -e 's/full//' -e 's/large//' -e 's/lite//' -e 's/am1//'`
# CHECK FOR DESIGNATED PARALLEL HOSTS
switch ( $HOST )
  case par0:
    set cleanup = 'qpara_clean par0 bypass'
    setenv NODE0 par0f
    setenv NODE1 par1f
    setenv NODE2 par2f
    setenv NODE3 par3f
    echo "Parallel; $NODE0 $NODE1 $NODE2 $NODE3"
    breaksw
  case par11:
    set cleanup = 'qpara_clean par11 bypass'
    setenv NODE0 par11f
    setenv NODE1 par12f
    setenv NODE2 par13f
    setenv NODE3 par14f
    echo "Parallel; $NODE0 $NODE1 $NODE2 $NODE3"
    breaksw
  default:
    echo "Single processor; $HOST"
    breaksw
endsw
# ECHO WORKING DIRECTORY AND CHARMM VERSION W. TIMESTAMP
if ( $?PWD ) then
  echo $PWD
else
  echo $cwd
  echo "Warning: env var PWD not defined; required for parallel CHARMM"
endif
# SET THE VERSION TO BE RUN
if ( $chmopt == test ) then
  set exe = $chmopt
else
  set exe = $chmsiz.$chmopt
endif
# VERIFY THE ACTUAL EXECUTABLE; RUN AT REDUCED PRIORITY
ls -o ~charmm/c24n4/exec/hpux/charmm.$exe | cut -c33-
if { /bin/nice -5 ~charmm/c24n4/exec/hpux/charmm.$exe $t } then
  echo ''
  $cleanup
else
  echo '(charmm) ABNORMAL EXIT'
  $cleanup
  exit(1)
endif

CHARMM Element doc/commands.doc 1.1

File: Commands, Node: Top, Up: (chmdoc/charmm.doc), Previous: (chmdoc/parallel.doc), Next: (chmdoc/install.doc)

                         CHARMM commands

The commands available for use in CHARMM are classified in several groups.

* Menu:

* Analysis: (chmdoc/analys.doc ). Analysis facility
* ACE: (chmdoc/ace.doc ). Analytical Continuum Electrostatics
* ADUMB: (chmdoc/adumb.doc ).
ADaptive UMBrella sampling simulation
* Block: (chmdoc/block.doc ). BLOCK free energy simulation
* Cons: (chmdoc/cons.doc ). Harmonic and other constraints or SHAKE
* Coordinates: (chmdoc/corman.doc ). Commands to manipulate coordinates
* Correl: (chmdoc/correl.doc ). Time series and correlation functions
* Crystl: (chmdoc/crystl.doc ). Crystal facility
* Dynamics: (chmdoc/dynamc.doc ). Dynamics commands
* EEF1: (chmdoc/eef1.doc ). Effective Energy Function 1
* Energy: (chmdoc/energy.doc ). Energy evaluation
* Ewald: (chmdoc/ewald.doc ). Ewald summation
* GBorn: (chmdoc/genborn.doc ). Generalized Born Solvation Energy
* Genetic: (chmdoc/galgor.doc ). The genetic algorithm commands
* Graphx: (chmdoc/graphx.doc ). The graphics subsection for workstations
* H-bond: (chmdoc/hbonds.doc ). Generation of hydrogen bonds
* H-build: (chmdoc/hbuild.doc ). Construction of hydrogen positions
* Images: (chmdoc/images.doc ). Use of periodic or crystal environment
* Internal: (chmdoc/intcor.doc ). Manipulation of internal coordinates
* I/O : (chmdoc/io.doc ). I/O of data structures and files
* LonePair : (chmdoc/lonepair.doc). Lone-Pair Facility
* LUPOPT : (chmdoc/lupopt.doc ). Low Energy Path OPTimization
* MBOND: (chmdoc/mbond.doc ). Multi-body dynamics
* MC: (chmdoc/mc.doc ). Monte Carlo simulation program
* Minimiz: (chmdoc/minimiz.doc ). Description of the minimization methods
* Miscellany: (chmdoc/miscom.doc ). Miscellaneous commands
* MMFP: (chmdoc/mmfp.doc ). Miscellaneous Mean Field Potential
* Molvib: (chmdoc/molvib.doc ). Molecular vibrational analysis facility
* NMR: (chmdoc/nmr.doc ). NMR analysis facility
* Non-bonded: (chmdoc/nbonds.doc ). Generation of the non-bonded interaction
* Parameters: (chmdoc/parmfile.doc). CHARMM energy parameters
* PBEQ: (chmdoc/pbeq.doc ). Poisson-Boltzmann Equation Solver
* Perturb: (chmdoc/pert.doc ). Free energy perturbation simulations
* Pressure: (chmdoc/pressure.doc). Pressure calculation and usage
* Quantum: (chmdoc/qmmm.doc ). Quantum and Molecular Mechanical FF
* Replica: (chmdoc/replica.doc ). REPLICA: molecular system replication
* RISM: (chmdoc/rism.doc ). Reference Interaction Site Model
* Sbound: (chmdoc/sbound.doc ). Stochastic boundary
* Scalar: (chmdoc/scalar.doc ). Scalar command for atom properties
* Select: (chmdoc/select.doc ). Use of the atom selection facility
* Structure: (chmdoc/struct.doc ). Structure manipulation (PSF generation)
* Substitute: (chmdoc/subst.doc ). Command line substitution parameters
* Test: (chmdoc/test.doc ). Commands to test various things
* Topology: (chmdoc/rtop.doc ). Residue Topology File
* Travel: (chmdoc/travel.doc ). Reaction coordinate refinement command
* TSM: (chmdoc/perturb.doc ). Thermodynamic Simulation Method
* Umbrella: (chmdoc/umbrel.doc ). Umbrella Sampling
* Vibration: (chmdoc/vibran.doc ). Vibrational analysis facility
* MMFF: (chmdoc/mmff.doc ). Merck Molecular Force Field

CHARMM Element doc/cons.doc 1.1

File: Cons, Node: Top, Up: (chmdoc/commands.doc), Next: Harmonic Atom

                         CONSTRAINTS

The following forms of constraints are available in CHARMM:

* Menu:
                           command
* Harmonic Atom::          "CONS HARM" Hold atoms in place
* Dihedral::               "CONS DIHE" Hold dihedrals near selected values
* Internal Coord::         "CONS IC"   Holds bonds, angles and dihedrals near table values
* Quartic Droplet::        "CONS DROP" Puts the entire molecule in a cage about the center of mass
* Fixed Atom::             "CONS FIX"  Fix atoms rigidly (sets the IMOVE array)
* SHAKE::                  "SHAKE"     Fix bond lengths during dynamics.
* NOE::                    "NOE"       Impose distance restraints from NOE data
* Restrained Distances::   "RESD"      Impose general distance restraints
* External Forces::        "PULL"      Impose externally applied (pulling) force
* Sbound: (chmdoc/sbound.doc).
Solvent boundary potential

File: Cons, Node: Harmonic Atom, Up: Top, Next: Dihedral, Previous: Top

                    Holding atoms in place

------------------------------------------------------------------------------
[SYNTAX CONS HARMonic]

Syntax:

   CONS HARMonic {[ABSOlute] absolute-specs force-const-spec coordinate-spec }
                 { BESTfit  bestfit-specs  force-const-spec coordinate-spec }
                 { RELAtive bestfit-specs  force-const-spec 2nd-atom-selection}
                 { CLEAr }

   force-const-spec ::= { FORCE real } atom-selection [MASS]
                        { WEIGhting  }

   absolute-specs ::= [EXPOnent int] [XSCAle real] [YSCAle real] [ZSCAle real]

   bestfit-specs ::= [ NOROtation ] [ NOTRanslation ]

   coordinate-spec::= { [MAIN] }
                      { COMP   }
                      { KEEP   }

The potential energy has a harmonic restraint term which allows one to prevent large motions of individual atoms. There are three forms for this restraint: ABSOLUTE, BESTFIT, and RELATIVE. It is possible to combine multiple restraints in one energy calculation, but no atom may participate in more than one harmonic restraint set.

------------------------------------------------------------------------------
ABSOLUTE positional restraints.

Absolute positional restraints specify a location in Cartesian space near which an atom must remain. This is the original positional restraint in CHARMM. The form for this potential is as follows for coordinates:

   EC = sum over selected atoms of k(i)* [mass(i)] * (x(i)-refx(i))**exponent

where refx is a reference set of coordinates. If MASS is specified in the command line, then k is multiplied by the mass of the atom, resulting in a natural frequency of oscillation for the restraint of sqrt(k) in AKMA units. An atom restrained with MASS FORCE 1.0 will oscillate at 8 cycles/picosecond if free of other interactions.

For most operations involving harmonic restraints, mass weighting is recommended. There are three reasons for this. First, the results obtained will be similar regardless of what atom representation is used (extended vs. explicit) for hydrogen atoms. Second, hydrogen atoms are allowed greater relative freedom if present. Third, the character of the normal modes of a molecule is unperturbed with mass weighting (essential if normal modes or low frequency motions are of interest).

Note that there is no longer a prefactor of 0.5 on the force constant specification. This is appropriate in that exponent values other than "2" are allowed. This differs from the earlier versions of CHARMM (up to version 16).

The restraint force constant can be set to any positive value (specified by the FORCE keyword followed by the desired value). The force constants may also be obtained from the weight array, in which case the FORCe keyword is not read. When using this option, negative values may be used for some atoms; however, the total weight must be positive.

The reference coordinates can be the current set at the point when restraints are specified (the default) or the comparison set (COMP keyword). When multiple CONS HARM commands are used, the KEEP option preserves the reference coordinates from the previous restraints. This is useful in cases where the force constant is to be modified, but no other changes are desired.

The variables XSCAle, YSCAle, and ZSCAle are global scale factors for ABSOLUTE harmonic restraint terms. The default scale factor is 1.0 for all terms. If multiple harmonic restraint sets are used, they may have different scale factors. The RELATIVE and BESTFIT types do not allow a scale factor at present.

Example:

   CONS HARM FORCE 1.0 MASS SELE atom * * CA END COMP

This command harmonically restrains all alpha-carbons to the current positions in the comparison coordinates with a force constant of 24 Kcals/mol/A**2 (assuming a mass of 12).
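
The arithmetic behind the ABSOLUTE restraint can be checked with a small numerical sketch. The Python function below is illustrative only: its name and argument layout are hypothetical, it handles only exponent 2 with all scale factors at 1.0, and it is not taken from the CHARMM source. It evaluates EC = sum k(i)*mass(i)*|r(i)-ref(i)|**2 for the FORCE 1.0 MASS case above.

```python
def absolute_restraint_energy(force, masses, coords, ref_coords, mass_weighted=True):
    """Sketch of EC for exponent 2: sum_i k_i * |r_i - ref_i|**2.

    k_i = force * mass_i when mass weighting (MASS) is requested,
    otherwise k_i = force. No 0.5 prefactor is applied.
    """
    total = 0.0
    for mass, (x, y, z), (rx, ry, rz) in zip(masses, coords, ref_coords):
        k = force * mass if mass_weighted else force
        total += k * ((x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2)
    return total


# One alpha-carbon (mass 12) displaced 0.5 A along x, restrained with FORCE 1.0 MASS.
energy = absolute_restraint_energy(
    1.0, [12.0], [(0.5, 0.0, 0.0)], [(0.0, 0.0, 0.0)]
)

# Without the 0.5 prefactor, the curvature d2E/dx2 equals 2 * k * mass.
curvature = 2.0 * 1.0 * 12.0

print(energy, curvature)
```

Under these assumptions the energy is 12 * 0.25 = 3 kcal/mol and the curvature is 24 kcal/mol/A**2. One way to read the value of 24 quoted in the example is as this second derivative, i.e. the force constant in the conventional 0.5*K*x**2 form.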
------------------------------------------------------------------------------
BESTFIT positional restraints

This restraint is similar to the absolute restraints except that the reference coordinates are (implicitly) rotated and translated so as to bestfit the selected atoms. This best fit is done in a manner that minimizes the restraint energy. Due to the nature of the best fit, this restraint term does not add any net force or torque to the system.

Note 1: An exponent may not be specified (it is set to 2)
Note 2: Global scale factors do not apply
Note 3: At present, there is no Hessian code for this restraint
Note 4: There is no energy partition (ANAL command) for this restraint

Example:

   CONS HARM BESTFIT MASS FORCE 1.0 COMP SELE SEGID A END
   CONS HARM BESTFIT MASS FORCE 1.0 COMP SELE SEGID B END

These commands will restrain segments A and B to the geometries they have in the current comparison coordinates. This restraint will not add a net force or torque to the system (unlike the ABSOLUTE restraint type). Segments A and B can move (rotate/translate) independently with no change in the restraint energy.

------------------------------------------------------------------------------
RELATIVE positional restraints.

The relative positional restraints are similar to the bestfit restraint except that there are no reference coordinates. In this case, one part of the system is restrained to have the same shape as another part of the system. The two restraint sets are implicitly bestfit by an optimal rotation/translation (minimizing the restraint energy). Both sets of atoms

Note 1: An exponent may not be specified (it is set to 2)
Note 2: Global scale factors do not apply
Note 3: At present, there is no Hessian code for this restraint
Note 4: There is no energy partition (ANAL command) for this restraint
Note 5: The atoms of the two sets are matched one-to-one in sequential order.
Note 6: If the two sets do not have the same number of atoms, an error will be issued and the set lists will be truncated.
Note 7: Both sets must be specified and must not use set number 1.

Example:

   CONS HARM RELATIVE WEIGHT SELE segid a1 END SELE segid a2 END NOROT NOTRAN

This command will force two replicas (A1 and A2) to have the same coordinates based on the values in the weighting array (as best fit weights).

Example:

   CONS HARM RELATIVE MASS FORCE 10.0 SELE SEGID A END SELE SEGID B END

This command will force two segments (A and B) to have the same shape, but they may have very different locations and orientations. Atoms are matched one-to-one by selected atom number.

------------------------------------------------------------------------------
GENERAL INFORMATION

It is important to understand some aspects of how the restraints are set in order to get the most flexibility out of this command. When CHARMM is loaded, each atom has associated with it a harmonic force constant initially set to zero. Each call to the CONS HARM command changes the value of this constant for only those atoms specified. When this command is invoked with an atom selection (and KEEP is not specified), only the reference coordinates (XREF,YREF,ZREF) for selected atoms are modified.

IMPORTANT NOTE: Each atom may participate in AT MOST one harmonic restraint term. This is a coding limitation designed to maximize compatibility with older CHARMM scripts (i.e. doing a series of minimizations with a decreasing series of force constants). This could be easily modified with a bit of work to increase the capability (at the expense of script compatibility).

When multiple restraint sets are used, it is important to note that all selections should be exclusive. When they are not exclusive, atoms will be assigned to the restraint of the most recent CONS HARM command which selected that atom. In other words, the restraint set number is an atomic property.
If restraint sets are broken up, then an error message will be issued. If an entire set is replaced, then the new restraint replaces the old one (without a warning message).

------------------------------------------------------------------------------
OTHER COMMANDS:

The harmonic restraints may no longer be read and written to files. The PRINT command still functions for harmonic restraints for information. To examine or modify the internal harmonic restraint data, the SCALar command (arrays: CONStraints, XREF, YREF, and ZREF) may be used (see *Note scalar::(chmdoc/scalar) ). In addition, one may look at the contributions to the energy in detail using the ANALysis command, see *note anal:(chmdoc/analys).

------------------------------------------------------------------------------

File: Cons, Node: Dihedral, Up: Top, Next: Internal Coord, Previous: Harmonic Atom

               Holding dihedrals near selected values

Using this form of the CONS command, one may put restraints on the dihedral angles formed by sets of any four atoms. The improper torsion potential is used to maintain said angles. The command for setting the dihedral restraints is as follows:

Syntax:

[SYNTAX CONS DIHEdral]

   CONS DIHEdral [BYNUM int int int int] [FORCe real] [MIN real] [PERIod int] [WIDTh real]
                 [ atom-selection ] [ COMP ]
                 [ 4X(atom-spec) ] [ MAIN ]

   CONS CLDH

Syntactic ordering: DIHE or CLDH must follow CONS, and FORCE, MIN and PERIod must follow DIHE.

where:

   atom-spec ::= { segid resid iupac }
                 { resnumber iupac   }

DIHEdral adds a torsion angle to the list of restrained angles using the specified atoms, force constant, minimum and periodicity. If an atom selection is used, then the first 4 selected atoms (in order) will define the dihedral angle. If either MAIN or COMP is specified and [MIN real] is not, then the minimum angle value will be determined by the current dihedral angle value in the corresponding coordinate set.
If the PERIodicity is zero (improper type), then the force constant has units of kcal/mol/radian/radian, else it has units of kcal/mol.

   Ecdih = FORCE * max(0, abs( phi - MIN*pi/180) - WIDTH)**2      [ PERIod = 0 ]

   Ecdih = FORCE * (1-cos( PERIod* (phi - MIN*pi/180 )) )         [ PERIod > 0 ]

CLDH clears the list of restrained dihedrals so that different angles or new restraint parameters can be specified.

Other commands: The PRINT CONS command, see *note print:(chmdoc/io.doc)print, will work for restraints.

File: Cons, Node: Internal Coord, Up: Top, Next: Quartic Droplet, Previous: Dihedral

          Holding Internal Coordinates near selected values

[SYNTAX CONS IC]

Syntax:

   CONStraint IC [BOND real [EXPOnent integer] [UPPEr]] [ANGLe real] [DIHEdral real] [IMPRoper real]

Using this form of the CONS command, one may put restraints on any internal coordinate. For this energy term, the IC table is used. At each energy call, the reference (zero-force) value of each IC is set to the value currently in the IC table.

All nonzero bond entries are restrained with the bond constant, using the optional EXPOnent (default 2) in the potential K*(S-S0)**EXPOnent. Second derivatives are currently supported only with EXPOnent=2. The angle, dihedral, and improper terms are only harmonic. The DIHEdral term only applies to IC's of normal type, and the IMPRoper term only applies to the improper IC type (those with a "*"). If UPPEr is specified, the reference bond length is taken as an upper limit and the restraint potential is applied only if S>S0; this is intended for use with distance restraints from NMR NOE data.

All nonzero angle entries are restrained with the angle constant. All dihedrals are restrained with the dihedral constant using the improper dihedral energy potential. If any IC entry contains an undefined atom (zeroes), then the associated bonds, angles, and dihedral will not be restrained.

The force constant has units of kcal/mol/radian/radian for both angle and dihedral restraints.
The bond force constant has units of kcal/mol/angstrom**EXPOnent.

This restraint term is very flexible in that the user may choose which bonds... to restrain by editing an IC table. The major drawback is that all bonds must have the same force constant. The same is true for angles and dihedrals. By listing some IC's several times, the effective force constant is increased. Also, if only angle restraints are desired, then the bond and dihedral constants can be set to zero, eliminating their contribution.

File: Cons, Node: Quartic Droplet, Up: Top, Next: Fixed Atom, Previous: Internal Coord

                  The Quartic Droplet Potential

[SYNTAX CONS DROPlet]

Syntax:

   CONStraint DROPlet [FORCe real] [EXPOnent integer] [NOMAss]

This restraint term is designed to put the entire molecule in a cage. It is based on the center of mass (or center of geometry if NOMAss is specified) so that no net force or torque is introduced by this restraint term. The potential function is:

   Edroplet = FORC * sum over atoms ( ( r-rcm )**EXPO * mass(i) )

File: Cons, Node: Fixed Atom, Up: Top, Next: SHAKE, Previous: Quartic Droplet

                How to fix atoms rigidly in place

[SYNTAX CONS FIX]

Syntax:

   CONS FIX atom-selection-spec { [PURG]                      }
                                { [BOND] [THET] [PHI] [IMPH]  }

This command will fix all selected atoms and unfix all non-selected atoms. For example, the command

   CONS FIX SELE NONE END

will remove all fixing of atoms (except for lonepairs).

This command fixes atoms in place by setting flags in an array (IMOVE) which tells the minimization and dynamics algorithms which atoms are free to move. If atoms are fixed, it is possible to save computer time by not calculating energy terms which involve only fixed atoms. The nonbond and hydrogen bond algorithms in CHARMM check IMOVE and delete pairs of atoms that are fixed in place from the nbond and hbond lists respectively.
In addition, the PURG or individual energy term options specified with the CONS FIX command allow all or some of the internal coordinate energies associated with fixed atoms to be deleted. Interactions between fixed and moving atoms are maintained.

*** NOTE *** Because some energy terms are deleted from fixed systems, the total energy calculated with fixed atoms will be different from the total energy of the same system with all atoms free. The forces on the moveable atoms will however be identical. The purpose of this feature is to remove the computational cost of energy terms that do not change for simulations where a large fraction of the atoms are fixed. It is not recommended for any other purpose.

The way CHARMM keeps track of fixed atoms is by the IMOVE array in the PSF. The IMOVE array is 0 if the atom is free to move, and has the value 1 if it is fixed. A value of -1 indicates that this atom is a lonepair.

***** WARNING ***** The purge options modify the PSF. The effects of this command cannot be undone by the subsequent releasing of atoms.

***** WARNING ***** The fixing of atoms does not work for constant pressure simulations.

File: Cons, Node: SHAKE, Up: Top, Next: NOE, Previous: Fixed Atom

          Fixing bond lengths or angles during dynamics.

SHAKE is a method of fixing bond lengths and, optionally, bond angles during dynamics, minimization (not ABNR and Newton-Raphson methods), coordinate modification (COOR SHAKe command), and vibrational analysis (explore command). The method was brought to CHARMM by Wilfred Van Gunsteren (WFVG), and is referenced in J. Comp. Phys. 23:327 (1977). When hydrogens are present in a structure, it will allow a two-fold increase in the dynamics step size if SHAKE is used on the bonds.

To use SHAKE, one specifies the SHAKE command before any SHAKE constraints usage.
The SHAKE command has the following syntax:

[SYNTAX SHAKe constraints]

   SHAKE { OFF                                              }
         { shake-opt fast-opt 2x(atom-selection) [NOREset]  }

   shake-opt:== [BONH] { [MAIN]     } [TOL real] [MXITer integer]
                [BOND] { COMP       }
                [ANGH] { PARAmeters } [SHKScale real]
                [ANGL]

   fast-opt:== { [ FAST [ WATEr water-resn ] ] }
               { NOFAst                        }

BONH specifies that all bonds involving hydrogens are to be fixed. BOND specifies all bonds. ANGH specifies that all angles involving hydrogen must be fixed. ANGL specifies that all angles must be shaken. BOND is implied if any angles are fixed; otherwise, only the 1-3 distances would be fixed. Coordinates must be read in before the SHAKE command is issued, unless the PARAmeter option is specified.

SHAKE constraints are applied only for atom pairs where one atom is in the first atom selection and one atom in the second atom selection. The default atom selection is ALL for both sets.

TOL specifies the allowed relative deviations from the reference values (default: 10**-10). MXITer is the maximum number of iterations SHAKE tries before giving up (default: 500).

When the SHAKE command is used, it will check that there are degrees of freedom available for all atoms to satisfy all their constraints. Angles cannot be fixed with SHAKE if one has explicit hydrogen arginines in the structure, as the CZ carbon has too many constraints. This is a general problem for any structure which has too many branches close together.

SHAKE is not recommended for fixing angles. The algorithm converges very slowly in the case where one has three angles centered on a tetravalent atom and the constraints are satisfiable only using out of plane motions.

The use of SHAKE modifies the output of the dynamics command. The number appearing to the right of the step number is the number of iterations SHAKE required to satisfy all the constraints. This number should generally be small.

When ST2's are present, SHAKE constraints are automatically applied for the O-H bonds and H-O-H angles.
There is a PARAmeter option for the SHAKe command. This option causes the shake bond distances to be found from the parameter table rather than from the current set of coordinates. This option is NOT compatible with the use of angle SHAKE constraints, and it will give an error if this is tried. With these commands, the bond energy may be zeroed without any minimization with the command sequence:

   SHAKE BOND PARA
   COOR SHAKE [MASS]

[SYNTAX SHAKe FAST constraints]

   SHAKe FAST [WATEr SELEct water_selection END] [OLDWatershake] [ MXITer TOL ] [PARAmeter] [COMP]

This command specifies the use of the new vector/parallel SHAKE constraint routines. Certain assumptions are made when this command is issued:

The only bonds involved are between heavy atoms and hydrogens, except for water molecules included in the WATEr selection ... end sub-command. This selection is used to indicate the water molecules that have an H-H bond. It is assumed that the selection will include all atoms in the water molecule and that said molecule contains exactly two X-H bonds and one H-H bond, where X is any heavy atom. Testing for "hydrogen-ness" is done via the CHARMm hydrog() function, which makes its choice based on atomic mass. The preferred selection is through the use of the RESNAME selection specifier, e.g.:

   ... WATEr SELEct RESNAME TIP3 END

By default, water molecules selected with the WATEr sub-command will be constrained via the use of a special water-SHAKE routine which uses the direct inversion method. This algorithm is from 25 to 30 % faster than the normal iterative, scalar SHAKE routine. For the rest of the heavy atom-hydrogen bonds, a vector/parallel version of the original SHAKE routine is used. This is about 5X the speed of the scalar SHAKE.

If the optional keyword OLDWatershake is used, the vector/parallel (not the watershake) routines are used. The rest of the keywords are the same as in the original SHAKE command.

Note that FAST has to be the second word in the command line.
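
To make the roles of TOL and MXITer in this node concrete, the following Python sketch applies the textbook iterative SHAKE correction to a single bond. It is a minimal illustration and not the CHARMM implementation: the function name is hypothetical, and the convergence test (relative deviation of the bond length from its reference value) is an assumed reading of TOL.

```python
import math


def shake_bond(r1, r2, r1_old, r2_old, m1, m2, d, tol=1e-10, mxiter=500):
    """Iteratively restore |r1 - r2| = d for one bond (textbook SHAKE sketch).

    r1, r2         : positions after an unconstrained step
    r1_old, r2_old : positions before the step (constraint satisfied)
    Returns the corrected positions and the number of corrections applied.
    """
    r1, r2 = list(r1), list(r2)
    s_old = [a - b for a, b in zip(r1_old, r2_old)]  # constraint direction
    inv_mass = 1.0 / m1 + 1.0 / m2
    n = 0
    while True:
        s = [a - b for a, b in zip(r1, r2)]
        dist = math.sqrt(sum(c * c for c in s))
        if abs(dist - d) / d <= tol:      # relative deviation within TOL
            return r1, r2, n
        if n == mxiter:                   # give up after MXITer corrections
            raise RuntimeError("SHAKE sketch did not converge")
        dot = sum(a * b for a, b in zip(s, s_old))
        g = (d * d - dist * dist) / (2.0 * inv_mass * dot)
        # Mass-weighted displacements along the old bond vector;
        # they are equal and opposite in momentum, so the center of mass is kept.
        r1 = [a + g * c / m1 for a, c in zip(r1, s_old)]
        r2 = [a - g * c / m2 for a, c in zip(r2, s_old)]
        n += 1


# A C-H-like pair (masses 12 and 1) whose bond was stretched by a free step.
new1, new2, n_iter = shake_bond(
    (0.0, 0.0, 0.0), (1.1, 0.1, 0.0),   # after the unconstrained step
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),   # before the step, bond length 1.0
    12.0, 1.0, 1.0,
)
final_length = math.sqrt(sum((a - b) ** 2 for a, b in zip(new1, new2)))
print(n_iter, final_length)
```

For a single isolated bond only a handful of corrections are needed. Coupled constraints (for example, several angles sharing one atom) must be cycled through repeatedly, which is why the iteration count reported during dynamics can grow and why the text above advises against using SHAKE on angles.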
File: Cons, Node: NOE, Up: Top, Previous: SHAKE, Next: Restrained Distances

[SYNTAX NOE constraints]

NOE     Invoke the module.

RESEt   Reset all NOE restraint lists. This command clears all existing NOE restraints and resets the scale factor to 1.0.

PNOE    Turn on the restraint between a given atom specified by ASSIgn and a point specified by CNOX, CNOY and CNOZ instead of a restraint between two atoms. This restraint is useful for docking and loop refinement.

ASSIgn [KMIN real] [RMIN real] [KMAX real] [RMAX real] [FMAX real]
       [TCON real] [REXP real]
       { 2X(atom_selection) }
       { [CNOX real] [CNOY real] [CNOZ real] 1X(atom selection) }

Assign a restraining potential between the atoms of the first selection and the atoms of the second selection. Where multiple atoms are selected,

      R = [ average( Rij**(1/REXP) ) ]**REXP

where (i) runs over the first atom selection and (j) runs over the second atom selection. The default REXP value is 1.0 (a simple average). An REXP value of 3.0 may be optimal for NOE averaging.

               / 0.5*KMIN*(RAVE-RMIN)**2      R<RMIN
              /
             /   0.0                          RMIN<R<RMAX
      E(R)= <
             \   0.5*KMAX*(RAVE-RMAX)**2      RMAX<RAVE<RLIM
              \
               \ FMAX*(RAVE-(RLIM+RMAX)/2)    RAVE>RLIM

and

      RAVE=R                   TCON=0
      RAVE=RRAVE**(-1/3)       TCON>0

      RRAVE=RRAVE*(1-DELTA/TCON)+R**(-3)*DELTA/TCON
      for initial conditions, RRAVE=RMAX**(-3)

DELTA is the integration time step. For minimization, the value is either 0.001ps or the previous simulation value.

Where: RLIM = RMAX+FMAX/KMAX (the value of RAVE where the force equals FMAX)

Defaults for each entry:
      KMIN=0.0, RMIN=0.0, KMAX=0.0, RMAX=9999.0, FMAX=9999.0
      TCON=0.0, REXP=1.0

Also, the old syntax is supported:

ASSIgn rminvalue minvariance maxvariance 2X(atom_selection)

For this format,
      KMAX=0.5*Kb*TEMP/(maxvariance**2)
      KMIN=0.5*Kb*TEMP/(minvariance**2)
      RMIN=rminvalue
      RMAX=rminvalue

READ UNIT integer
        Reads a restraint data structure from a previously written card file.

WRITe UNIT integer [ANAL]
        Writes out the restraint data in card format to a file on the specified unit. A CHARMM title should follow the command. SCALE are saved together with the lists in the NOE common block.
The ANAL option will print out the distances and energy data computed with the current main coordinates.

PRINT [ANAL [CUT real]]
        Same as the WRITe command, except that the data go to the output file in a slightly more user-friendly form. A positive CUT value will list only those restraints whose distance exceeds RMAX by more than CUT.

SCALe [real]
        Set the scale factor for the NOE energy and forces. Default value: 1.0

TEMPerature real
        Specify the temperature for the old format.

END     Return to the main command parser.

No other commands (I/O or loops) are supported inside the NOE module. Looping can be performed outside if necessary. The units are Kcal/mol/A/A for force constants and Angstroms for all distances.

EXAMPLE. Set up some NOE restraints for one strand of a DNA-hexamer in a file to be streamed to from CHARMM.

* SOME NOE RESTRAINTS FOR DNA. ASSUME PSF, COORD ETC ARE ALREADY PRESENT
*
! First clear the lists
NOE
   RESET
END
! Since there are many identical atom pairs we use a loop
set 1 1
label loop
NOE
   ! Sugar protons, same in all six sugars (don't pay any attention to
   ! the numeric values)
   ASSIgn SELE ATOM A @1 H1' END SELE ATOM A @1 H2'' END -
          KMIN 1.0 RMIN 2.7 KMAX 1.0 RMAX 3.0 FMAX 2.0
   ASSIgn SELE ATOM A @1 H3' END SELE ATOM A @1 H2'' END -
          KMIN 1.0 RMIN 2.7 KMAX 1.0 RMAX 3.0 FMAX 2.0
END
incr 1 by 1
if 1 le 6 goto loop
! Now do some more specific things
OPEN WRITE UNIT 10 CARD NAME NOE.DAT
NOE
   SCALE 3.0           ! Multiply all energies and forces by 3
   WRITE UNIT 10
* NOE RESTRAINT DATA FROM DOCUMENTATION EXAMPLE
*
   PRINT ANAL          ! See what we have so far
   PRINT ANAL CUT 2.0  ! list
END
RETURN

File: Cons, Node: Restrained Distances, Up: Top, Previous: NOE, Next: External

Apply general restrained distances allowing multiple distances to be specified. This restraint term has been added to allow for facile searching of a reaction coordinate, where the reaction coordinate is estimated to be a linear combination of several distances. By Bernard R.
Brooks - NIH - March, 1995

[SYNTAX Restrained Distances]

RESDistance [ RESEt ] [ SCALE real ] [ KVAL real RVAL real [EVAL integer] -
      [ POSItive ] [ IVAL integer ] repeat( real first-atom second-atom ) ]
      [ NEGAtive ]

      E = 1/EVAL * Kval * Dref**EVAL

Where:

      Dref = K1*R1**Ival + K2*R2**Ival + ... + Kn*Rn**Ival - Rval

Where K1,K2,...Kn are the real values in the repeat section and R1,R2,...Rn are the distances between the specified pairs of atoms.

RESEt          Reset the restraint lists. This command clears the existing restraints and resets the scale factor to 1.0.

SCALe real     Set the scale factor for the energy and forces. Default value: 1.0

POSITIVE       Include this restraint only when Dref is positive.

NEGATIVE       Include this restraint only when Dref is negative.

If anything else is on the command line then a new restraint is added to the list of distance restraints.

KVAL real      The harmonic force constant.

RVAL real      The target distance.

IVAL integer   The exponent for individual distances.

EVAL integer   The exponent (default 2). EVAL must be positive.

repeat( real first-atom second-atom )
               The real value is a scale factor for the distance between the first and second specified atoms in the pair.

EXAMPLES:

1. Create a reaction coordinate for QM/MM
2. Set up some restraints to force three atoms to make an equilateral triangle.
3. Prevent an atom from moving more than 20A from the others.

!!! 1 !!! Create a reaction coordinate for QM/MM

OPEN WRITE CARD UNIT 21 name reaction.energy
OPEN WRITE FILE UNIT 22 name reaction.path
TRAJECTORY IWRITE 22 NWRITE 1 NFILE 40 SKIP 1
* trajectory of a minimized reaction path
*
SET ATOM1 MAIN 11 OG
SET ATOM2 MAIN 11 HG
SET ATOM3 MAIN 23 OD1
SET 1 1
SET V -5.0
LABEL LOOP
SKIP NONE            ! make sure all energy terms are enabled
RESDistance RESET KVAL 2000.0 RVAL @v -
      1.0 @atom1 @atom2 -1.0 @atom2 @atom3
MINI ABNR NSTEP 200 NPRINT 10
PRINT RESDistances   ! print a check of distances
TRAJ WRITE           ! write out the new minimized frame
SKIP RESD            ! turn off the restraint energy term
ENERGY               ! recompute the energy without restraints
WRITE TITLE UNIT 21  ! write out the current restraint distance and energy
* @V ?ENER
*
INCR 1 BY 1          ! increment the step counter
INCR V BY 0.25       ! increment the restraint value
IF 1 LT 40.5 GOTO LOOP
RETURN

!!! 2 !!! Make a water nearly an equilateral triangle

set atom1 WAT 1 O
set atom2 WAT 1 H1
set atom3 WAT 1 H2
RESDistance RESEt
RESDistance KVAL 1000.0 RVAL 0.0 -
       1.0 @atom1 @atom2 -
       1.0 @atom1 @atom3 -
      -2.0 @atom2 @atom3
RESDistance KVAL 1000.0 RVAL 0.0 -
       1.0 @atom1 @atom2 -
      -2.0 @atom1 @atom3 -
       1.0 @atom2 @atom3
RESDistance KVAL 1000.0 RVAL 0.0 -
      -2.0 @atom1 @atom2 -
       1.0 @atom1 @atom3 -
       1.0 @atom2 @atom3
print resdistances
mini abnr nstep 200 nprint 10
print resdistances
stop

!!! 3 !!! Prevent an atom from moving more than 20A from the others,
!         but have no restraint energy when no distance is large.

set atom1 SOLV 1 OH2
set atom2 SOLV 2 OH2
set atom3 SOLV 3 OH2
set atom4 SOLV 4 OH2
set atom5 SOLV 5 OH2
RESDistance RESEt
RESDistance KVAL 1.5E-12 RVAL 6.4E7 IVAL 6 POSITIVE -
      1.0 @atom1 @atom2 -
      1.0 @atom1 @atom3 -
      1.0 @atom1 @atom4 -
      1.0 @atom1 @atom5 -
      1.0 @atom2 @atom3 -
      1.0 @atom2 @atom4 -
      1.0 @atom2 @atom5 -
      1.0 @atom3 @atom4 -
      1.0 @atom3 @atom5 -
      1.0 @atom4 @atom5
print resdistances
mini abnr nstep 200 nprint 10
print resdistances
stop

File: Cons, Node: External Forces, Up: Top, Previous: Restrained Distances, Next: Top

[SYNTAX External Forces]

PULL { FORCe real  } XDIR real YDIR real ZDIR real [PERIod real]
     { EFIEld real }
     { OFF         }
     { LIST        }
     [WEIGht] atom-selection

A force will be applied in the specified direction on the selected atoms, either as a constant (FORCe, specified in picoNewtons (pN)) or oscillating in time as FORCe*COS(TWOPI*TIME/PERIod) when both FORCe and PERIod are given; time is counted from the start of the dynamics run. The force due to an electrical EFIEld (V/m) (possibly also oscillating) may also be specified, in which case partial charges are taken from the psf and used to calculate the force.
If WEIGht is specified the forces are multiplied by the wmain array. Each invocation of this command adds a set of forces to the previously defined set. PULL OFF turns off all these forces. PULL LIST produces a listing. NB! Forces defined by PULL will move atoms in the specified direction, which is opposite to that listed by the forces from the COOR FORCE command. CHARMM Element doc/corman.doc 1.1  File: Corman, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax The Coordinate Manipulation Commands The commands in this section are primarily used for moving some or all of the atoms. There is a wide range of commands and options. All of the commands may be used on either the main coordinate set, or the comparison set. Some commands require both sets of coordinates. * Menu: * Syntax:: Syntax of the coordinate manipulations commands * Simple:: Descriptions of the simple commands * Function:: Descriptions of the remaining commands * Substitutions:: Description and usage of substitution values  File: Corman, Node: Syntax, Up: Top, Next: Simple Syntax of Coordinate Manipulation commands [SYNTAX COORdinate manipulation] COORdinates { INITialize } [COMP] [atom-selection] { COPY } [WEIGhting_array] { SWAP } [IMAGes] { AVERage [ FACT real ] } { SCALe [ FACT real ] } { MASS_weighting } { ADD } { SET vector-spec } { TRANslate vector-spec } { ROTAte vector-spec {PHI real} } { {MATRix} } { ORIEnt [MASS] [RMS] [NOROtation] } { RMS [MASS] } { DIFFerence } { FORCe [MASS] } { SHAKe [MASS] } { DRAW draw-spec } { DISTance distance-spec [DIFF] } { DIPOle } { MINDist distance-spec } { READ io-specification } { WRITe io-specification } { PRINt io-specification } { RGYR [MASS] [FACT ] } { OPERate image_name } { STATistics [MASS] } { VOLUme {SPACe integer} } { } { DUPLicate { 2X(atom-selection) } } { { PREVious } } COORdinates HISTogram { X } [IUNIt int] HMIN real HMAX real HNUM integer - { Y } [HSAVe] [HPRInt] [HNORm real] [HDENsity real] - { Z } [COMP] [WEIGhting_array] atom_selection { R 
} COORdinates { HBONd } [CUTHB ] [CUTHA ] [IUNIt ] - { CONTact } [BRIDge ] [VERBose] [TCUT real] - 2X(atom-selection) traj-spec COORdinates DYNAmics [COMParison] [PAX] [atom-selection] [NOPRint] - traj-spec [ORIENT [MASS] [atom-selection] ] COORdinates PAXAnalysis [COMParison] [atom-selection] [NOPRint] [SAVE] - traj-spec COORdinates SEARch { search-spec } disposition-spec { INVErt } { KEEP xvalue yvalue zvalue } { EXTEnd RBUFf real } search-spec :: [atom-selection] [COMP] [IMAGe] [operation-spec] [XMIN real] [XMAX real] [XGRId integer] [YMIN real] [YMAX real] [YGRId integer] [ZMIN real] [ZMAX real] [ZGRId integer] operation-spec ::= { } { [VACUum] } { [RESEt] } { [RCUT real] } { FILLed } { AND } { [RBUFf real] } { HOLES } { OR } { XOR } { ADD } disposition-spec::= { [NOPRint] } [NOSAve] [CREAte segid CHEM type] {PRINt [UNIT int]} [ SAVE ] COORdinates SURFace [atom-selection] [WEIGhting] { CONTact-area } [ACCUracy real] { ACCEssible-area } [RPRObe real] COORdinates CONVert-from/to-unit-cell [ from | to ] - [atom-selection] [COMP] [IMAGe] - a b c alpha beta gamma [ from | to ] ::= [ FRACtional | SYMMetric | ALIGned ] COORdinates AXIS atom-selection [atom-selection] [MASS] [COMP] [IMAGEs] COORdinates LSQP [ NORM ] [VERBose] [MASS] [COMP] [IMAGEs] [WEIGh] - [ MAJOr ] [ MINOr ] atom-selection COORdinates COVAriance traj-spec 2x(atom_selection) [UNIT_for_output int] - [RESIdue_average_nsets integer] COORDinates DMAT - [RESIdue_averaging] [NOE_weighting] [SINGle_coordinate_file] - [CUTOff ] [UNIT_for_output ] [TRAJectory] [CUTOff ] - [PROJect UPRJ ] [PROBability UPRB ] [TOLE ] - traj-spec 2x(atom_selection) [RMSF] COORdinates PUCKer [SEGId segid] RESId resid1 [TO resid2] [AS | CP] COORdinates HELIx atom-selection [atom-selection] COORdinate ANALysis {WATer} - {XREF YREF ZREF } - ! setup arbitrary analysis point {CROSs|SITE [MULTI] } - ! setup solute analysis site or ! cross terms for arbitrary solvent traj-spec - ! reading trajectories NCORs RSPIn RSPOut - ! 
MSD/IVAC set-up RSPHere DR MGN - ! g(r) setup RDSP - ! cutoff for DENS,KIRK and DBF DENS - ! userspecified bulk density ! (atoms/A**3) ! for normalization of g(r) {IMSD |IVAC } IDENs - ! output for MSD, VAC and DENsity {IGDISt [IHH ] [IOH ]|ISDISt } - ! g(r) requests {BYGRoup|BYREsidue|BYSEgment} ! discard distances WITHIN ! specified unit for g(r) IKIRkg - ! Kirkwood g-factor (dipole correlations) XBOX YBOX ZBOX - !PBC info for analysis IFDBF IFDT RCUT ZP0 NZP - ! DBF analysis IHIST IPDB [XMIN XMAX DX ] - !3D histogram [YMIN YMAX DY ] - [ZMIN ZMAX DZ ] - [WEIGht] [CHARge] [DIPOle] - [THREshold ] [NORM ] - EXVC MCP MCSH - ! EXcludedVolumeCorrection RPRObe ISEEd atom-selection:== (see *note select:(chmdoc/select.doc).) distance-spec::= { WEIGhting vector-spec atom-selection } { } { [UNIT int] [CUT real] [ENERGy [CLOSe]] 2X(atom-selection) - } { [Nonbonds] } { [NO14exclusions] } { [NOEXclusions] } - { NONOnbonds } { 14EXclusions } { EXCLusions } [TRIAngle] [ HISTogram HMIN real HMAX real HNUM integer - [HSAVe] [HPRInt] [HNORm real] [HDENsity real] ] vector-spec::= { [XDIR real] [YDIR real] [ZDIR real] } [DISTance real] [XCEN real] [YCEN real] [ZCEN real] [FACTor real] { AXIS } draw-spec::= [DFACt real] [NOMO] UNIT integer io-specification:== (see *note io:(chmdoc/io.doc).) traj-spec::= [FIRSt int] [NUNIts int] [NSKIp int] [BEGIn int] [STOP int]  File: Corman, Node: Simple, Up: Top, Previous: Syntax, Next: Function Descriptions of the simple coordinate manipulation commands All of these commands allow either the main coordinate set (default), or the comparison set (COMP keyword) to be modified. The other coordinate set is only changed by the SWAP command and the ORIEnt RMS command when the specified atoms are not centered about the origin. Each of these commands may also operate on a subset of the full atom space. The selection specification should be at the end of the command. The default atom selection includes all atoms. 
If the IMAGes keyword is specified, then the operation will be performed on the image atoms as well (if images are present). ------------------------------------------------------------------------------ 1) The INITialize command The INITialize command returns the coordinate values of the specified atoms to their start up values (9999.0). The main use of this command is in connection with the IC BUILD command, which may only find coordinates for atoms with the initial value. ------------------------------------------------------------------------------ 2) The COPY command The COPY command will copy the coordinate values into the specified set FROM the other coordinate set. ------------------------------------------------------------------------------ 3) The SWAP command The SWAP command will cause the coordinate values of the specified atoms to be swapped with the comparison set. ------------------------------------------------------------------------------ 4) the AVERage command The AVERage command will generate a new coordinate set at a point along the displacement vector between the present coordinate set and the other set. The FACTor value determines the relative step along this vector. Its default value is 0.5 (a true average). A FACTor value of 1.0 is equivalent to the copy command. Negative or greater than unit positive values are also allowed. ------------------------------------------------------------------------------ 5) The SCALe command The SCALe command will cause the coordinate values for all selected values to be scaled by a required scale factor. This option is designed to work with coordinate displacement vectors. A scale factor of zero will set the selected coordinate values to zero. This option may also be useful in plotting. ------------------------------------------------------------------------------ 6) The MASS_weighting command The MASS_weighting command will cause all selected coordinates to be scaled by the MASS of each atom. 
If the WEIGht option is specified, the weighting array will be scaled.
------------------------------------------------------------------------------
7) The ADD command

The ADD command will add the main and the comparison coordinate values and store the results in the selected coordinate set. As with other commands, only selected atoms will be modified. If an atom in either set is undefined, then the sum will also be undefined. This option is designed for use in cases where one or both coordinate sets contain coordinate displacement vectors.
------------------------------------------------------------------------------
8) The SET command

The SET command will set all coordinate values of selected atoms to a specified value determined by the vector specified. This is a simple way to zero a coordinate set, with the command;

      COOR SET XDIR 1.0 DIST 0.0

Note, the XDIR keyword value was included so that the vector has a nonzero norm (required for all vector specifications).
------------------------------------------------------------------------------
9) The TRANslate command

The TRANslate command will cause the coordinate values of the specified atoms to be translated. The translation step may be specified by either X, Y, and Z displacements, or by a distance along the specified vector. When no distance is specified, the XDIR, YDIR, and ZDIR values will be the step vector. If the AXIS keyword is used, then the translation will be along the axis defined by the previous COOR AXIS command. For this option, a distance may be specified, but if it isn't, then the translation distance will be the COOR AXIS vector length.
------------------------------------------------------------------------------
10) The ROTAte command

The ROTAte command will cause the specified atoms to be rotated about the specified axis vector through the specified center. The vector need not be normalized, but it must have a nonzero length.
If the AXIS keyword is used, then the axis and center information from the last COORdinates AXIS command will be used. The PHI value gives the amount of rotation about this axis in degrees. (NOTE: this command uses a left handed sense, not the right hand rule...) Only the atoms specified will be rotated. If the MATRix keyword is used the rotation will be made using an explicit rotation matrix, input in free format on the three following lines (3 real numbers per line):

      U(1,1) U(1,2) U(1,3)
      U(2,1) U(2,2) U(2,3)
      U(3,1) U(3,2) U(3,3)
------------------------------------------------------------------------------
11) The ORIEnt command

The ORIEnt command will modify the coordinate values of ALL of the atoms. The selected set of atoms is first centered about the origin, and then rotated to align either with the axes or with the other coordinate set. The RMS keyword will use the other coordinate set as a rotation reference. The MASS keyword causes a mass weighting to be done. This will align the specified atoms along their moments of inertia. When the RMS keyword is not used, the structure is rotated so that its principal geometric axis coincides with the X-axis and the next largest coincides with the Y-axis. This command is primarily used for preparing a structure for graphics and viewing. It can also be used for finding RMS differences, and in conjunction with the vibrational analysis. The NOROtation keyword will suppress rotations. In this case, only one coordinate set will be modified.
------------------------------------------------------------------------------
12) The RMS command

The RMS command will compute the RMS or mass weighted RMS coordinate differences between the selected set of atoms just as they lie. This differs from the COOR ORIENT RMS command in that no coordinate modifications are made and no translation is done.
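As an illustration of an RMS difference taken "just as the atoms lie", the following sketch compares two coordinate lists without any centering or rotation. It is only a sketch: the function name `coor_rms` is hypothetical, and the mass-weighted form used here, sqrt( sum(m*d**2) / sum(m) ), is an assumed definition rather than one taken from the CHARMM source.

```python
import math

def coor_rms(main, comp, masses=None):
    """RMS coordinate difference with no superposition (hypothetical helper).

    main, comp -- equal-length lists of (x, y, z) for the selected atoms
    masses     -- optional weights; unit weights are used when omitted
    """
    if masses is None:
        masses = [1.0] * len(main)
    weighted_ssq = sum(
        m * sum((a - b) ** 2 for a, b in zip(p, q))
        for m, p, q in zip(masses, main, comp)
    )
    return math.sqrt(weighted_ssq / sum(masses))

main = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
comp = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(coor_rms(main, comp))              # unit weights
print(coor_rms(main, comp, [1.0, 3.0]))  # heavier second atom dominates
```

Because nothing is translated or rotated, a rigid shift of one set shows up directly in the result, which is the practical difference from an RMS value obtained after a best-fit orientation.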
------------------------------------------------------------------------------ 13) The DIFF command The DIFF command will compute the differences between the main and comparison set (or the reverse) and store this difference in the modified coordinate set. Undefined or unselected atoms result in a zero. If the WEIGht keyword is invoked, then the WCOMP array is subtracted from WMAIN and the coordinates are untouched. ------------------------------------------------------------------------------ 14) The FORCe command The FORCe command will copy the current forces (DX,DY,DZ) of the selected atoms to the specified coordinate set. Atoms not selected are given a value of zero. If the MASS keyword is specified, then the forces will be divided by the mass. This would correspond to an acceleration in dynamics. ------------------------------------------------------------------------------ 15) The SHAKe command This command will SHAKE the selected coordinate set with respect to the other (as a reference). A mass weighting may be used. Any atoms that are not selected are considered to be fixed (infinite mass). In order to use this command, the SHAKe command must first be invoked which sets up the shake constraints. ------------------------------------------------------------------------------ 16) The DIPOle command Calculates the dipole moment of selected atoms. If total charge is not zero, the dipole moment is somewhat ill-defined and coordinate system dependent; in this case the center of geometry of the selected atoms is used as origin for the coordinate system in which the dipole moment is calculated. Prints out dipole moment cartesian components and magnitude (in Debyes) and the total charge. CHARMM variables ?CHARGE, ?XDIP, ?YDIP, ?ZDIP, and ?MDIP (charge, x,y,z and magnitude of dipole) are set.  
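The origin dependence mentioned for charged selections can be seen in a short sketch. The helper below is hypothetical (`dipole` is not a CHARMM routine); it always measures positions from the center of geometry, which changes nothing for a neutral selection and matches the convention described above for a charged one. The factor 4.8032 is the usual rounded conversion from e*Angstrom to Debye, and the dictionary keys simply mirror the substitution variable names.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.8032  # 1 e*Angstrom expressed in Debye (rounded)

def dipole(coords, charges):
    """Dipole moment of a selection, relative to its center of geometry."""
    n = len(coords)
    origin = [sum(p[k] for p in coords) / n for k in range(3)]
    mu = [
        E_ANGSTROM_TO_DEBYE * sum(q * (p[k] - origin[k]) for p, q in zip(coords, charges))
        for k in range(3)
    ]
    return {
        "CHARGE": sum(charges),
        "XDIP": mu[0],
        "YDIP": mu[1],
        "ZDIP": mu[2],
        "MDIP": math.sqrt(sum(c * c for c in mu)),
    }

# Neutral pair: the result does not depend on the origin.
print(dipole([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [-0.5, 0.5]))
# Charged selection: well defined only because the origin is fixed
# at the center of geometry.
print(dipole([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [0.0, 1.0]))
```

Fixing the origin at the center of geometry makes the value for a charged selection invariant under translation of the whole selection, even though it remains a convention rather than a physical invariant.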
File: Corman, Node: Function, Up: Top, Previous: Simple, Next: Substitutions

Descriptions of the remaining corman commands

See the descriptions of the simple commands for some background information on these commands.
------------------------------------------------------------------------------
1) The DISTance command

The COOR DIST command will either find distances between atoms or the distances of atoms from a fixed point in space (WEIGh option). This command can find distances within a single coordinate set, or find distances between atoms in two coordinate sets (DIFF option).

The DISTance command can find all atom distances between two atom selections. A unit number may be specified (default=6) and a cutoff distance may be included as well (default=8999.0). If no selection is specified, all atoms will be included! The delimiter ENDselection must separate the two sets of atom selections. The van der Waals energy may be requested with the "ENERgy" keyword, and if this option is used, the list of pairs with a positive van der Waals energy may be selected with the "CLOSe" keyword (i.e. only close contacts will be listed). The NEAR option will list the nearest atom in the second atom selection to the atoms in the first selection.

The COOR DISTance command does not give distances between excluded atoms unless the "EXCLusions" keyword is specified. This makes it much easier to search for bad contacts. Likewise, 1-4 interactions and other interactions may be requested or omitted. The command;

      COOR DISTance ENERgy CLOSe CUT 5.0 SELE ALL END SELE ALL END -
           14EXclusions NONBonds

will list all atom pairs that have a positive van der Waals energy. The command;

      COOR DISTance ENERGY CUT 5.0 NONONbonds NOEXclusions 14EXCLusions -
           SELE ALL END SELE ALL END

will list all 1-4 interactions and energies (and nothing else).
The command;

      COOR DISTance ENERgy CUT 4.5 SELE RESID 23 END SELE ALL END

will list all contacts less than 4.5A that residue 23 has with the rest of the system without considering 1-4 interactions or excluded pairs. The 1-4 vdw terms, E14FAC, and EPS values other than 1.0 are recognized.

The WEIGht option puts the distance of each selected atom from some specified point into the weighting array. If no point is specified, then the origin is used. This is most useful in computing magnitudes of forces or coordinate differences. For example, the sequence;

      ENERGY ...
      COOR FORCE COMP       ! copy forces to the comparison coordinates
      COOR DIST WEIGH COMP  ! put magnitudes in the weighting array.
      PRINT COOR COMP SELE PROP WCOMP .GT. 5.0 END  ! print atoms with large forces.

Note that all operations were done on the comparison set.

The DIFF keyword causes the selection to work on different coordinate sets, where the first selection corresponds to the set specified (MAIN or COMP), and the second atom selection uses the other coordinate set.

The HISTogram option allows a histogram of distances to be produced. With the histogram, the HMIN and HMAX (the range of the histogram in angstroms) and the HNUM (the number of bins) must be specified. The HSAVe keyword causes the histogram values to be saved for subsequent COOR DIST commands. In a loop, this allows g(r) to be calculated from a dynamics trajectory. The HPRInt option will cause the final histogram values to be printed. The HNORm value will be used to normalize the histogram before printing (divide by HNORm). A density value, HDENS, is also required, which is the number of selected objects divided by the volume per object. Also note: in order to get this to work with the crystal facility, the first atom selection (in the loop) should only include primary atoms, and the second atom selection should include both primary and image atoms.
The histogram will also be scaled by the reciprocal of the distance squared (to get normalized g(r) plots). Three columns of numbers are output; (1) the bin midpoint distance, (2) the normalized g(r), and (3) the total number of pairs within the bin divided by the HNORM value. A PRNLEV less than 5 will suppress the listing of distance pairs.

Example of use to get a distance distribution plot:

      update imgfrq 20 cutim 20.0
      traj ....
      prnlev 4
      set 1 1
      label loop
      traj read
      update inbf 0 IMALL cutim 10.5
      coor dist image sele segid main .and. type OH2 end sele type OH2 end -
           cut 10.5 HIST HMIN 0.0 HMAX 10.0 HNUM 50 HSAVE
      incr 1 by 1
      if 1 .lt. 1000.5 goto loop
      calc dens = 216.0/30.0    ! #waters/(volume/water)
      coor dist sele none end sele none end -
           cut 10.5 HIST HMIN 0.0 HMAX 10.0 HNUM 50 HNORM 1000.0 -
           HPRINT HDENS @dens
------------------------------------------------------------------------------
2) The RGYR command

The RGYR command can compute the Radius of GYRation, center-of-mass and total mass of the specified atoms. By default, RGYR uses a unit weighting factor, providing the rms distance from the center of geometry. The current keywords are:

      MASS   use mass weighting (otherwise use unit weight per selected atom)
      WEIG   use a weight array (WMAIN or WCOMP) for the weighting
      FACT   constant to be subtracted from each weight

The weight arrays can be filled, by using COOR or SCALAR commands, before invoking the RGYR routine. In this way almost any RGYR can be computed.
------------------------------------------------------------------------------
3) The LSQP command

The LSQP command computes the least-squares-plane through the selected atoms. Weighting can be done by the atom masses [MASS], by the weighting array [WEIG], or not at all (default). Output is the equation for the plane, the sum-of-squared distances (weighted) from the plane (SSQ), and the center-of-mass of the selected atoms.
The keyword VERBose causes some additional output, most useful of which is the distance from the plane for each atom. The options NORM, MAJOr, and MINOr select which vector is stored as the AXIS (see the COOR AXIS command for more details). The default is to not set the AXIS variables.
------------------------------------------------------------------------------
4) The OPERate command.

The OPERate command processes the selected coordinates through the image transformation specified by name. This command may only be used if an image file has been read. The image_name is one of the image transformation names (WRITE IMAGE TRANS). This is also the SEGID of the image atoms created by the image update procedure.
------------------------------------------------------------------------------
5) The MINDistance command.

The MINDistance command computes the minimum distance between selected coordinates. Usually this command is executed with a double selection. Note that the default distance-spec excludes bonded atoms and 1-4 interactions. If only one selection is given, then it will give the minimum distance of the selected coordinates between the MAIN and COMPARISON set.
------------------------------------------------------------------------------
6) The STATistics command

The STATistics command will print some simple statistics regarding the selected atoms. The values XMIN,XMAX,XAVE,YMIN,YMAX,YAVE,ZMIN,ZMAX,ZAVE,WMIN,WMAX,WAVE are set when this command is executed. These variable values may then be used in subsequent commands with the "?" symbol. For example, the following command sequence may be used to shift a structure so that a single atom is in the X-Y plane (i.e. shift in the z-direction);

      COOR STATistics SELE desired-atom END
      COOR TRANS ZDIR ?ZAVE FACT -1.0

The MASS option will place the average values at the center of mass.
------------------------------------------------------------------------------
7) The AXIS command.
The AXIS command generates a vector and saves it for subsequent use, either in command parsing or as input to the COOR SET, COOR ROTAte, COOR TRANslate, or COOR DISTance WEIGhting commands by using the AXIS keyword. There are two modes for the AXIS command. With a single atom selection, the stored vector is defined from the origin to the center of geometry/mass of all selected atoms. With two atom selections, the vector spans from the center of the first set of selected atoms to the center of the second. The MASS keyword invokes the usage of the center of mass. The AXIS command sets the variables XAXIs, YAXIs, ZAXIs, RAXIs, XCEN, YCEN, and ZCEN, which may be accessed with the "?" symbol. These values define the actual vector, the length of the vector, and the center of the vector (midpoint). For example, to use the distance between two atoms as a criterion for terminating a run, the following command sequence could be used;

      SET 1 10.0
      COOR AXIS SELE first-atom END SELE second-atom END
      IF 1 GT ?RAXIs STOP

For another example, to rotate the chi-1 torsion of a specified residue by 30 degrees, the following command sequence would be appropriate;

      DEFINE BACK SELE TYPE O .OR. TYPE N .OR. TYPE H .OR. TYPE CA .OR. TYPE C END
      COOR AXIS SELE ATOM MAIN 23 CA END SELE ATOM MAIN 23 CB END
      COOR ROTATE AXIS PHI 30.0 SELE RESID 23 .AND. .NOT. BACK END
------------------------------------------------------------------------------
8) The DUPLicate command.

The DUPLicate command copies coordinates between atoms within a structure. The coordinates are copied FROM the first selection TO the second selection. If the selections overlap, watch out! The matching is done by number within the selected coordinate sets. If the two selections have a different number of atoms, a warning will be issued, and the smaller number will be used.
For example, if one needs to compute the relative orientation between two alpha helices, the following input might be used;

      COOR COPY COMP
      COOR DUPL COMP SELE backbone of first END SELE backbone of second END
      COOR ORIE RMS MASS COMP SELE backbone of second END

This will give the RMS shift between these helices as well as the coordinate transformation required to map one into the other.

The PREVious option may be used with a single atom selection. This assigns the coordinate position of selected atoms to the value of the previous atom (by number). This has been used with the command;

      COOR DUPLicate PREVious SELE TYPE H* END

to assign hydrogen atom positions to that of the associated heavy atom.

The COMP keyword causes only the comparison coordinates to be used and modified. Otherwise, the entire operation involves only the main coordinates.
------------------------------------------------------------------------------
9) The DYNAmics command

The COOR DYNAmics command will read a (set of) dynamics trajectory files and compute the average coordinates (stored in the selected coordinate set) and the isotropic rms fluctuations (stored in the weighting array). The first unit number (FIRSt) (default 51), number of units (NUNIts) (default 1), frequency of accepted coordinate sets (NSKIp) (default 1), starting set (BEGIn) (default first set), and last set (STOP) (default last set) may be specified. Option values are not remembered with subsequent COOR DYNA commands. The NOPRint keyword suppresses much of the output. If the keyword ORIENT is present, all coordinate frames will be RMS re-oriented with respect to the COMParison set (which must be defined); if the word MASS is also there, the coordinates will be mass-weighted for re-orientation; if a second atom selection is provided, only those selected atoms will be used.

The PAX command causes the Principal AXis of the motion of each atom to be computed and saved.
The printout gives the direction and magnitude of the fluctuation
as well as the anisotropies. The PAX data is saved for a subsequent
COOR PAXAnal command if further analysis is desired.

------------------------------------------------------------------------------

10) The PAXAnal command

        The COOR PAXAnal command computes additional data regarding the
Principal AXis data (computed by the most recent COOR DYNA PAX command).
The trajectory must be reopened and reread, or a different trajectory may
be substituted. This command prints data for each selected atom and averages
over the selected atoms. The printout includes the skew and kurtosis,
anisotropies, as well as all of the low moments of the motion.
        The SAVE option causes the PAX data structure (from the COOR DYNA PAX
command) to be saved (for subsequent COOR PAXA commands).

------------------------------------------------------------------------------

11) The SEARch command

COORdinates SEARch  { search-spec               } disposition-spec
                    { INVErt                    }
                    { KEEP xvalue yvalue zvalue }
                    { EXTEnd RBUFf real         }

search-spec :: [atom-selection] [COMP] [IMAGe] [operation-spec]
               [XMIN real] [XMAX real] [XGRId integer]
               [YMIN real] [YMAX real] [YGRId integer]
               [ZMIN real] [ZMAX real] [ZGRId integer]

operation-spec ::=  { [VACUum] }  { [RESEt] }  { [RCUT real]  }
                    { FILLed   }  { AND     }  { [RBUFf real] }
                    { HOLES    }  { OR      }
                                  { XOR     }
                                  { ADD     }

disposition-spec::= { [NOPRint]        }  [NOSAve]  [CREAte segid CHEM type]
                    { PRINt [UNIT int] }  [ SAVE ]

        The SEARch command generates and/or manipulates a grid of small
volume elements.

        The SEARch command will search through a set of grid points
for vacuum space points (i.e. points outside the van der Waals radius of
any atom). In the default mode (NOPRint), only the relative volume of
filled and vacuum points are printed concerning the selected atoms.
The grid specifiers must be input (min, max, and grid) for each dimension.
(grid implies number of grid points.
Hence XMIN -10.0 XMAX 10.0 XGRID 41 implies a half Angstrom sampling along
the x direction.)
        The FILLed option will cause non-vacuum points to be listed or
plotted. The PRINt option will cause all found grid points to be listed on
the output unit specified (default 6).
        For this command, the atom sizes (radii) are taken from the weighting
array. To get van der Waals radii into the weighting array, the command

        SCALar WMAIn = RADIus

may be used. If a hole big enough to stuff a water into is to be found,
then the command sequence

        SCALar WMAIn = RADIus
        SCALAR WMAIN ADD 1.6
        SCALAR WMAIN MULT 0.85

would probably be the best to use.

        If the RCUT or RBUFf value is set to a nonzero value, then the
accessible volume command is enabled. When RCUT is set, this is the maximum
radius. When RBUFf is set, then the maximum radius is the weighting array
plus the RBUFf value. The weighting array is returned with the fraction of
free volume in the shell from the atom radius to the maximum radius.

        If the HOLEs keyword is set, only the grid points not connected
to the first point (point in the negative corner of the box) are considered.
In this way, the volume of just the holes can be analyzed and saved.

        The "ADD" option for the COOR SEARCH command has been added to allow
the calculation of partial occupancy factors. This allows holes in proteins
to be analyzed for flexibility and variability.

        It is possible to use multiple COOR SEARch commands and to use
boolean operations to combine the results. For example, the script sequence:

        COORdinates SEARch IMAGe -
             XMIN -10.0 XMAX 10.0 XGRId 20 -
             YMIN -10.0 YMAX 10.0 YGRId 20 -
             ZMIN -10.0 ZMAX 10.0 ZGRId 20 -
             NOPRINT VACUUM SAVE
        ....
        SCALAR WMAIN ...
        ....
        COORdinates SEARch IMAGe -
             XMIN -10.0 XMAX 10.0 XGRId 20 -
             YMIN -10.0 YMAX 10.0 YGRId 20 -
             ZMIN -10.0 ZMAX 10.0 ZGRId 20 -
             AND PRINT UNIT 22 RBUFF 2.0 FILLED NOSAVE

Note, the results of these two commands are computed and the intersection
(AND) is printed.
The first command needs a "SAVE" in order for the results to be saved.
Also, the grids (if specified) must exactly match (same number of grid
points in all dimensions) for this operation to work.

        The COOR SEARch command allocates space, if needed, and frees
the space when the NOSAve option is used. Thus, if four COOR SEARch
commands are needed for a single computation, the first must have the
SAVE option. The only way to free the space allocated by the
COOR SEARch SAVE command is to run another COOR SEARch command with
the NOSAve option.

        If the CREAte option is used then the specified grid points will be
added to the PSF as dummy atoms. The chemical type of the dummy atom must be
specified and it must be present in the current RTF. This option can be used
for graphics or for other hole analysis (shape,...). This option will add
one segment to the PSF, one residue, and atoms and groups equal to the number
of selected grid points.

------------------------------------------------------------------------------

12) The VOLUme command

        The VOLUme command will compute the volume of a selected set
of atoms. Its operation is the same as that of the SEARch command,
except that only the volume is printed and the degree of exposure for each
atom is returned in the weighting array.
        The SCALAR storage arrays must be filled before using this command.
The first storage array [1] must contain the radii of each atom (RMIN) and
the second storage array must contain the outer probe distance (RMAX) for
each atom. The free volume within the RMIN to RMAX range and not within RMIN
of any other atom will be returned in the weighting array as a ratio of the
maximum possible value. For example, a completely exposed atom will return a
value of 1.0 and an atom in the interior of a protein would return a value
of 0.0.
        The HOLEs keyword feature causes holes within the selected atoms to
be filled before computing the total volume and the accessible volume.
        SPACE is a maximum number of cubic pixels, i.e.

        SPACE = x_points * y_points * z_points

A larger SPACE value results in a more accurate calculation but takes more
memory and computer time. The number of points in the x, y and z directions
is determined according to the formula:

        factor = ( SPACE / (a*b*c) ) ** (1/3)

        x_points = factor*a
        y_points = factor*b
        z_points = factor*c

where a, b and c are the dimensions of the smallest rectangular box enclosing
the molecule.

------------------------------------------------------------------------------

13) The SURFace command

        The COOR SURFace command computes the Lee and Richards surface
for selected atoms and stores the result in the appropriate weighting
array. If the "WEIGhting" keyword is used, the radii are obtained from
the weighting array (and then written over), otherwise the radii are
obtained from the parameter file values. The radius of the probe may be
specified (default 1.6) and the accuracy may be specified (default 0.05).
Either ACCEssible surface (default) or CONTact surface may be specified.
Contact surface is equivalent to Accessible surface if a zero probe radius
is used.
        If the accuracy is not specified (or set to zero), then the analytic
result is provided. If a nonzero accuracy is provided, then the original
Lee and Richards (points on a sphere) algorithm is used.

------------------------------------------------------------------------------

14) The HELIX command

        The COOR HELIx command will analyze a single helix, or the
relative orientation of two helices. To use this command, one or two
atom selections should be provided selecting ONLY the atoms which will be
used to define the helix. The order of these atoms is important.
        With a single atom selection, this command calculates the normalized
axis (A) and the perpendicular vector (R0) from the origin to A of the
cylinder most closely approximating a helix on which the selected atoms best
fit (Algorithm by J. Aqvist, Computers & Chemistry Vol. 10, pp 97-99 (1986)).
        With a double atom selection, this command also computes helix axis
and helix-helix structure analysis (Algorithm by Chothia, Levitt, and
Richardson, JMB 145, pp 215-250 (1981)).

------------------------------------------------------------------------------

15) The CONVert command

        The COOR CONVert command will cause the coordinates of all defined
and selected atoms to be transformed from the unit cell to cartesian
coordinates or back from cartesian to fractional coordinates.
Two orientations in cartesian coordinates are supported:

        ALIGned   - in which the b-vector is along the y-axis and the
                    a-vector is in the xy-plane (this is the old charmm
                    standard)
        SYMMetric - in which the shape matrix constructed from the unit cell
                    vectors is symmetric

Two keywords in any order [FRAC|ALIG|SYMM] are required after CONVert.
Unit cell parameters (a,b,c,alpha,beta,gamma) follow on the same line.
The angle values are specified in degrees.
See the routine CONCOR for details concerning the transformation.

        As an example, the following manipulations should have no net effect
on the coordinates:

        COOR COPY COMP
        COOR CONVERT SYMMETRIC  FRACTIONAL 5.6 12.2 5.4 80.0 95. 100.
        COOR CONVERT FRACTIONAL SYMMETRIC  5.6 12.2 5.4 80.0 95. 100.
        COOR CONVERT SYMMETRIC  ALIGNED    5.6 12.2 5.4 80.0 95. 100.
        COOR CONVERT ALIGNED    FRACTIONAL 5.6 12.2 5.4 80.0 95. 100.
        COOR CONVERT FRACTIONAL ALIGNED    5.6 12.2 5.4 80.0 95. 100.
        COOR CONVERT ALIGNED    SYMMETRIC  5.6 12.2 5.4 80.0 95. 100.
        COOR DIFF
        COOR STAT

        When working with a triclinic system, the user should be aware
of the form of the coordinates. Most of the data from crystallography
is in fractional (coordinates between zero and one) or in the aligned frame.

        NOTE: All of the internal use in CHARMM for energy calls,
minimization, or dynamics ASSUMES that the coordinates are in the symmetric
frame.
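
        The geometry of the ALIGned frame can be illustrated outside CHARMM.
The Python sketch below is not the CONCOR routine; it only builds cell
vectors with b along the y-axis and a in the xy-plane, as described above,
and converts between fractional and aligned cartesian coordinates. The
function names, the choice of a right-handed cell (a has a positive
x-component), and the use of plain tuples are assumptions made for this
illustration.

```python
import math

def aligned_cell_vectors(a, b, c, alpha, beta, gamma):
    """Return the a, b, c cell vectors in an ALIGned-style frame.

    Assumed convention (illustration only): b along +y, a in the xy-plane
    with a positive x-component, angles given in degrees.
    """
    al, be, ga = (math.radians(v) for v in (alpha, beta, gamma))
    ax, ay = a * math.sin(ga), a * math.cos(ga)      # a.b = a*b*cos(gamma)
    cy = c * math.cos(al)                            # c.b = b*c*cos(alpha)
    cx = c * (math.cos(be) - math.cos(al) * math.cos(ga)) / math.sin(ga)
    cz = math.sqrt(c * c - cx * cx - cy * cy)        # remaining length of c
    return [(ax, ay, 0.0), (0.0, b, 0.0), (cx, cy, cz)]

def frac_to_cart(frac, cell):
    """Cartesian position = f_a * a_vec + f_b * b_vec + f_c * c_vec."""
    return tuple(sum(frac[k] * cell[k][i] for k in range(3)) for i in range(3))

def cart_to_frac(xyz, cell):
    """Invert frac_to_cart by back substitution (valid for this frame only)."""
    (ax, ay, _), (_, by, _), (cx, cy, cz) = cell
    fc = xyz[2] / cz
    fa = (xyz[0] - fc * cx) / ax
    fb = (xyz[1] - fa * ay - fc * cy) / by
    return (fa, fb, fc)

# Round trip with the cell parameters used in the example above.
cell = aligned_cell_vectors(5.6, 12.2, 5.4, 80.0, 95.0, 100.0)
frac = (0.25, 0.5, 0.75)
back = cart_to_frac(frac_to_cart(frac, cell), cell)
```

As with the COOR CONVERT sequence above, converting to cartesian and back
should return the starting fractional coordinates to within rounding error.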
------------------------------------------------------------------------------

16) The COVAriance command

        The covariance command under coordinate manipulations computes
covariances of the spatial atom displacements of a dynamics trajectory for
selected pairs of atoms:

        mu_JK = E[ (R_J - E[R_J]) (R_K - E[R_K]) ]
              = E[R_J R_K] - E[R_J] E[R_K]

and the normalized covariance matrix is given by

        CO_JK = mu_JK / SQRT(mu_JJ mu_KK)

The command syntax and variables are as in the coor dynamics command.
The exceptions are the keywords:

        SET1:            specifies the selection for the "J" groups in
                         the covariance
        SET2:            specifies the selection for the "K" groups in
                         the covariance
        UNIT_for_output: specifies unit for output of covariance matrix
                         (ascii)
        RESIdue_average: is a logical for computing the average over
                         residues in SET2 specification. When followed by
                         NSETS: equal to 2 the average is over both SET1
                         and SET2 giving a NRES1 x NRES2 covariance matrix.

------------------------------------------------------------------------------

17) The DMAT command

        This command is accessed with the command COOR DMAT and provides
some general tools for the calculation, manipulation and storage/extraction
of distance matrix based properties. This routine has some overlap with the
new distance command introduced by Bernie Brooks but also provides
significant complementarity in extending the range of properties computed.
The entire syntax is:

[SYNTAX]

COORdinates DMAT -
        RESIdue_average NOE_weighting -
        SINGle -
        FIRSt_unit NUNIt BEGIn SKIP -
        STOP 2x -
        UNIT_for_output TRAJectory CUTOff -
        PROJect UPRJ PROBability UPRB TOLE [RMSF]

The command structure is like that of most other coordinate manipulation
commands; other sub-parser keywords are:

        UNIT    the distance matrix will be written to the unit number
                specified as an ASCII file unless the TRAJ keyword is
                specified, in which case a binary "trajectory" of the
                distance matrix will be written.
        RESIdue this keyword specifies to compute the distance matrix for a
                center of geometry weighted average of residues
        NOE     this keyword denotes that the averaging over distances in
                the distance matrix should be inverse sixth power weighted.
        TRAJ    write a dynamic trajectory file of the distance matrix
        SINGle  process only a single coordinate file
        CUTOff  print only those values of the distance matrix which are
                smaller than the cutoff value
        PROJect project out a subset of contacts for printing
        UPRJ    read projection matrix from unit UPRJ
        PROB    compute the contact probability based on differences from
                the reference contact map read from UPRB and with an
                upper bound tolerance of TOLE
        RMSF    computes the root mean square fluctuation in the distance
                matrix from the trajectory. Disables the printing of the
                binary file.

Note: The binary file produced is analogous to the binary trajectory files
and contains the following information:

        WRITE(UNIT) HDRD,ICNTRL
        CALL WRTITL(TITLEA,NTITLA,UNIT,-1)
        WRITE(UNIT) NSET1,NSET2
        WRITE(UNIT) (IND1(I1),I1=1,NSET1)
        WRITE(UNIT) (IND2(I2),I2=1,NSET2)

and then nframes of

        WRITE(UNIT) ((CO(I1,I2),I1=1,NRES1),I2=1,NRES2)

Where ICNTRL is a 20 element integer array with the following data:

        ENDDO
        ICNTRL(1) = (STOP - BEGIN)/SKIP
        ICNTRL(2) = BEGIN
        ICNTRL(3) = SKIP
        ICNTRL(4) = STOP - BEGIN
        ICNTRL(5) = NSAV
        ICNTRL(8) = NDEGF
        ICNTRL(9) = NATOM - NFREAT
        CALL ASS4(ICNTRL(10),SKIP*DELTA)
        IF(LNOE) THEN
           ICNTRL(11) = 1
        ELSE
           ICNTRL(11) = 0
        ENDIF
        IF(LRESI) THEN
           ICNTRL(12) = 1
        ELSE
           ICNTRL(12) = 0
        ENDIF

and NSET1[2] are the number of atoms comprising the two selections and
IND1[2](NSET1[2]). The distance matrix CO(NRES1,NRES2) is a 2-D array of
size either NSET1 x NSET2 or NRES(NSET1) x NRES(NSET2) depending on whether
the residue flag was used in processing the commands.

Examples of usage:
------------------

1. Compute the distance matrix for a single coordinate file (resident in the
   main coordinate set) and print this matrix to a file linked to fortran
   unit 1.
        open unit 1 write form name total.dmat
        COOR DMAT SINGLE UNIT 1 SELE ALL END SELE ALL END

2. Compute the side chain-side chain center of geometry distance map from a
   single coordinate file and print the distance matrix to unit 1, zeroing
   all elements of the matrix with distances greater than 6.5 angstroms.

        define bb select ( type ca .or. type n .or. type c .or. type o ) end
        define side select ( (.not. bb) .and. (.not. hydrogen) ) end
        open unit 1 write form name side.dmat
        coor dmat residue_average single unit 1 cutoff 6.5 select side end -
             select side end

3. Compute the average hydrogen atom-hydrogen atom distance map from a
   trajectory file on unit 10 and print the average distance matrix to
   unit 1. Use NOE inverse-sixth power weighting in the averaging and
   "filter out" all distances in the final map with values greater than
   6.0 angstroms.

        open unit 10 read unform name trajectory.crd
        open unit 1 write form name noe.dmat
        coor dmat unit 1 cutoff 6.0 noe_weighting select hydrogen end -
             select hydrogen end -
             first_unit 10 nunit 1 begin 100 skip 100 stop 10000

4. Compute the center-of-geometry distance matrix for side chains and write
   this as a binary "trajectory" file to unit 1. Read the trajectory from
   unit 10.

        open unit 10 read unform name trajectory.crd
        open unit 1 write unform name side.dm-trj
        define bb select ( type ca .or. type n .or. type c .or. type o ) end
        define side select ( (.not. bb) .and. (.not. hydrogen) ) end
        coor dmat residue_average unit 1 traj select side end select side end -
             first_unit 10 nunit 1 begin 100 skip 100 stop 10000

5. Compute the center-of-geometry contact map probability based on a
   precomputed distance matrix (e.g. from a PDB structure) based on a 6.5 A
   cutoff. (This example is for the interdomain (helix-helix) contacts in
   GCN4. The two helices are segids zipa and zipb.)

        ! First contacts
        open unit 1 read unform name "traj/crdp/2zta/2zta_d1-60p.crd"
        ! trajectory file to use to compute probability from
        open unit 2 write form name "distance_matrix/2zta_d1-60p.dmatp"
        ! file to write contact probability matrix to
        open unit 3 read form name "distance_matrix/2zta_full.dmat"
        ! reference contact map
        coordinates dmat residue unit 2 -
             first 1 nunit 1 begin 100 skip 100 stop 600000 -
             select side .and. ( segid zipa ) end -
             select side .and. ( segid zipb ) end -
             probability uprb 3 tole 0.3 cutoff 6.5
        close unit 1
        close unit 2
        close unit 3

6. The following example shows the use of the dmat command to count the
   number of contacts (native and non-native) throughout the course of a
   trajectory using the distance matrix projection operator and the fact
   that the number of contacts is accessible through the ?ncontact variable.

        label dotraj
        ! Now we loop over the trajectory and compute time dependent properties
        open unit 1 read unform name "traj/crdp/2zta/2zta_d1-60p.crd"
        open unit 10 write form name "distance_matrix/2zta_d1-60p.traj"
        write title unit 10
        *# Properties for Contacts
        *# trajectory 2zta_d1-60p.
        *# time(ps) C(native) C(total)
        *
        traj iread 1 nread 1 begin 500 skip 500 stop 600000
        set time 1.0
        set frame 1
        label loop
        trajectory read
        ! First get the contact information
        open unit 3 read form name "distance_matrix/2zta_full.dmatp"
        ! reference distance matrix to use for projection
        open unit 2 write form name "distance_matrix/temp.dmat"
        ! junk distance matrix
        coor dmat single residue unit 2 cutoff 6.5 -
             select ( side .and. segid zipa ) end -
             select ( side .and. segid zipb ) end -
             proj uprj 3
        set cnat ?ncontact
        open unit 2 write form name "distance_matrix/temp.dmat"
        coor dmat single residue unit 2 cutoff 6.5 -
             select ( side .and. segid zipa ) end -
             select ( side .and. segid zipb ) end
        set ctot ?ncontact
        ! Write information to file
        write title unit 10
        * @time @cnat @ctot
        incr time by 1.0
        incr frame by 1
        if frame lt 1200 goto loop

------------------------------------------------------------------------------

18) The ANALysis command

        A "new" analysis module for computing solvent averaged properties
has been added to CHARMM. It is accessed from the coordinate manipulation
part (CORMAN) of CHARMM and is used with the following syntax. This piece
of documentation is still under development. CLBIII 1/1/1990

NOTE: Keyword syntax changed after c25a2!! Unit numbers for output to file
have to be specified, and the trajectory is now specified in the usual way
with BEGIN,SKIP,STOP. LNI 11/11/96

Keywords:

(SOLVent: specifies analysis is to be of pure solvent, which means xref,
          yref and zref, or site keywords are inappropriate, i.e., analyze
          all configurations of solvent using all solvent molecules.
          OBSOLETE)

WATEr:    specifies the solvent is water (actually any three-site molecule),
          and forces all distinct g(r)'s to be computed, i.e., g_oo, g_oh
          and g_hh. The first atom selection specifies the solvent
          atoms/molecules to be analyzed.

(SPECies: specifies the solvent species. If SOLVent is active then all
          solvent molecules to be analyzed should be specified here, e.g.,
          all of them present in the simulations. This keyword is followed
          by the standard selection syntax and is terminated with the
          FINIsh_solvent_specification keyword. OBSOLETE)

SITE:     specifies the collection of atoms around which you would like to
          compute solvent properties, e.g., if you would like to analyze
          the solvent distribution and velocity correlation function around
          the center of geometry of a trp residue this keyword would be
          followed by the selection syntax which selects that residue.

XREF, YREF, ZREF:
          specifies that solvent analysis around a specific spatial
          position, (xref, yref, zref), is to be carried out.
          This is the same as the site keyword, as far as the analysis of
          solvent configurations it invokes; however, this site is static
          whereas the SITE keyword permits selection of a dynamically
          evolving site.

The above dimensions are taken from trajectory stored information for
crystal runs (w/ charmm22 or later).

CROSs:    allows the selection of two subsets of atoms for g(r) analysis
          (a&b: 'a' are the atoms specified by the first selection and 'b'
          are the atoms specified by the second selection). The g(r) for
          a-vs-b and b-vs-b are calculated and returned in units IOH and
          IHH respectively. g(r) for a-vs-a will be returned in unit IGDIst.
          Note that CROSs does not exclude from the analysis the pairs of
          atoms belonging to the same segid since it is designed for the
          analysis of independent subsets of solvent molecules.

NOTE: The keyword CROSs cannot be selected with the following options:
      WATer, SITE, IKIRkg, ISDIst, IFDBf.
      IVAC, IMSD, IFMIn were not tested with CROSs.

IVAC cannot be combined with any analysis requiring coordinates.
IGDIST and ISDIST are mutually exclusive flags.

NCORs   = number of steps to compute vac or msd
RSPIn   = inner radius for vac, msd analysis around REF (or SITE)
RSPOu   = outer radius for vac, msd analysis around REF (or SITE)
RDSP    = radius of dynamics sphere, used for densities, kirkwood and dbf
DENS    = density (atoms/A**3) to use in normalization of g(r) if the value
          as calculated from the density within RDSP is not satisfactory
DR      = grid spacing for analysis of rdf's
RSPHere = radius around REF to use for rdf analysis
MGN     = number of points in g(r) curve
RCUT    = radius of interaction sphere in dbf calculation
ZP0     = initial reference site - dynamics sphere origin separation
NZP     = number of separations to compute dbf
TYP     = for DBF calc 1=oxygen, 1=hydrogen
IHIS    = unit for output of 3D histogram data
   or
IPDB    = unit for output of "atoms" where density exceeds THREshold
  with options:
    WEIG            use WMAIN to weight points          !! Not tested
    DIPO            accumulate dipole vector density    !! NOT working yet (June 98)
    CHARge          accumulate charge density           !! Not tested
                    default is to just accumulate number density of sel. atoms
    NORM value      densities are divided by this value (and by number of
                    frames) (default 1)
    XMIN,XMAX,DX
    YMIN,YMAX,DY    grid dimension & spacing (default +/- 20A, 0.5A spacing)
    ZMIN,ZMAX,DZ
    THREshold value for density to output atoms in PDB file format

    The atoms indicated by the solvent selection are analyzed. If dipole
    data is to be analyzed the selection should contain 1 atom/group; the
    groups define what atoms are to be used for the dipole calculation.
    This could be automated; also need minimum image combined with
    orienting function.

EXVC    EXcludedVolumeCorrection for use with ISDIST: the solute-solvent
        g(r) is corrected for the volume excluded around the solute (i.e.
        the SITE) by the atoms in the selection following EXVC. This
        correction is computed using a Monte Carlo procedure with
        parameters:
    MCP int       Total number of points to use in the Monte Carlo
                  (default 1000)
    MCSHells int  Total number of equal volume shells to spread the MCP
                  in (10)
    RPRObe real   Probe radius (1.5A); a point is considered as excluded if
                  it is within RPRObe+VDWR(i) of any atom i in the EXVC set
    ISEEd int     Seed for random number generator (3141593)

EXAMPLES: (See also the test/c27test/solanal2.inp testcase)

The following examples use a trajectory of a short peptide in a periodic
water box.

! MeanSquareDisplacement of all water molecules to estimate diffusion coeff
open unit 21 read unform name @9pept500.cor
open unit 31 write form name @9pept500.msd
coor anal select type oh2 end -    ! what atoms to look at
     firstu 21 nunit 1 skip 10 -   ! trajectory specification
     imsd 31 -                     ! flag to do the MSD analysis
     rspin 0.0 rspout 999.9 -      ! we are interested in ALL waters
     ncors 20 -                    ! compute MSD to NCORS*SKIP (0.04ps) steps
     xbox @6 ybox @7 zbox @8       ! and we did use PBC

! g(r) for the waters; the program defaults are used to calculate the density
! using selected atoms within 10A (RDSP keyword) of the reference point (0,0,0)
! (REF keyword)
open unit 21 read unform name @9pept500.cor
open unit 31 write form name @9pept500.goo
open unit 32 write form name @9pept500.goh
open unit 33 write form name @9pept500.ghh
! specify WATEr to get all three g(r) functions computed
coor anal water select type OH2 end -
     firstu 21 nunit 1 skip 10 -   ! trajectory specification
     igdist 31 ioh 32 ihh 33 -     ! flag to do the solvent-solvent g(r)
     mgn 100 dr 0.1 -              ! comp. g(r) at MGN points separated by DR
     rsph 999.9 -                  ! use ALL waters for rdf calculation
     xbox @6 ybox @7 zbox @8       ! and we did use PBC

! g(r) backbone amide hydrogen - water oxygens
! if a single solute atom is looked at the MULTi keyword is not necessary
! when several solute atoms are specified as the site, their average position
! will be used as the reference position if MULTi is not present
open unit 21 read unform name @9pept500.cor
open unit 31 write form name @9pept500.gonh
coor anal select type oh2 end -    ! Water oxygens
     site select type H end multi - ! and the amide hydrogens
     firstu 21 nunit 1 skip 10 -   ! trajectory specification
     isdist 31 -                   ! do the g(r) (here solute-solvent)
     mgn 100 dr 0.1 -              ! comp. g(r) at MGN points separated by DR
     rsph 999.9 -                  ! we use ALL waters for the calculation
     xbox @6 ybox @7 zbox @8       ! and we did use PBC

! g(r) for GLY3 NH - the water oxygens - with excluded volume correction
open unit 21 read unform name @9pept500.cor
open unit 31 write form name @9pept500.gn3ox1
coor anal select type OH2 end -
     site multi select atom pept 3 H end -
     EXVC select segid pept end -
     MCPoints 2000 MCSHells 20 RPRObe 1.7 -
     firstu 21 nunit 1 skip 50 -   ! trajectory specification
     isdist 31 -                   ! flag to do the solvent-solvent g(r)
     mgn 100 dr 0.1 -              ! comp. g(r) at MGN points separated by DR
     rsph 999.9 -                  ! we use ALL waters for the calculation
     xbox @6 ybox @7 zbox @8       ! and we did use PBC

------------------------------------------------------------------------------

19) The DRAW command

        The DRAW command (called directly from CORMAN, not to be confused
with the DRAW command found under the ANALysis command) is useful for
displaying molecules. The output is a command file that can be read by
various displaying and plotting programs. This command file can be edited
for different types of displaying. In addition to atom positions and bonds,
velocity and forces may also be displayed.

The current keywords are:

        NOMO  - No molecule option (only velocities or derivatives)
        DFACt - Derivative factor (default 0.0)
        DASH  - Spacing of dashed line used for Hbonds (default .01)
        FRAMe - Specifies that a frame tag will be written first
                (default - don't specify frame)
        RETUrn- Specifies which stream the plotting program will return to
                after plotting this section (default none)

An atom selection is also looked for. Any atom not selected will not be
considered. The default is to include all atoms.

------------------------------------------------------------------------------

20) The HBONd command
    The CONTact command

        The HBONd command analyses a trajectory, or the current coordinates,
for hydrogen bonding patterns.

        The form COOR CONTact ... ignores the hydrogen bond donor/acceptor
definitions in the psf and looks for all contacts which satisfy the distance
cutoff criterion between all atoms in the two selections, possibly bridged
by a residue as defined by the BRIDge keyword. This is useful for hydrophobic
contact analysis, or for salt bridges. No angle cutoff can be used with this
form of the command. Output and other options are as for the COOR HBONd
variant.

        The form COOR HBONd makes use of the DONOR/ACCEPTOR definitions in
the psf. For each acceptor/donor in the first selection the average number
and average lifetime (for trajectories only) of hydrogen bonds to any atom
in the second selection is calculated.
A hydrogen bond is assumed to exist when two candidate atoms are closer than
the value specified by CUT (default 2.4A, a reasonable criterion; DeLoof et al
(1992) JACS 114, 4028), and, if a value for CUTAngle is given, the angle
formed by D-H..A is greater than this CUTAngle (in degrees, 180 is a linear
H-bond); the default is to allow all angles.
        The current implementation assumes that hbonding hydrogens are
present in the PSF and uses ACCEptor and DONOr information from the PSF to
determine what pairs are possible. If output is wanted to a separate file
the IUNIt option can be used.

        If the BRIDge option is used the routine calculates the average
number and lifetime of bridges formed between all pairs of atoms in the two
selections; a bridge is counted when a residue of the type specified with
the BRIDge keyword hydrogen bonds (using the same criteria as for direct
hbonding) to at least one atom in each selection. The typical use of this
would be to find water bridges. Here again, results are presented for each
atom in the first selection.

        If FIRSTunit is not specified the current (MAIN) coordinates are
analyzed.

        Keyword VERBose provides a more detailed output: for trajectory
analysis the duration and endtime (ps) of each H-bond, or bridge, together
with a specification of the atoms involved is output; potentially very large
amounts of data! Only hbonds/bridges with a lifetime longer than the value
specified by keyword TCUT (default 0.0 ps) are included here and in the
summary.

NB: TCUT (and NSKIP) may influence the results, since hbonds with a
duration < TCUT are not counted, and for the lifetime analysis a quick
fluctuation in hbond distance may with one choice of NSKIP result in the
hbond being perceived as broken at that instant, whereas with a longer NSKIP
the event would not have been noticed, resulting in a longer lifetime being
reported.
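
        The effect of NSKIP and TCUT on the reported lifetimes can be seen
in a small stand-alone Python sketch. It is not the CHARMM implementation:
the function name, the representation of one donor/acceptor pair as a list
of per-frame True/False flags, the conversion of run lengths to ps, and the
treatment of durations exactly equal to TCUT are all assumptions made for
illustration.

```python
def hbond_events(present, dt_ps, nskip=1, tcut_ps=0.0):
    """Durations (ps) of unbroken hbond runs seen in every nskip-th frame.

    present : per-frame flags, True when the distance/angle criteria hold
    dt_ps   : time between consecutive stored frames (assumed known)
    """
    step = dt_ps * nskip              # time between frames actually examined
    durations, run = [], 0
    for on in present[::nskip]:
        if on:
            run += 1
        elif run:                     # bond just broke: close the event
            durations.append(run * step)
            run = 0
    if run:                           # still bonded at the end of the data:
        durations.append(run * step)  # counted as terminated there
    # Assumption: events must be strictly longer than TCUT to be kept.
    return [d for d in durations if d > tcut_ps]

def average_lifetime(durations):
    """Sum of event durations divided by the number of events."""
    return sum(durations) / len(durations) if durations else 0.0

# One pair, bonded except for a single-frame break at frame 3.
flags = [True, True, True, False, True, True, True, True]
every_frame = hbond_events(flags, dt_ps=0.1)            # two events
every_second = hbond_events(flags, dt_ps=0.1, nskip=2)  # break is missed
long_only = hbond_events(flags, dt_ps=0.1, tcut_ps=0.35)
```

With every frame examined the break splits the bond into two short events;
with nskip=2 the broken frame is never read, so a single longer event is
found, which is the behaviour the note above warns about.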
        For single coordinate set analysis the VERBose keyword results in a
more detailed listing giving all atoms involved, and also the geometry for
direct hbonds.

        For each donor/acceptor in the first selection the trajectory
analysis outputs the AVERAGE NO. of hydrogen bonds this atom has had during
the trajectory:

        aveno = sum over frames (number of hbonds formed by this atom)
                / (number of frames)

The average lifetime is defined as

        avelife = sum over hbonding events (duration of hbond between two atoms)
                  / (number of different hbonds formed by these atoms)

(i.e., hbonds that have been broken for at least one frame between events).
Note that the lifetime can be influenced by end-effects (i.e. hbonds still
active at the end of the trajectory are counted as being terminated then!).
Output can be directed to a separate file specified by IUNIT int.

The following charmm substitution parameters are set in the module:

        ?NHBOND = total number of hydrogen bonds for selected atoms
                  (time averaged)
        ?AVNOHB = average number of hydrogen bonds over selected atoms
                  (time averaged)
        ?AVHBLF = average lifetime of hydrogen bonds

Note that these averages are over the selected atoms, which may include a
number of atoms with no hbonds > TCUT!

NOTE: In order not to find hbonds between bonded atoms UPDATE is called,
which requires coordinates to be present when invoking this module. Since
this is done just to get the non-bond exclusion lists, the cut-offs are set
to very small values, and could influence subsequent energy evaluations if
the non-bond cutoffs are not then respecified.

------------------------------------------------------------------------------

21) The HISTogram command

        This command computes a histogram along the X, Y, Z or Radial
directions for the selected atoms. The histogram can either be a simple
count of the number of atoms contained in each bin (specified by the
HNUM=number of bins between HMIN,HMAX keywords), or if the WEIGhting keyword
is present the WMAIN array is summed for the atoms in each bin.
HSAVe specifies that the histogram should be saved and incremented at the
next invocation of COOR HIST. HPRInt specifies that the resulting histogram
should be printed. For X,Y,Z histograms the output is the accumulated
density/HNORM (default=1.0) in each bin. If HDENS>0.0 (default=0.0) there is
also a third column for R histograms containing the accumulated
density/(volume of shell containing this bin)/DENS.
The COMParison keyword results in XCOMP,YCOMP,ZCOMP,WCOMP being used.
The variable ?NCONFIG is set to the number of configurations (frames) that
have been accumulated so far. The results may be output to a file specified
by IUNIt int.

EXAMPLE:
Averaging the charge density in spherical shells from a trajectory could be
done in the following way:

        scalar wmain=charge
        traj iread ....
        set i 1
        label loop
        traj read
        !if you are reading velocities, you may want to convert to A/ps
        ! (and then you wouldn't use the weighting option like this)
        ! scalar x divi ?TIMFAC
        ! scalar y divi ?TIMFAC
        ! scalar z divi ?TIMFAC
        coor hist R hnum 50 hmin 0.0 hmax 10.0 hsave weig
        incre i by 1
        if i .lt. 100 goto loop
        ! you could also normalize for number of selected atoms
        ! set scale ?NSEL
        ! mult scale by ?NCONFIG
        ! then use @scale instead of ?NCONFIG below
        bomblevel -1 ! to get by the zero atom selected warning below
        coor hist R hnum 50 hmin 0.0 hmax 10.0 select none end hprint -
             hnorm ?NCONFIG [ hdens 0.03 (some reasonable bulk density/A**3) ]

------------------------------------------------------------------------------

22) The PUCKer command

COORdinates PUCKer [SEGId segid] RESId resid1 [TO resid2] [AS | CP]

        The sugar pucker phase and amplitude, as defined by
Altona&Sundaralingam (default, keyword AS) or (CP) Cremer&Pople (JACS 1975),
are calculated for the (deoxy)ribose of the specified residue(s); the first
segment is the default. A range of residues from resid1 TO resid2 can be
analyzed.
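
        For orientation, the Altona&Sundaralingam phase and amplitude can be
written down in a few lines of Python. This is a sketch of the published
pseudorotation formula, not of the CHARMM routine; the function name and the
assumed torsion numbering (nu0 = C4'-O4'-C1'-C2' through nu4 =
C3'-C4'-O4'-C1', all in degrees) are choices made for this illustration, and
CHARMM's own conventions may differ in detail.

```python
import math

def pucker_as(nu0, nu1, nu2, nu3, nu4):
    """Pseudorotation phase P and amplitude (degrees) from ring torsions.

    Uses tan P = ((nu4 + nu1) - (nu3 + nu0)) / (2 nu2 (sin 36 + sin 72)).
    atan2 places P in the correct quadrant, including negative nu2.
    """
    s = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))
    num = (nu4 + nu1) - (nu3 + nu0)
    den = 2.0 * nu2 * s
    phase = math.degrees(math.atan2(num, den)) % 360.0
    # Equivalent to nu2 / cos(P), but well behaved when cos(P) is near zero.
    amplitude = math.hypot(num, den) / (2.0 * s)
    return phase, amplitude

def ideal_torsions(phase, amplitude):
    """Torsions of an ideal ring: nu_j = A cos(P + 144 (j - 2)), j = 0..4."""
    return [amplitude * math.cos(math.radians(phase + 144.0 * (j - 2)))
            for j in range(5)]

# Self-consistency check on an ideal C2'-endo-like ring (P = 162 degrees).
p, a = pucker_as(*ideal_torsions(162.0, 38.0))
```

Feeding ideal torsions back through the formula should reproduce the phase
and amplitude they were generated from.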
------------------------------------

File: Corman, Node: Substitution, Up: Top, Previous: Function, Next: Top

                Coordinate Manipulation Values

Several variables that can be used in titles or CHARMM commands are set by
some of the coordinate manipulation commands. Here is a summary and
description of each variable.

----------------------------------------------------------------------------
'XAXI','YAXI','ZAXI','RAXI','XCEN','YCEN','ZCEN'

        A rotation axis vector and its length and the center of rotation.
This data is set by the COOR AXIS, COOR LSQP, COOR ORIE, and COOR ORIE RMS
commands. These values may be used by any of the commands that use the
vector-spec with the AXIS keyword.
----------------------------------------------------------------------------
'XMIN','YMIN','ZMIN','WMIN','XMAX','YMAX',
'ZMAX','WMAX','XAVE','YAVE','ZAVE','WAVE'

        Statistics set by the COOR STAT command.
----------------------------------------------------------------------------
'THET'
        Angle of rotation set by the COOR ORIEnt command.
----------------------------------------------------------------------------
'XMOV','YMOV','ZMOV'

        Displacement of centers set by the COOR ORIEnt command.
----------------------------------------------------------------------------
'RMS'
        Resulting RMS value set by the COOR RMS, COOR ORIEnt, or
COOR RGYR commands.

CHARMM Element doc/correl.doc 1.1

File: Correl, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

                        Correlation Functions

The CORREL commands may be used to obtain a set of time series for a given
property from a trajectory. Once obtained, the time series may be manipulated
as required, saved, plotted, or used to generate correlation functions
( C(tau) = < Q1(t)*Q2(t+tau) > ). The correlation functions may in turn be
manipulated, saved, plotted, and transformed to find the spectral density
(the Fourier transform of C(tau)) and to determine the correlation times.

Alternately, a covariance matrix may be computed for a collection of time
series.
This option will compute the full matrix for use in entropy calculations or
for other applications.

Reorienting a coordinate trajectory is possible using the COMPARE command.
For details see *note reorient:(chmdoc/dynamc.doc)Merge.

* Menu:

* Syntax::      The syntax of the correlation command
* General::     General information regarding the correlation section
* Enter::       How to specify time series
* Trajectory::  How to reference trajectory files
* Edit::        How to edit the time series specifications
* Mantime::     How to manipulate time series
* Corfun::      How to generate correlation functions
* Spectrum::    How to get a spectrum from a correlation function
* Cluster::     How to cluster time series data into similar groups
* IO::          Input/output guide to correlation functions and series
* Examples::    Just what it says

File: Correl, Node: Syntax, Up: Top, Previous: Top, Next: General

        Syntax for the CORREL command and subcommands

[SYNTAX CORRelation functions]

Syntax:

CORREL  [ MAXTimesteps int ] [ MAXSeries int ] [ MAXAtoms ] [ COVAriance] -
            default 512         default 2       default 100

        [ nonbond-spec ] [ hbond-spec ] [ image-spec ] [NOUPdate]
        [ INBFrq 0 ]     [ IHBFrq 0 ]   [ IMGFrq 0 ]

hbond-spec      *note Hbonds:(chmdoc/hbonds.doc).
nonbond-spec    *note Nbonds:(chmdoc/nbonds.doc).
image-spec      *note Images:(chmdoc/images.doc)Update.
Subcommands: miscellaneous-commands COOR coordinate-manipulation-command { DUPLicate time-series-name } { } ENTEr name { [ BONDs repeat(2x(atom-spec)) ] [ GEOMetry ] } c { [ ANGLe repeat(3x(atom-spec)) ] [ ENERgy ] } c { [ DIHEd repeat(4x(atom-spec)) [NOTR] ] } c { [ IMPRo repeat(4x(atom-spec)) ] } c { } { [ ATOM ] [ X ] repeat(atom-spec) [ MASS ] } e { [ FLUC ] [ Y ] } c { [ Z ] } { [ R ] } { [ XYZ ] } { } { VECT [ X ] repeat(2x(atom-spec)) } e { [ Y ] } { [ Z ] } { [ R ] } { [ XYZ ] } { } { ATOM DOTProduct repeat(2x(atom-spec)) [NORMal] [MASS]} e { FLUC DOTProduct repeat(2x(atom-spec)) [NORMal] [MASS]} e { VECT DOTProduct repeat(4x(atom-spec)) [NORMal] } e { } { ATOM CROSsproduct rep.(2x(atom-spec)) [NORMal] [MASS]} e { FLUC CROSsproduct rep.(2x(atom-spec)) [NORMal] [MASS]} e { VECT CROSsproduct rep.(4x(atom-spec)) [NORMal] } e { } { HBONd [4x(atom-spec)]*++ [ ENERgy ] } c { [ DISTance] } { [ HANGle ] } { [ AANGle ] } { } { DISTance repeat(2x(atom-spec)) } c { } { [ GYRAtion ] [ CUT real ] [ MASS ] s** } c { [ DENSity ] s** } c { } { RMS [ MASS ] [ ORIEnt ] s** } c { MODE mode-number s** } c** { TEMPerature [ NDEGF int ] s** } v { ENERGY } c { HELIx [ SELE atom-selection END ] } c { INERtia [ SELE atom-selection END ] [ TRACe ] } c { PUCK RESI [SEGI ] } c { USER user-value [ repeat(atom-spec) ] s** } e { } { CELL cell-spec } { TIME [ AKMA ] } { ZERO } { } ( code: c-coordinates, v-velocities, e-either ) c** MODE time series is allowed only if CORREL is invoked from VIBRAN. s** these utilize the first atom selection in the next TRAJ command. 
*++ Hydrogen bond atom order is one of: Donor,Hydrogen,Acceptor,Acceptor-antecedent Donor,Hydrogen,Acceptor Donor,Acceptor cell-spec::= one of { A B C ALPHa BETA GAMMa ALL SHAPe } atom-spec::= {residue-number atom-name} { segid resid atom-name } { BYNUm atom-number } { SELE atom-selection END} *** atom-selection::= see *note Selection:(chmdoc/select.doc) *** Note: If an atom-selection is used for atom-spec's, then all atom-spec's must be contained within one atom-selection TRAJectory [ FIRStu int ] [ NUNIt int ] [ BEGIn int ] [ STOP int ] [ SKIP int ] [ VELOcity ] [first-atom-selection] [ ORIEnt [MASS] second-atom-selection ] { ALL } [P2] [UNIT int] SHOW { time-series-name } { CORRelation-function } (defines ?P2, ?AVER, ?FLUC) { ALL } EDIT { time-series-name } edit-spec { CORRelation-function } edit-spec::= [INDEx int] [VECCod int] [CLASs int] [SECOnd int] [TOTAl int] [SKIP int] [DELTa real] [VALUe real] [NAME new-name] [OFFSet real] READ { time-series-name } unit-spec edit-spec { [FILE] } { CORRelation-funct } { CARD } { DUMB [COLUmn int] } { ALL } { [FILE] } WRITe { time-series-name } unit-spec { CARD } { CORRelation-function } { PLOT } { DUMB [ TIME ] } MANTIME time-series-name { DAVErage } ! Q(t) = Q(t) - , <> implies time average { NORMal } ! Q(t) = Q(t) / |Q(t)| { SQUAre } ! Q(t) = Q(t) ** 2 { COS } ! Q(t) = COS(Q(t)) (in degrees) { ACOS } ! Q(t) = ACOS(Q(t)) (in degrees) { COS2 } ! Q(t) = 3*COS(Q(t))**2 - 1 (in degrees) { AVERage integer } ! Q(t) = < Q(ti) >(ti=t-NUTIL+1,t) { SQRT } ! Q(t) = SQRT(Q(t)) { FLUCt name2 } ! print zero time fluctuations { DINItial } ! Q(t) = Q(t) - Q(1) { DELN integer } ! Q(t) = Q(t) - (ti=t-NUTIL+1,t) { OSC } ! print oscillations { COPY name2 } ! Q(t) = Q2(t) { ADD name2 } ! Q(t) = Q(t) + Q2(t) { RATIo name2 } ! Q(t) = Q(t) / Q2(t) { DOTProdcut name2 } ! Q(T) x-comp=Q(T).Q2(T) Q2(T)x-comp=angle Q(T) vs Q2(T) degrees { CROSproduct name2 } ! Q(T) = Q(T)xQ2(T) { KMULt name2 } ! Q(t) = Q(t) * Q2(t) { PROB integer } ! 
                                    Q(t) = PROB(Q(t))
          { HIST min max nbins }  ! Q(ibin) = Fraction of Q(t) values in ibin
          { POLY integer       }  ! fit time series to polynomial (0-10)
                [REPLace] [WEIGh name]
          { CONTinuous [real]  }  ! make a (dihedral) time series continuous
                                    Q(t) = Q(t) + n(t)*2*real, n(t)=integer
                                    (default real is 180.0)
          { LOG                }  ! Q(t) = LOG(Q(t))
          { EXP                }  ! Q(t) = EXP(Q(t))
          { IPOWer integer     }  ! Q(t) = Q(t) ** integer
          { MULT real          }  ! Q(t) = real * Q(t)
          { DIVIde real        }  ! Q(t) = Q(t) / real
          { SHIFt real         }  ! Q(t) = Q(t) + real
          { DMIN               }  ! Q(t) = Q(t) - QMIN
          { ABS                }  ! Q(t) = ABS(Q(t))
          { DIVFirst           }  ! Q(t) = Q(t) / Q(1)
          { DIVMaximum         }  ! Q(t) = Q(t) / ABS(Q(MAX))
          { INTEgrate          }  ! Q(t) = Integral(0 to t) (Q(t)dt)
          { TEST real          }  ! Q(t) = COS(2*PI*t*real/TTOT)
          { ZERO               }  ! Q(t) = 0.0
          { DERIvative         }  ! Q(t) = (Q(t+dt)-Q(t))/dt

CORFUN 2x(time-series-name) { [ PRODuct ] [ FFT    ] [ LTC  ] [ P0 ] [ NONOrm ] } [ TOTAl int ]
                            {             [ DIREct ] [ NLTC ] [ P1 ]            }
                            {                                 [ P2 ]            }
                            {                                                   }
                            { DIFFerence                                        }

SPECtrum [FOLD] [RAMP] [SWITch] [SIZE integer]

CLUSter time-series-name  RADIus  [ MAXCluster ] -
        [ MAXIteration ]  [ MAXError ] -
        [ NFEAture ]  [ UNICluster ] -
        [ UNIMember ]  [ UNIInitial ] -
        [ CSTEP ]  [ BEGIn ] -
        [ STOP ]  [ ANGLE ]

END     ! return to main command parser

File: Correl, Node: General, Up: Top, Next: Enter, Previous: Syntax

        General discussion regarding time series and correlation functions

Discussion:

The CORREL command invokes the CORREL subcommand parser. The keyword values
MAXTimesteps, MAXSeries, and MAXAtoms may be specified to allocate more space
than the defaults provide. If there is insufficient virtual address memory for
the space request, it may be possible to achieve the desired results by
removing the nonbond lists before running the CORREL command.

The MAXTimesteps value is the largest number of steps any time series will
contain. The MAXSeries keyword is the largest number of time series that will
be contained at any time within CORREL. A vector time series counts as 3 time
series in allocating space.
The MAXAtoms keyword allocates space for the atoms that are specified in the
ENTER commands (duplicating a time series also requires more space for atoms).
For bond, angle, dihedral, and improper dihedral specifications, one extra
value is needed for each entry to hold the CODES value (so each bond uses 3
atom entries, each angle 4, and so on).

If the COVAriance keyword is given, no time series will be computed, but
instead, a complete equal time covariance matrix will be computed. For this
option, only one TRAJectory command is allowed. The covariance matrix is then
obtained by writing the time series, where the elements are covariant with
other time series.

The ENTER command defines a time series. Many time series may be specified.
A time series is defined by the following items:

    Name            - Each time series must have a unique (4 character) name.
    Class code      - The type of time series (BOND, USER, ATOM,...)
    Number of steps - The number of time steps currently valid
    Velocity code   - Was the time series read from velocities?
    Skip value      - What multiple of delta do the time steps represent?
    Delta           - Integration time step
    Offset          - Time of first element
    Secondary code  - Depends on Class code (Geometry/Energy)(X/Y/Z...)
    Vector code     - 1=simple time series, 3=vector, 0=Y or Z part of vector
    Value           - Utility series value, depends on Class code
    Mass weighting  - Are the elements to be mass weighted (only for ATOM)
    Average         - Time series average
    Fluctuation     - Time series fluctuation about the average
    Atom pointer    - Pointer into first specified atom in atom list
    Atom count      - Number of atom entries given in the ENTER command
    Time series     - Series values from (1,NTOT)

The TRAJectory command processes all of the time series which have an NTOT
(number of steps) count of zero. For this process, the main coordinates are
used for reading the trajectory. If fluctuations are requested, the comparison
coordinates MUST be filled with the reference (or average) coordinates before
invoking the TRAJectory command.
Allowing multiple TRAJectory commands separated by ENTER commands makes it
possible to compute correlation functions between positions and velocities,
or even between different trajectories.

The EDIT command allows the user to directly modify the time series
specifications.

The MANTIME command allows the user to manipulate the time series values
(and sometimes some of the specifications).

The SHOW command will display the specification data for all of the
time series.

File: Correl, Node: Enter, Up: Top, Next: Trajectory, Previous: General

                Specifying time series

The ENTER command defines a new time series. Each time series specified by a
different ENTER command must have a unique name (up to 4 characters). With
this command, a time series may be defined and must then be filled later with
a TRAJectory command (or a MANTIME COPY, or a READ time-series command).
Alternatively, a time series may be retrieved from an existing file, or
duplicated from another time series that currently exists.

The time series names "ALL" and "CORR" may not be used; they are reserved for
selecting all of the time series or the correlation function, respectively.

The ENTER options are:

-----------------------------------------------------------------------------
DUPLicate time-series-name

        This causes an exact copy of an existing time series to be created
(except with a different name). This may be useful where several different
types of manipulation are required on a single time series.

-----------------------------------------------------------------------------
READ unit-number [CARD] [edit-spec]

        This causes a time series to be created and all data then read in
from an existing time series file. All time series (up to the maximum
allowed) will be read with this command.
-----------------------------------------------------------------------------
    [ BONDS repeat(2x(atom-spec))        ]  [ GEOMETRY ]
    [ ANGLE repeat(3x(atom-spec))        ]  [ ENERGY   ]
    [ DIHEd repeat(4x(atom-spec)) [NOTR] ]
    [ IMPRo repeat(4x(atom-spec))        ]

        These specifications cause a particular internal coordinate (or an
average of several) to define the time series. It is not necessary that the
specified atoms have a corresponding PSF entry, but if ENERGY is requested,
the specified atoms must be able to produce a valid parameter code. The
default is GEOMETRY. With geometry, any 4 atoms may be specified. A velocity
trajectory should not be used to fill these types of time series.
The NOTR option for dihedrals prevents the analysis of dihedral transitions.

-----------------------------------------------------------------------------
    [ ATOM ]  [ X   ]  repeat(atom-spec)  [ MASS ]
    [ FLUC ]  [ Y   ]
              [ Z   ]
              [ R   ]
              [ XYZ ]

        These ENTER commands define a time series, Q(t), based on atom
positions or velocities. The ATOM option uses the (X,Y,Z,R, or XYZ) values
directly. The FLUCtuation option subtracts off the reference values (contained
in the comparison coordinates). For example, if the average structure is
desired as the reference value, then the command:
        COOR DYNA COMP trajectory-spec
would be required before invoking the TRAJECTORY command.
        If more than one atom is specified, then Q(t) values are averaged
over atoms. If MASS is specified, then mass weighting is used in this
averaging of Q(t) values.
        The properties X, Y, Z, and R cause a scalar time series to be created
with the requested property. The XYZ option causes a vector time series to be
created.

        ATOM:   Q(t) = X(t)
        FLUC:   Q(t) = X(t) - Xref

-----------------------------------------------------------------------------
    VECT  [ X   ]  repeat(2x(atom-spec))
          [ Y   ]
          [ Z   ]
          [ R   ]
          [ XYZ ]

        The VECTor command is similar to the ATOM and FLUCtuation commands
listed above, except the values are given by the difference in position or
velocity of 2 atoms.
If more than one pair of atoms is specified, then the values for each vector
are averaged.

        Q(t) = X1(t) - X2(t)

-----------------------------------------------------------------------------
    ATOM DOTProduct    repeat(2x(atom-spec))
    FLUC DOTProduct    repeat(2x(atom-spec))
    VECT DOTProduct    repeat(4x(atom-spec))
    ATOM CROSsproduct  repeat(2x(atom-spec))
    FLUC CROSsproduct  repeat(2x(atom-spec))
    VECT CROSsproduct  repeat(4x(atom-spec))

        These ENTER commands produce a scalar time series for velocities or
positions with the following definitions:

        ATOM DOTP:   Q(t) = ( r1(t) | r2(t) )
        FLUC DOTP:   Q(t) = ( (r1(t)-r1(ref)) | (r2(t)-r2(ref)) )
        VECT DOTP:   Q(t) = ( (r1(t)-r2(t)) | (r3(t)-r4(t)) )

If more than one set of atoms is specified, then the vector values are
averaged. The dotproduct is then computed from the averaged vectors.
NOTE: the vectors are averaged, NOT the resultant dotproducts or
crossproducts. For the FLUC option, the reference coordinates must be in the
comparison coordinate set.

-----------------------------------------------------------------------------
    [ GYRAtion ]  [ CUT real ]
    [ DENSity  ]

        These commands define a scalar time series for a coordinate
trajectory. The density calculation is based about the origin on all atoms
within the CUT value; the radius of gyration is for all atoms within distance
CUT of the geometric center of the molecule, and no mass weighting is applied.

-----------------------------------------------------------------------------
    MODE mode-number

        This option generates a scalar time series which is obtained by
projecting the velocities onto the specified normal mode, or by projecting
the coordinate displacement from the reference structure onto it. The result
is given by:

        velocity:   Q(t) = < root(mass)*v(t) | q >
        position:   Q(t) = < root(mass(i))*(r(t)-r(ref)) | q >

-----------------------------------------------------------------------------
    TEMPerature

        The time series is the temperature at each point.
If NDEGF is specified as a positive value, then this is used instead of the
NDEGF value from the trajectory file. If a negative NDEGF value is specified,
then NDEGF will be set to 3 times the number of selected atoms in the
associated TRAJectory command.

---------------------------------------------------------------------
    HELIx  atom-selection

        The x, y, and z components of the normalized vector defining the axis
of a cylindrical surface best fitting the selected atoms. The result is a
three-dimensional vector series. This is intended for, say, alpha helices,
where the selection would be something like:
        SELE ATOM * * CA .AND. RESID 23:36 END
to give the axis of an alpha helix running from residue 23 to residue 36.

---------------------------------------------------------------------
    INERtia  atom-selection [ TRACe ]

        The x, y, and z components of the normalized vector defining the
principal axis obtained from diagonalizing the moment of inertia tensor for
the selected atoms at each time point. The eigenvector corresponding to the
smallest eigenvalue is returned, and 180 deg flips of the axis are explicitly
prohibited (nonphysical). The optional TRACe keyword returns the sorted
eigenvalues as a three column time series, instead of the principal axis
vector.

---------------------------------------------------------------------
    CELL   cell-spec

        If the cell-spec is one of the 6 unit cell parameters A, B, C, ALPHA,
BETA, or GAMMA, then a single time series corresponding to that component is
returned. The keyword ALL returns a 6 element time series, with the columns in
the order given above. The SHAPE keyword returns the shape matrix for the unit
cell at each time point, in lower diagonal form. The shape matrix has the
angles as cosines, while ALPHA, BETA, and GAMMA are in degrees.
--------------------------------------------------------------------- RMS [ORIE] The RMS deviation from the COMPARISON coordinate set is computed, with a rotation to obtain a best fit if ORIEnt is specified. --------------------------------------------------------------------- PUCK RESI [SEGI ] The sugar pucker phase and amplitude are calculated for the (deoxy)ribose of the specified residue; the first segment is the default. This gives a two-dimensional vector, with component 1 being the phase (degrees) and component 2 the pucker amplitude (Angstroms), as defined by Cremer&Pople (JACS 1975). ----------------------------------------------------------------------------- USER user-value [ repeat(atom-spec) ] The USRTIM routine is called for each coordinate or velocity set. The user value and atom list is also passed along. See the description in (USERSB.FLX)USRTIM for more details. Q(t) = Whatever you want! ----------------------------------------------------------------------------- TIME [ AKMA ] The time is returned in picoseconds unless AKMA is specified. Q(t) = t ----------------------------------------------------------------------------- ZERO A zero time series is specified ( Q(t)=0 ). This option is useful for cases where time series will be read with the DUMB option. For these cases, the EDIT command may also be needed to get desired results.  File: Correl, Node: Trajectory, Up: Top, Next: Edit, Previous: Enter Specification of the Trajectory Files The TRAJectory command reads a number of trajectory files whose Fortran unit numbers are specified sequentially. The first unit is given by the FIRSTU keyword and must be specified. NUNIT gives the number of units to be scanned, and defaults to 1. BEGIN, STOP, and SKIP are used to specify which steps in the trajectory are actually used. BEGIN specifies the first step number to be used. STOP specifies the last. 
SKIP is used to select steps periodically as follows: only those steps whose
step number is evenly divisible by SKIP are selected. The default value for
BEGIN is the first step in the trajectory; for STOP, it is the last step in
the trajectory; and for SKIP, the default is 1.

The first atom selection in the TRAJectory command is meaningful only for
those time series that require an atom selection. These are time series
defined by the following ENTER commands: GYRAtion, DENSity, RMS, MODE,
TEMPerature, and optionally USER.

General reorienting of a coordinate trajectory is possible using the MERGE
command. For details see *note reorient:(chmdoc/dynamc.doc)Merge.
It is also possible to perform a simple rms best fit of each frame to the
reference coordinates (comparison set) using the ORIEnt option. For this
option a second atom selection MUST be provided; the optional MASS keyword
requests mass weighting of the best fit.

If VELOcity is specified, a velocity trajectory will be looked for.
Otherwise, a coordinate trajectory is expected.

Any time series that has a zero count (NTOT=0) will be filled by this command.
The time series count will then be set to the total number of steps processed
for each of these series. Any time series with a nonzero count (NTOT>0) will
not be affected by this command. The count may be set to zero for a time
series with the EDIT command. Upon conclusion, the average and fluctuation, as
well as some other data, are presented for each of the processed time series.

If any of the time series to be filled require a reference coordinate set,
then the comparison coordinates MUST be filled with the reference (or average)
coordinates before invoking the TRAJectory command. Upon completion, the main
coordinates contain the last coordinate set read from the trajectory, and the
comparison coordinates are unaffected.
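The BEGIN/STOP/SKIP rule described in this node can be written out as a short
Python sketch. It is only an illustration of the selection rule as stated in
the text (steps between BEGIN and STOP whose step number is evenly divisible
by SKIP); the function name select_steps and the idea of passing the step
numbers as a list are hypothetical and do not reflect how CHARMM reads
trajectory files.

```python
def select_steps(step_numbers, begin=None, stop=None, skip=1):
    """Return the step numbers a BEGIN/STOP/SKIP specification would keep.

    step_numbers : step numbers present in the trajectory, in ascending order
    begin, stop  : first and last step to use; None means the first and last
                   step in the trajectory (the documented defaults)
    skip         : keep only steps whose number is evenly divisible by skip
    """
    if not step_numbers:
        return []
    if begin is None:
        begin = step_numbers[0]
    if stop is None:
        stop = step_numbers[-1]
    return [s for s in step_numbers
            if begin <= s <= stop and s % skip == 0]

if __name__ == "__main__":
    # Hypothetical trajectory saved every 100 steps, 10 frames in total.
    steps = list(range(100, 1100, 100))
    print(select_steps(steps))                                  # defaults
    print(select_steps(steps, begin=300, stop=900, skip=200))   # subset
```

Note that with BEGIN 300 and SKIP 200 the first selected step is 400, not
300, because selection is by divisibility of the step number rather than by
counting from BEGIN.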
File: Correl, Node: Edit, Up: Top, Next: Mantime, Previous: Trajectory

                Editing a time series

The EDIT command allows the time series specifications to be modified
directly.

WARNING: This command gives the user direct access to most time series
specifications. There is NO checking to see if what is being done makes
sense. As such, this command is versatile and dangerous.

The EDIT command must be followed by a valid time series name. All subsequent
keywords will be applied to that time series. The series name "ALL" will cause
the edit-spec to operate on all the time series. The name "CORR" will cause
the edit to occur on the correlation function.

The following may be specified for a time series:

    INDEx integer  - May be specified to modify X, Y, or Z (1, 2, 3 resp.)
                     of a vector time series. Otherwise, all are modified.
                     The index number is in fact an offset from the
                     specified time series, where a value of 1 represents
                     the selected time series. A value of 5 will cause the
                     edit operation to modify the fourth time series after
                     the one specified.

    CLASs integer  - May be used to specify a class code (consult source).

    TOTAl integer  - The total number of valid steps may be altered, but
                     none of the values are modified. By setting this value
                     to zero, the time series is made ready again for the
                     next TRAJectory command.

    SKIP integer   - May be specified to reset the SKIP value. This may be
                     useful after reading an external time series.

    DELTa real     - May be specified to modify the basic time step. The
                     actual time step for a series is (SKIP*DELTA).

    OFFSet real    - The time of the first element in the time series.

    VECCod integer - User may specify a vector code. This may be useful in
                     merging 3 separate time series into a vector time
                     series (or the reverse). In fact any number of time
                     series may be grouped together with this option. For
                     example, if a table with 5 time series is desired,
                     setting VECCOD to 5 for the first one and then writing
                     this time series will output all 5.
    VALUe real     - This allows the user to modify the series utility
                     value. Its function depends on the Class code. This
                     value is currently used for USER, GYRAtion, DENSity,
                     MODE, and TIME.

    SECOndary int  - The secondary class code may be modified (consult
                     source).

File: Correl, Node: Mantime, Up: Top, Next: Corfun, Previous: Edit

                Manipulating the Time Series

The MANTIME command performs the operation requested by the option on the
selected time series, Q(t), and leaves the resultant time series as the
active time series. This allows various permutations of manipulations,
increasing the options without increasing the number of ENTER commands.
The keyword ordering must be followed exactly.

    DAVErage        subtracts the average of the time series from all
                    elements.

    NORMal          normalises the vectorial time series (i.e. creates the
                    unit vector by dividing all elements for a given value
                    of t by r(t) = sqrt(x**2 + y**2 + z**2) ).

    SQUAre          squares all the elements.

    COS             obtains the cosine of all elements.

    ACOS            obtains the arc-cosine of all elements.

    COS2            calculates 3*cos**2 - 1 for all elements.

    AVERage integer calculates the average of every N consecutive points
                    (N = integer) and increases the time interval by a
                    factor of N. Note: NTOT is divided by N.

    SQRT            obtains the square root of all elements. Negative
                    elements are set to -SQRT(-q(t)).

    FLUCt name      The Q(t) remains unchanged. A second (b) time series
                    must be specified. The zero time fluctuations are
                    computed and printed out. The following variables are
                    computed:
                        A = <Qa(t)*Qb(t)>
                        B = sqrt <Qa(t)**2>
                        C = sqrt <Qb(t)**2>
                        D = A/(B*C)

    DINItial        subtracts the value of the first element from all
                    elements. Q(t) = Q(t) - Q(1)

    DELN integer    Q(I) = Q(I) - <Q>, with the average <Q> taken over
                    successive blocks of N points (I from 1 to N, from N+1
                    to N+N, etc.). (untested)

    OSC             counts the number of oscillations in Q(t) per unit time
                    step. The Q(t) remains unchanged.

    COPY name       This copies the second time series to the first. NTOT
                    of the first is set to that of the second.
    ADD name        Q(t) = Q(t) + Q2(t); adds the second time series to the
                    first.

    RATIo name      Q(t) = Q(t) / Q2(t)

    CROSsprod name  Q(T) = Q(T) x Q2(T); the 3D crossproduct of the two 3D
                    vectors formed by the selected and named time series.

    DOTProd name    x-comp of Q(T)  = Q(T) . Q2(T)
                    x-comp of Q2(T) = angle in degrees between the two
                                      vectors
                    NOTE! This modifies Q2 as well as Q. To get just the
                    x-comp you may then edit the selected series:
                        EDIT series VECCOD 1

    KMULt name      Q(t) = Q(t) * Q2(t)

    PROB integer    gives the probability of finding a specific value of
                    the time series. Subdivisions of the time series are
                    considered so that there are integer+1 values.

    HIST min max nbins
                    Q(ibin) = Fraction of Q(t) values within ibin
                    This command replaces a time series with a histogram of
                    the time series divided into "nbins" bins with a range
                    from "min" to "max". The histogram values sum to 1.

    POLY integer [REPLace] [WEIGh name]
                    fits the time series to a polynomial. The order should
                    be in the range of 0 to 10. Order 0 will provide just
                    the average, order 1 will fit the time series to a
                    straight line, and order 2 will fit to a quadratic
                    function. The REPLace option will replace the time
                    series with the fitted one. The WEIGht option will
                    weight all data by the values in a second time series.

    CONTinuous real Q(t) = Q(t) + n(t)*2*real, where n(t) is an integer
                    such that ABS(Q(t)-Q(t-1)) <= real.
                    The default value is 180.0, which is appropriate for
                    making a dihedral time series continuous. A different
                    positive value may be selected (such as a box size...).

    LOG             Q(t) = LOG(Q(t))

    EXP             Q(t) = EXP(Q(t))

    IPOWer integer  Q(t) = Q(t) ** integer

    MULT real       Q(t) = Q(t) * real

    DIVI real       Q(t) = Q(t) / real

    SHIFt real      Q(t) = Q(t) + real

    DMIN            Q(t) = Q(t) - QMIN, QMIN being the minimum of the time
                    series.

    ABS             Q(t) = ABS(Q(t))

    DIVFirst        Q(t) = Q(t) / Q(1)

    DIVMax          Q(t) = Q(t) / ABS(Q(t) with max norm)

    INTEgrate       Q(t) = Integral(0-t) [ Q(t) dt ]

    TEST real       Q(t) = COS ( 2 * PI * t * real / NTOT )

    ZERO            Q(t) = 0
                    This option zeroes the specified time series.
    DERIvative      Q(t) = (Q(t+dt)-Q(t))/dt; the last point is set to the
                    one before last.

File: Correl, Node: Corfun, Up: Top, Next: Spectrum, Previous: Mantime

                Calculating a Correlation Function

CORFUN: This option takes the specified time series and calculates the
desired correlation function from it. The resultant correlation function is
saved in a time series named "CORR" which may then be used in subsequent
CORREL manipulation or write commands. If multiple CORFUN commands are
requested, then the "CORR" time series is overwritten.

In the following, Qa and Qb refer to the time series that were extracted
using the CORREL command.

    PRODuct         This option (default) generates a correlation function
                    that is the product of the time series elements.
                        C(tau) = < Q1(t)*Q2(t+tau) >

    DIFFerence      The difference option is an alternative to the product
                    option; it generates a function that is useful in
                    calculating diffusion constants (slope at long tau).
                        C(tau) = < ( Q1(t) - Q2(t+tau) ) **2 >

    FFT             This option calculates the correlation function using
                    the FFT method. There are certain limitations on the
                    prime factors in the total number of points.

    DIRECT          This option calculates the correlation function using
                    the direct multiplication method.

    P1              This option gives the direct correlation function,
                    <Qa(0).Qb(t)>. If Qa and Qb are unit vectors, then this
                    is also the first order Legendre polynomial.

    P2              This option gives the correlation function of the
                    second order Legendre polynomial,
                    (3 <[Qa(0).Qb(t)]**2> - 1)/2.
                    In typical applications Qa and Qb will be unit vectors.
                    For P2, LTC = 0 and NORM = 1.

    NLTC            no long tail correction.

    LTC             long tail correction (subtracts <Qa>**2 if
                    autocorrelation, <Qa>*<Qb> if cross correlation. There
                    is no LTC for P2, so NLTC and LTC give the same result.)
                    This feature is to be used with care. If the Qa and Qb
                    are fluctuations from the mean (i.e. FLCT or MANTIME
                    DELTA), then this can serve as a correction for
                    roundoff error.
                    Otherwise, if they are not centered about the mean,
                    this correction causes the C.F. to be a less accurate
                    calculation of fluctuations from the mean, i.e.
                    <Qa(0).Qb(t)> - LTC = <Qa(0).Qb(t)> - <Qa>*<Qb>

    NONORM          Correlations are not normalized. This is useful for
                    adding correlations computed in different trajectories.
                    (P2 is not normalized.) The correlation functions are
                    normalized unless NONORM is specified.

    TOTAL integer   The TOTAL value determines the number of points to keep
                    in the correlation function. The number of points may
                    not be greater than the number of points in the time
                    series. A reasonable value is about 1/4 to 1/3 the
                    length of the time series. Correlation function values
                    near the end have little weight. The default value is
                    the nearest power of two less than half of the time
                    series length.

The defaults are FFT, P0, NLTC.

Note: The correlation time which is given by the program is calculated by an
exponential fit to the first NTOT/8 points or up to the first crossing of the
time axis. This value should be considered a (poor) estimate; it is
meaningful only for correlation functions which decay exponentially to zero
with no oscillations.

For P0, C(t) = (c(t) - ltc)/N

ltc and normalization factors, N, are:

    LTC, autocorrelation:       ltc = <Q>**2    for P0
                                    = 0         for P2
                                N   = C(0) - ltc = <Q**2> - ltc

    LTC, cross-correlations:    ltc = <Qa>*<Qb>
                                N   = sqrt[ (<Qa**2> - <Qa>**2) *
                                            (<Qb**2> - <Qb>**2) ]

    NLTC, autocorrelation:      ltc = 0
                                N   = C(0)

    NLTC, cross-correlations:   ltc = 0
                                N   = sqrt[ <Qa**2> * <Qb**2> ]

File: Correl, Node: Spectrum, Up: Top, Next: Cluster, Previous: Corfun

                Generating a Spectrum from Correlation Functions

There is a command, SPECtral-density, which may be used to generate a
spectrum from a correlation function. The syntax is:

    SPECtrum [SIZE integer] [FOLD] [RAMP] [SWITch]

File: Correl, Node: Cluster, Up: Top, Next: IO, Previous: Spectrum

                Clustering Time Series Data

This command clusters time series data obtained within the CORREL facility.
The time series must first be defined using CORREL's ENTEr command and the
data read in via TRAJ or READ.
The CLUSter command clusters these data into groups with similar time series values, with each cluster being defined by a "cluster center". The cluster centers are output to UNICluster, and a list of time points and assigned clusters is given in the cluster membership file (UNIMember). For example, if you want to find similar conformations of a peptide using dihedral angles, you would first define the set of dihedral angles to be considered, say angle(1) -> angle(M), as M time series. If the time series were each N time steps long, then you would be clustering N "patterns", with each pattern M "features" long. Consecutive time series are clustered. If the first time series is, for example, "ts1" then the "veccod" of this time series can be changed to the number of time series to be clustered: CORREL ... ENTE ts1 ... ENTE ts2 ... ... ENTE tsM ... EDIT ts1 veccod M TRAJ ... (or READ ...) CLUSTER ts1 ... END Alternatively, NFEAture M can be specified in the CLUSter command line. Note that vector time series count as three features. The Clustering Algorithm ART-2' is a step-wise optimal clustering algorithm based on a self-organizing neural net (Carpenter & Grossberg, 1987; Pao, 1989; Karpen et al., 1993). The algorithm optimizes cluster assignment subject to a constraint on cluster radius, such that no member of a cluster is more than a specified distance from the cluster center. This optimization is carried out as an iterative minimization procedure that minimizes the Euclidean distance between the cluster center and its members. A self-organizing net is created with each output node representing a cluster. The number of pattern features is equal to the number of input nodes. The weights of the connections between the input layer (layer i) and the output layer (layer j) are denoted by b(j,i). For each cluster j, b(j,i), i = 1, nfeature, is the cluster center. 
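As an illustration of such a net, the following Python sketch performs a
single "learning" pass of the kind this section goes on to list step by step:
each pattern joins the nearest existing center if it lies within the radius,
and that center is updated as a running mean; otherwise the pattern seeds a
new cluster. This is a minimal sketch, not the CHARMM implementation. The
names rms_distance and learning_pass are hypothetical, the MAXCluster limit
and the later "refining" iterations are left out, and m is taken to be the
number of members of the cluster including the new pattern (an assumption
chosen so that each center stays the average of its members).

```python
import math

def rms_distance(center, pattern):
    """Euclidean distance between a cluster center b(j,:) and a pattern tq(k,:)."""
    return math.sqrt(sum((b - t) ** 2 for b, t in zip(center, pattern)))

def learning_pass(patterns, radius):
    """One learning pass over the patterns (hypothetical helper).

    patterns : list of equal-length feature tuples, one per time point
    radius   : threshold distance for joining an existing cluster
    Returns (centers, assignment), where assignment[k] is the cluster index
    given to pattern k.
    """
    centers = [list(patterns[0])]   # the first pattern seeds cluster 0
    counts = [1]                    # members per cluster
    assignment = [0]
    for pattern in patterns[1:]:
        dists = [rms_distance(c, pattern) for c in centers]
        j = min(range(len(centers)), key=dists.__getitem__)
        if dists[j] <= radius:
            counts[j] += 1
            m = counts[j]           # assumption: m includes the new member
            centers[j] = [((m - 1) * b + t) / m
                          for b, t in zip(centers[j], pattern)]
        else:
            centers.append(list(pattern))   # new output node
            counts.append(1)
            j = len(centers) - 1
        assignment.append(j)
    return centers, assignment

if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0), (0.5, 0.0)]
    centers, members = learning_pass(data, radius=2.0)
    print(centers)
    print(members)
```

Because centers move as members are added, the result of a single pass can
depend on the order of the patterns, which is one reason the text describes
further passes that reassign all patterns against fixed centers.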
To create the net (which is synonymous with learning the classification scheme
or cluster centers) the following algorithm is implemented:

 1. To initialize the network, assign b(1,i) equal to the first pattern
    tq(1,i) for i = 1, nfeature.

 2. For each pattern number k, calculate the Euclidean distance (rms) between
    the pattern tq(k,i) and all cluster centers b(j,i), where j is the
    cluster index.

       rms(j,k) = sqrt[sum [(b(j,i)-tq(k,i))**2] for i = 1, nfeature]

 3. Find cluster j such that rms(j,k) < rms(i,k) for all i<>j.
    If rms(j,k) <= Threshold, then update b(j,i):

       b(j,i) = ((m-1)*b(j,i) + tq(k,i))/m,

    where m is the number of patterns assigned to cluster j, counting the
    current one. Note that b(j,i) is the average of feature i for all
    patterns currently assigned to cluster j.

 4. If rms > Threshold for all prior cluster centers (j=1,numclusters), then
    create a new cluster center by increasing the number of output nodes by
    one, and assign the weights b(numclusters,i) of this node the value of
    the pattern tq(k,i).

 5. Repeat 2.-4. until all patterns have been input.

 6. Compare the new set of cluster centers with the last set. If the
    difference between them is less than MAXError, then halt clustering.

 7. If the difference between the sets of cluster centers is greater than
    MAXError, then use the new set of cluster centers as the starting cluster
    centers, and repeat steps 2.-6. Else, clustering is complete.

Note that the cluster centers currently being calculated in step 3 are only
used for the comparison in step 2 during the first iteration with no initial
cluster centers. Otherwise, the centers calculated in the previous iteration
(or read from UNIInit) are used in the comparison in step 2. Hence, in the
initial "learning" phase, cluster centers are recalculated as each new member
is added. In subsequent "refining" phases, cluster centers are not updated
until all conformations are read in and assigned.

References:

1) Carpenter, G. A., & Grossberg, S. 1987.
ART 2: Self-organization of stable category recognition codes for analog
input patterns. Appl. Optics 26:4919-4930.

2) Pao, Y.-H. 1989. Adaptive Pattern Recognition and Neural Networks,
Addison-Wesley, New York.

3) Karpen, M. E., Tobias, D. T., & Brooks III, C. L. 1993. Statistical
clustering techniques for analysis of long molecular dynamics trajectories.
I: Analysis of 2.2 ns trajectories of YPGDV. Biochemistry 32:412-420.

        CLUSter Parameters

CLUSter time-series-name RADIus real [ MAXCluster int ] -
        [ MAXIteration int ] [ MAXError real ] -
        [ NFEAture int ] [ UNICluster int ] -
        [ UNIMember int ] [ UNIInitial int ] -
        [ CSTEP int ] [ BEGIn int ] -
        [ STOP int ] [ ANGLE ]

 1. time-series-name: The name of the first time series (as defined by the
    ENTE command) to be clustered.

 2. RADIus: Maximum radius of a cluster, i.e., the rms cutoff or threshold
    for assignment to a cluster.

 3. MAXCluster: Maximum number of clusters (default = 50).

 4. MAXIteration: The maximum number of iterations allowed. If the clustering
    has not converged by this number of iterations, all clusters are output
    (default = 20).

 5. MAXError: If the rms difference between the position of the cluster
    centers for the last two iterations is less than MAXError, the system is
    considered converged and the clustering is halted (default = 0.001).

 6. NFEAture: This variable gives the number of features in the input
    pattern, that is, the number of time series to be clustered at a time.
    NFEAture time series are clustered, starting with 'time-series-name' and
    continuing with the next NFEAture-1 series specified in subsequent 'ENTE'
    commands (default = veccod of time-series-name).

 7. UNICluster: The unit number of the output cluster file. If UNIC = -1
    (the default), the cluster parameters are output to the standard output.

 8. UNIMember: The unit number of the output membership file. This file lists
    each time point and the cluster(s) associated with the specified time
    series at that time point.
    If UNIM = -1 (the default), the membership list is not output.

 9. UNIInit: The unit number of the file with the initial cluster centers.
    If UNII = -1 (the default), no initial cluster centers are specified.

10. CSTEp: This variable gives the spacing between time series in the input
    vector. For each timepoint k, the set of patterns clustered is
       tq(k,1) -> tq(k,nfeature),
       tq(k,1 + cstep) -> tq(k,nfeature + cstep),
       ...,
       tq(k,nserie - nfeature + 1) -> tq(k,nserie)
    (default = nfeature).

11. BEGIn: Indicates the frame in the time series where clustering begins
    (default = 1).

12. STOP: Indicates the frame in the time series where clustering ends
    (default = minimum length (TOTAl in SHOW) of time series).

13. ANGLe: A logical flag which when true specifies that angle data is to be
    clustered, taking angle periodicity into account (default = .FALSE.).

        Caveats

The clustering algorithm is initial-guess dependent, i.e., it is dependent on
the input order of the patterns. The order of presentation in CLUSter is
simply the consecutive frames of the time series. To check for stable
clustering, cluster centers can be calculated from time series with the time
frames randomized. This is not currently implemented in CHARMM, so the user
will have to write a set of time series to a file and then randomize row
position outside of CHARMM.

It is relatively straightforward to compare features derived from similar
measures (i.e., time series with the same "class codes", for example all
DIHE/GEOM). In some applications it may be desired to "mix" units in the
pattern, for example, to cluster a set of time series derived from both
atomic positions and energies. How best to compare "apples & oranges" is a
problem from measurement theory, and is application-specific. Normalizing the
variables such that they have unit variance is one possibility, and this can
be done by 1) determining the standard deviation of the time series (FLUC
given by the SHOW command), and 2) using this value in the MANTim DIVI
command.
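The unit-variance scaling described above can be sketched outside CHARMM as
follows. This is only an illustration in Python, not part of CORREL: the
function names are hypothetical, the sample numbers are made up, and treating
FLUC as a population standard deviation is an assumption.

```python
import math

def fluctuation(series):
    """Population standard deviation of one time series.

    Assumption: this plays the role of the FLUC value reported by SHOW.
    """
    mean = sum(series) / len(series)
    return math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))

def scale_to_unit_variance(series):
    """Divide a series by its standard deviation (the analogue of MANTim DIVI)."""
    sd = fluctuation(series)
    if sd == 0.0:
        raise ValueError("a constant series cannot be scaled to unit variance")
    return [x / sd for x in series]

def pattern_distance(p, q):
    """Euclidean distance between two patterns (one value per feature)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two features with very different units (illustrative values only).
dihedral = [-60.0, -55.0, 170.0, 175.0]      # degrees
energy = [-12.1, -11.9, -12.4, -12.0]        # kcal/mol

scaled = [scale_to_unit_variance(s) for s in (dihedral, energy)]
for name, s in zip(("dihedral", "energy"), scaled):
    print(name, "fluctuation after scaling:", round(fluctuation(s), 6))

# Distance between the patterns at frames 0 and 2, before and after scaling.
raw = pattern_distance((dihedral[0], energy[0]), (dihedral[2], energy[2]))
new = pattern_distance((scaled[0][0], scaled[1][0]), (scaled[0][2], scaled[1][2]))
print("raw distance:", raw, "scaled distance:", new)
```

Before scaling, the distance is dominated by the dihedral feature simply
because its numerical range is larger; after scaling, both features
contribute on a comparable footing.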
Since only differences between features are used in the clustering algorithm,
shifting the time series to zero mean is not necessary. Duda & Hart have a
good discussion of the issues involved in clustering and normalization:

Duda, R. O., & Hart, P. E., Pattern Classification and Scene Analysis, Wiley,
New York, pp. (1973).

        Cluster Output

The following data are output to UNIC for each cluster:

   Cluster Index - The clusters are numbered starting with 1.

   No. of Members - Number of patterns assigned to the cluster.

   Cumulative No. of Members - The total number of patterns within the
      cluster radius. This can be higher than the No. of Members due to
      patterns being within the maximum radius of more than one cluster.

   Standard Deviation of Patterns within Cluster - For cluster j with the
      number of features = Nfeature, this is

         sqrt(sum((tq(k,i) - b(j,i))**2)/(Nfeature*N(j)))

      where the sum is over i = 1, Nfeature and over all k such that tq(k)
      is a member of j. N(j) = the number of members in cluster j. Note that
      b(j,i) = <tq(k,i)> (averaged over k in cluster j).

   Maximum Distance - The longest distance between the cluster center and an
      assigned pattern, normalized by sqrt(Nfeature).

   Cluster Centers - (b(j,i), i = 1, Nfeature)

The following data are output to UNIM:

   Cluster index of the assigned cluster
   Time series time step
   Time series index of first time series in pattern
   Distance of pattern from cluster center, normalized by sqrt(Nfeature)


File: Correl, Node: IO, Up: Top, Next: Examples, Previous: Cluster

        Input/Output of time series and correlation functions.

1) The SHOW command

        { ALL                  }
   SHOW { time-series-name     }
        { CORRelation-function }

The SHOW command displays on the print unit various data regarding the
specified time series. This command is automatically run after the ENTER and
EDIT commands as a verification of the last action.
2) The READ command

   READ { time-series-name  } unit-spec edit-spec { [FILE]            }
        { CORRelation-funct }                     { CARD              }
                                                  { DUMB [COLUmn int] }

The READ command allows a time series or correlation function to be directly
read. The file formats for time series and correlation functions are
identical. There are three basic methods by which time series may be read:
FILE (default), CARD, and DUMB. The FILE and CARD options expect a file of
specific type generated by the corresponding WRITE command. The DUMB option
will read a free field card file with NO title or other header. The COLUmn
option (default 1) may be specified to start reading the time series from any
specified column. The DUMB option will usually include some edit
specifications to properly set the time steps (etc.).

3) The WRITe command

         { ALL                  }            { [FILE]        }
   WRITe { time-series-name     } unit-spec  { CARD          }
         { CORRelation-function }            { PLOT          }
                                             { DUMB [ TIME ] }

The WRITe command will write out time series or a correlation function. All
of the write options expect a title to follow this command. There are several
file formats: FILE (default), CARD, PLOT, and DUMB. The FILE and CARD options
will write out all data regarding the specified time series with the
expectation of later retrieval by CHARMM or another program. The PLOT option
will create a BINARY file for plotting by PLT2. The first line of the title
is used as the plot title, but this may be reset in PLT2. The DUMB option
will simply write out the values with no title or header to a card file, one
value to a line. If the TIME option is specified, the time value will precede
the time series values (as needed for an X-Y plot). If the time series is a
vector type, then 3 values will be given on each line. The DUMB option is
useful for making plot files, or for feeding the data to other programs.

With the EDIT command, a user may merge 3 separate sequential time series
into a vector time series (or the reverse).
In fact any number of time series may be grouped together with this option.
For example, if a table with 5 time series is desired, setting VECCOD to 5
for the first one and then writing this time series will output all 5.


File: Correl, Node: Examples, Up: Top, Previous: IO

        Examples

These examples are meant to be a partial guide in setting up input files for
CORREL. The test cases may be examined for a wider set of applications.

Example (1)

   CORREL MAXSERIES 1 MAXTIMESTEPS 500 MAXATOMS 5
   ENTER AAAA TORSION MAIN 28 N MAIN 28 CA MAIN 28 C MAIN 29 N GEOMETRY
   TRAJECTORY FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10
   MANTIME AAAA DAVER
   WRITE AAAA UNIT 20 DUMB TIME
   * title
   *
   WRITE AAAA CARD UNIT 10
   * title for card
   * file containing the time series
   *
   CORFUN AAAA AAAA FFT NLTC P0
   WRITE CORREL UNIT 21 DUMB TIME
   * title
   *
   WRITE CORREL FILE UNIT 11
   * title for binary correlation function
   *

   Extracts the time series, PHI(t), for phi dihedral of residue 28.
   Makes the time series the fluctuation from the mean, delta PHI(t).
   Makes a plot file of delta PHI(t) vs. time.
   Makes a card file of delta PHI(t).
   Calculates C(t) = <dPHI(0)dPHI(t)> / <dPHI(0)**2> by FFT with no long
      tail correction.
   Makes a plot file of C(t) vs. t.
   Makes a binary file of C(t).

Example (2)

   CORREL MAXSERIES 2 MAXTIMESTEPS 500 MAXATOMS 10
   ENTER PHI TORSION MAIN 27 C MAIN 28 N MAIN 28 CA MAIN 28 C GEOMETRY
   ENTER PSI TORSION MAIN 28 N MAIN 28 CA MAIN 28 C MAIN 29 N GEOMETRY
   TRAJECTORY FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10
   MANTIME PHI DAVER
   MANTIME PSI DAVER
   CORFUN PHI PSI FFT NLTC P0 NONORM
   WRITE CORREL FILE UNIT 11
   * title for cross correlation binary file
   *
   WRITE CORREL PLOT UNIT 12
   * plot title
   *

   Extracts the time series PHI(t), for phi dihedral, and PSI(t), for the
      psi dihedral, of residue 28.
   Makes the time series the fluctuation from the mean.
   Calculates C(t) = <dPHI(0)dPSI(t)> by FFT with no long tail correction.
   Makes a binary file of C(t).
   Makes a binary PLT2 file for plotting.

Example (3) Fluorescence Depolarization, for example

   CORREL MAXSERIES 6 MAXTIMESTEPS 500 MAXATOMS 8
   ENTER V1 VECTOR XYZ MAIN 28 NE1 MAIN 28 CZ3 MAIN 28 NE1 MAIN 28 CE3
   ENTER V2 VECTOR XYZ MAIN 28 CD1 MAIN 28 CH2 MAIN 28 CD1 MAIN 28 CZ3
   TRAJECTORY FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10
   MANTIME V1 NORMAL
   MANTIME V2 NORMAL
   SHOW ALL
   CORFUN V1 V2 FFT P2
   WRITE CORREL PLOT UNIT 21
   * title for plot
   *

   Extracts the time series, consisting of the average of the vectors
      NE1 - CZ3 and NE1 - CE3 == V1(t) and of the average of CD1 - CH2 and
      CD1 - CZ3 == V2(t).
   Makes V1(t) and V2(t) unit vectors.
   Displays data regarding both time series.
   Calculates P2(t) = (3< (V1(0)*V2(t))**2 > - 1) / 2
   Makes a binary plot file for PLT2.

CHARMM Element doc/crystl.doc 1.1


File: Crystl, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

        Calculations on Crystals using CHARMM

The crystal section within CHARMM allows calculations on crystals to be
performed. It is possible to build a crystal with any space group symmetry,
to optimise its lattice parameters and molecular coordinates and to carry out
a vibrational analysis using the options.
* Menu:

* Syntax::              Syntax of the CRYSTAL command
* Function::            A brief description of each command
* Examples::            Sample testcases
* Implementation::      Background and implementation


File: Crystl, Node: Syntax, Up: Top, Next: Function

[Syntax CRYStal command]

CRYStal [BUILd_crystal] [CUTOff real] [NOPErations int]
        [DEFIne xtltyp a b c alpha beta gamma]
        [FREE]
        [PHONon] [NKPOints int] [KVECtor real real real TO real real real]
        [VIBRation]
        [READ]   [CARD UNIT int]
                 [PHONons UNIT int]
        [PRINt]
        [PRINt]  [PHONons] [FACT real] [MODE int THRU int] [KPTS int TO int]
        [WRITe]  [CARD UNIT int]
                 [PHONons UNIT int]
                 [VIBRations] [MODE int THRU int] [UNIT int]

xtltyp ::= { CUBIc           }
           { TETRagonal      }
           { ORTHorhombic    }
           { MONOclinic      }
           { TRIClinic       }
           { HEXAgonal       }
           { RHOMbohedral    }
           { OCTAhedral/trnc }
           { RHDO            }

a b c alpha beta gamma ::= (six real numbers)

The crystal module is an extension of the image facility within the CHARMM
program. All crystal commands are invoked by the keyword CRYStal. The next
word on the command line can be one of the following :

   Build     - builds a crystal.
   Define    - defines the lattice type and constants of the crystal to be
               studied.
   Free      - clears the crystal and image facility.
   Phonon    - calculates the crystal frequencies for a single value or a
               range of values of the wave vector, KVEC.
   Print     - prints various crystal information.
   Read      - reads the crystal image file.
   Vibration - calculates the harmonic crystal frequencies when the wave
               vector is the zero vector.
   Write     - writes out to file various crystal information.


File: Crystl, Node: Function, Previous: Syntax, Up: Top, Next: Examples

A brief description of each command follows.

1. Crystal Build.

A crystal of any desired symmetry can be constructed by repeatedly applying a
small number of transformations to an asymmetric collection of atoms (called
here the primary atoms).
The transformations include the primitive lattice translations A, B and C
which are common to all crystals and a set of additional transformations,
{T}, which determines the space group symmetry. The Build command will
generate, given {T}, a data structure of all those transformations which
produce images lying within a user-specified cutoff distance of the primary
atoms. The data structure can then be used by CHARMM to represent the
complete crystal of the system in subsequent calculations. The symmetry
operations, {T}, are read from the lines following the Crystal Build command.
The syntax of the command is :

    Crystal Build Cutoff real Noperations int
    ... lines defining the symmetry operations.

The Cutoff parameter is used to determine the images which are included in
the transformation list. All those images which are within the cutoff
distance are included in the list. There is no limit to the number of
transformations included in the lists as they are allocated dynamically.

The crystal symmetry operations are input in standard crystallographic
notation. The identity is assumed to be present so that (X,Y,Z) need not be
specified (in fact, it is an error to do so). For example, a P1 crystal is
defined by the identity operation and so the input would be

    Crystal Build .... Noper 0

whilst a P21 crystal would need the following input lines :

    Crystal Build .... Noper 1
    (-X,Y+1/2,-Z)

A P212121 crystal is specified by

    Noper 3
    (X+1/2,-Y+1/2,-Z)
    (-X,Y+1/2,-Z+1/2)
    (-X+1/2,-Y,Z+1/2)

It should be noted that in those cases where the atoms in the asymmetric unit
have internal symmetry, or in which a molecule is sited upon a symmetry point
within the unit cell, not all symmetry transformations for the crystal need
to be input. Some will be redundant. It is up to the user to check for these
cases and modify the input accordingly.

2. Crystal Define.

The define command defines the crystal-type on which calculations are to be
performed.
It is usually the first crystal command that is specified in any job using
the crystal facility. It has the format :

    Define lattice-type a b c alpha beta gamma

The input lattice parameters are checked against the lattice-type to ensure
that they are compatible. Nine lattice types are permitted. They are listed
below along with any restrictions on the lattice parameters :

CUBIc        - a = b = c and alpha = beta = gamma = 90.0 degrees.
               (example: 50.0 50.0 50.0 90.0 90.0 90.0 )
               (volume = a**3)
               (degrees of freedom = 1)

TETRagonal   - a = b and alpha = beta = gamma = 90.0 degrees.
               (example: 50.0 50.0 40.0 90.0 90.0 90.0 )
               (volume = c*a**2)
               (degrees of freedom = 2)

ORTHorhombic - alpha = beta = gamma = 90.0 degrees.
               (example: 50.0 40.0 30.0 90.0 90.0 90.0 )
               (volume = c*b*a)
               (degrees of freedom = 3)

MONOclinic   - alpha = gamma = 90.0 degrees.
               (example: 50.0 40.0 30.0 90.0 70.0 90.0 )
               (volume = c*b*a*sin(beta) )
               (degrees of freedom = 4)

TRIClinic    - no restrictions on a, b, c, alpha, beta or gamma.
               (example: 50.0 40.0 30.0 60.0 70.0 80.0 )
               (volume = c*b*a*sqrt(1.0 - cos(alpha)**2 - cos(beta)**2
                         - cos(gamma)**2 + 2.0*cos(alpha)*cos(beta)*cos(gamma)) )
               (degrees of freedom = 6)

HEXAgonal    - a = b, alpha = beta = 90.0 degrees and gamma = 120.0
               (example: 40.0 40.0 120.0 90.0 90.0 120.0 )
               (volume = sqrt(0.75)*c*a**2 )
               (degrees of freedom = 2)

RHOMbohedral - a = b = c ; alpha=beta=gamma<120 (trigonal)
               (example: 40.0 40.0 40.0 67.0 67.0 67.0 )
               (volume = a**3*(1.0-cos(alpha))*sqrt(1.0+2.0*cos(alpha)) )
               (degrees of freedom = 2)

OCTAhedral   - a = b = c, alpha = beta = gamma = 109.4712206344907
               (a.k.a. truncated octahedron)
               (example: 40.0 40.0 40.0 109.471220634 109.471220634 109.471220634 )
               (volume = 4*sqrt(3)/9 * a**3 )
               (truncated cube length = a * sqrt(4/3) )
               (degrees of freedom = 1)

RHDO         - (Rhombic Dodecahedron)
               a = b = c, alpha = gamma = 60.0 and beta = 90.0
               (example: 40.0 40.0 40.0 60.0 90.0 60.0 )
               (volume = sqrt(0.5) * a**3 )
               (truncated cube length = a * sqrt(2) )
               (degrees of freedom = 1)
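The volume expressions listed above are all special cases of the TRIClinic
formula. As a cross-check, the following Python sketch (not part of CHARMM;
the function and variable names are illustrative) evaluates the general
formula with the example parameters of each lattice type and prints it next
to the closed form quoted in the list.

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Cell volume from lengths and angles in degrees (general triclinic formula)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

# Truncated-octahedron angle, acos(-1/3) expressed in degrees.
OCTA = math.degrees(math.acos(-1.0 / 3.0))
COS67 = math.cos(math.radians(67.0))

# (lattice type, example parameters from the list, closed-form volume from the list)
CASES = [
    ("CUBIc", (50.0, 50.0, 50.0, 90.0, 90.0, 90.0), 50.0**3),
    ("TETRagonal", (50.0, 50.0, 40.0, 90.0, 90.0, 90.0), 40.0 * 50.0**2),
    ("ORTHorhombic", (50.0, 40.0, 30.0, 90.0, 90.0, 90.0), 30.0 * 40.0 * 50.0),
    ("MONOclinic", (50.0, 40.0, 30.0, 90.0, 70.0, 90.0),
     30.0 * 40.0 * 50.0 * math.sin(math.radians(70.0))),
    ("HEXAgonal", (40.0, 40.0, 120.0, 90.0, 90.0, 120.0),
     math.sqrt(0.75) * 120.0 * 40.0**2),
    ("RHOMbohedral", (40.0, 40.0, 40.0, 67.0, 67.0, 67.0),
     40.0**3 * (1.0 - COS67) * math.sqrt(1.0 + 2.0 * COS67)),
    ("OCTAhedral", (40.0, 40.0, 40.0, OCTA, OCTA, OCTA),
     4.0 * math.sqrt(3.0) / 9.0 * 40.0**3),
    ("RHDO", (40.0, 40.0, 40.0, 60.0, 90.0, 60.0), math.sqrt(0.5) * 40.0**3),
]

for name, params, closed_form in CASES:
    general = cell_volume(*params)
    print(f"{name:13s} general = {general:12.3f}   closed form = {closed_form:12.3f}")
```

The two columns should agree to within floating-point round-off, which is a
quick way to confirm that a set of lattice parameters matches the intended
lattice type before it is passed to Crystal Define.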
It is up to the user to ensure that the lattice parameters have the desired
values for the system at all times. The values are stored by the program but,
at present, there is no way of transmitting this information between jobs.
For example, if the lattice parameters have been changed during a lattice
optimisation then the new parameters, which are printed out at the end of the
minimization, must be input here at the beginning of the next CHARMM run.
Lattice parameters are stored in dynamic trajectory and restart files.

3. Crystal Phonon.

Phonon calculates the dispersion curves for a crystal. Any value of the
wavevector can be used (although, in practice, each component of KVEC is
normally limited to the range -0.5 to +0.5). The dynamical matrix and normal
mode eigenvectors determined in the phonon calculation are complex although
the eigenvalues remain real. The syntax for the command is :

    Crystal Phonon Nkpoints int Kvector real real real To real real real

Nkpoints tells the program the number of points at which the derivative
matrices must be built and diagonalised whilst the Kvector ... To ... clause
determines the values of KVEC for each calculation. Thus,

    Kvector 0.0 0.0 0.0 To 0.5 0.5 0.5 Nkpoints 3

would solve for the crystal frequencies at the points KVEC=(0.0,0.0,0.0),
(0.25,0.25,0.25) and (0.5,0.5,0.5). If desired, point calculations can be
carried out by omitting the To statement and putting Nkpoints 1. For single
calculations at KVEC=(0.0,0.0,0.0) the Crystal Vibration command is faster.

The eigenvalues and eigenvectors at each value of the wave vector from the
phonon calculation are saved and they can be written out to a file using the
Crystal Write Phonon command. No analysis facilities exist within CHARMM for
the phonon data structure as the eigenvectors are complex.

It is to be noted that phonon and vibration calculations can only be
performed on crystals of P1 symmetry. No information about the symmetry
operations is used when generating the dynamical matrix.

4. Crystal Print.

Two options exist with the Print command. If no keyword is given then the
crystal image file is printed out. The Crystal Print Phonon command performs
a similar function to the Print Normal_Modes command in the vibrational
analysis facility. Selected frequencies and eigenvectors for a range of
values of the wave vector can be printed out. The syntax is :

    Crystal Print Phonon Kpoints int To int Modes int Thru int Factor real

The Kpoints .. To .. clause determines the wave-vectors at which the modes
are to be printed, the Modes .. Thru .. clause gives the range of the
eigenvectors and the Factor keyword gives the scale factor by which each
normal mode is multiplied.

5. Crystal Read.

The Crystal Read command reads in a crystal image file. The file has the same
format as that produced by the Crystal Print or Crystal Write commands. The
command is useful if a crystal image file was produced using the Crystal
Build command and saved using the Crystal Write command in a previous job and
it is desired to reuse the same transformation file for analysis or
comparison purposes. The command can also be used to read in limited sets of
transformations if specific crystal interactions need to be investigated. The
transformation file is formatted so the Card keyword needs to be specified
and the unit number must be given after the Unit keyword.

6. Crystal Vibration.

For a free molecule with N atoms the dynamical equations have 3N-6 non-zero
eigenvalues. This is no longer so for a crystal. If a crystal is made up of L
unit cells each containing Z molecules with N atoms, the dynamical equations
would have a dimension of 3NZL. However, using the symmetry properties of the
lattice it is possible to factor the equations into L sets each with a
dimension of 3NZ and each depending upon a vector, KVEC, which labels the
irreducible representation of the translation group to which the set belongs.
The force constant matrix is complex. Its form may be found in the references
given at the end of the documentation.
Vibration solves the dynamical equations for the case where the wave-vector
is zero, i.e. when the equations are real. The procedure is invoked by the
Crystal Vibration command. The syntax is :

    Crystal Vibration

7. Crystal Write.

There are three Crystal Write options. If no keyword is given the crystal
image file is written out, in card format, to the specified unit. The CARD
and UNIT keywords are required. The Crystal Write Phonon command writes out
the phonons from a phonon calculation. All the eigenvalues and eigenvectors
for all values of the wavevector that are stored are written automatically.
The Crystal Write Vibration command writes out the eigenvalues and
eigenvectors from a vibration calculation. The modes to be written are given
by the Mode .. Thru .. clause. All Write commands require that the Fortran
stream number be given after the Unit keyword and a CHARMM title may be
specified on the following lines.

The structure of the phonon and vibration files for a crystal may be found by
looking at the routines WRITDC and XFRQW2 respectively in the file
[.IMAGE]XTLFRQ.SRC. The vibration modes are written in the same form as for a
VIBRAN normal mode file and may be read in using the appropriate VIBRAN
commands. Unfortunately no analysis facilities exist for complex eigenvectors
within CHARMM and so users will have to write their own if they want to
perform phonon calculations.

8. Crystal Minimization.

It is possible to perform a lattice minimization using the normal CHARMM
MINImize command and the ABNR minimizer. Two extra keywords have been
introduced. If neither of them is present then a coordinate minimization is
performed as usual. If LATTice is specified then the lattice parameters and
the atomic coordinates are minimized together. If NOCOordinates is given with
the keyword LATTice then only the lattice parameters are optimised.
Specifying NOCOordinates by itself is an error.
It should be noted that when the lattice is being optimised the crystal
symmetry is maintained. A cubic crystal will remain cubic, etc.


File: Crystl, Node: Examples, Previous: Function, Up: Top, Next: Implementation

Examples of input may be found in the test directory. All crystal files are
prefixed by the string "xtl_". All the jobs involve L-Alanine. Briefly the
jobs are :

1. XTL_ALA1.INP.
The crystallographic fractional coordinates are read in and converted to real
space coordinates using the CHARMM COORdinate CONVert command and the
experimental values for the lattice parameters.

2. XTL_ALA2.INP.
A crystal image file is generated for the crystal using a value of 10.0
Angstroms for the crystal cutoff.

3. XTL_ALA3.INP.
A coordinate and lattice minimization are performed for the crystal. The
crystal image file from the previous job is used and the optimised
coordinates are saved. The main point to note is that before using the
crystal package for energy calculations and other manipulations that involve
the image non-bond lists an image update must be performed. For safety always
do an update after building or reading in the crystal. Note too that the new,
optimised lattice parameters are used in all the subsequent input files.

4. XTL_ALA4.INP.
For subsequent calculations a coordinate file that contains the coordinates
of all atoms (four molecules of L-alanine) is generated. A crystal image file
suitable to do this is read in directly from the input stream. It contains 6
transformations (not 3 as might be expected) because the CHARMM image
facility requires that the inverses of all transformations be present. The
first three are the ones needed and the last three are their inverses. An
update is needed after reading the file to make known to the program the
coordinates of the atoms in the first transformation of all the inverse pairs
in the image list.
The Print Coor Image command will then print out the coordinates of the atoms
in the original asymmetric unit and the first three of the images. If the
coordinates of the atoms in all the images are required then the keyword
NOINV in the UPDATE command must be used (check IMAGE.DOC).

5. XTL_ALA5.INP.
The same job as the second except that the crystal is generated for a whole
unit cell (i.e. the system generated in the fourth job). The same value of
the crystal cutoff is used. An energy is calculated too. The energy and its
RMS coordinate derivative should be exactly four times (apart from a small
round-off error) the value obtained for an energy calculation on a single
asymmetric unit with the same lattice parameters and crystal cutoff (see
job 3).

6. XTL_ALA6.INP.
Perform a crystal vibration and phonon calculation for the optimised
structure of the L-alanine crystal. The vibrational and phonon modes are
written out to files and components of the first 24 phonon normal modes for
the three values of the wavevector that were calculated are printed. To do
the same for the vibrations it would be necessary to use the appropriate
VIBRAN commands in another job.


File: Crystl, Node: Implementation, Previous: Examples, Up: Top

        Background and Implementation.

The Crystal options and their commands were described above. The present
section discusses relevant background material and briefly reviews the
methods used in the implementation. Some technical points are also made.

The crystal option is an extension to the CHARMM program. The source code is
in the directory [.IMAGE] whilst the crystal data structure is in the file
IMAGE.FCM. Two additional source code files have been added - CRYSTL.SRC and
XTLFRQ.SRC. Small modifications have been made to the files ENERGY.SRC and
EIMAGE.SRC.

        CHARMM Images and the Crystal Image Data Structure.
As outlined above a crystal structure can be specified entirely by the action of the primitive translations A, B and C, and a small set of transformations, {T} (which themselves are functions of A, B and C), on an asymmetric group of atoms. In CHARMM the calculation of the energy assumes that there exists a cutoff distance beyond which all interactions between particles are neglected so that when performing calculations on supposedly infinite crystals only a limited portion of that crystal, i.e. that portion containing those atoms within the cutoff distance of the primary atoms, need be considered. The CHARMM image option, of course, already enables the energies of crystals to be calculated but the input required to use it to do so is cumbersome and time consuming. It is a great simplification to include an extra data structure that defines the crystal in terms of A, B and C and {T}. There are a number of advantages: 1. A crystal is regular so that its generation can be automated. All that needs to be done is to systematically transform the primary atoms by one of the set {T} and a linear combination of A, B and C. The result is obviously best stored in terms of A, B, and C rather than as absolute numerical values of the transformations. 2. It is essential to define a CHARMM crystal by A, B and C and {T} if the lattice parameters a, b, c, alpha, beta and gamma are to be varied because the coordinates of all the image atoms within the crystal will change during successive cycles of the optimisation as a, b, c, alpha, beta and gamma themselves change. 3. When constructing the dynamical matrix for a non-zero wave-vector it is necessary to know the unit cell to which a particular atom belongs in order to evaluate the exponential factor in the expression. 
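The first advantage can be illustrated with a short sketch: an image is
obtained by applying one operation from {T} and then adding an integer
combination of A, B and C. The Python below is a hypothetical illustration,
not CHARMM code. It assumes fractional coordinates and handles only
operations of the simple form used in this document, such as (-X,Y+1/2,-Z);
the names parse_symop and apply_image are invented for the example.

```python
import re
from fractions import Fraction

# One component: optional sign, an axis letter, optional +/- shift such as +1/2.
_TERM = re.compile(r"^([+-]?)([XYZ])(?:([+-])(\d+)(?:/(\d+))?)?$")

def parse_symop(text):
    """Parse '(-X,Y+1/2,-Z)' into three (axis, sign, shift) tuples."""
    parts = text.strip().strip("()").split(",")
    if len(parts) != 3:
        raise ValueError("expected three components: " + text)
    op = []
    for part in parts:
        m = _TERM.match(part.strip().upper())
        if m is None:
            raise ValueError("unsupported component: " + part)
        sign = -1 if m.group(1) == "-" else 1
        axis = "XYZ".index(m.group(2))
        shift = Fraction(0)
        if m.group(3):
            shift = Fraction(int(m.group(4)), int(m.group(5) or 1))
            if m.group(3) == "-":
                shift = -shift
        op.append((axis, sign, shift))
    return op

def apply_image(op, frac, cell=(0, 0, 0)):
    """Apply operation op, then the lattice translation cell = (na, nb, nc).

    All quantities are in fractional coordinates (an assumption of this sketch).
    """
    return tuple(sign * frac[axis] + shift + n
                 for (axis, sign, shift), n in zip(op, cell))

# The P21 screw axis from the Crystal Build example.
screw = parse_symop("(-X,Y+1/2,-Z)")
atom = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 4))
image = apply_image(screw, atom)                     # same unit cell
neighbour = apply_image(screw, atom, (0, -1, 0))     # shifted by -B
print(image, neighbour)
```

Storing each image as an operation index plus three integers, as the crystal
data structure does, means the images stay valid when the lattice parameters
change: only the conversion from fractional to real-space coordinates has to
be redone.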
Although the crystal data structure and the values of the lattice parameters
define the crystal, the individual transformations have to be worked out
explicitly in order to determine energies, harmonic frequencies and so on. In
the present version of the program the IMAGE facility is used, so that a new
set of IMAGE transformations is calculated from the crystal data structure as
soon as a crystal is built or every time the lattice parameters are changed.
The use of the IMAGE facility means that the number of transformations that
can be used is determined by the dimension of the IMAGE arrays (MAXTRN in
DIMENS.FCM).

        Crystal and Image Patching.

Crystal image patching is unavailable in the present version of the program
so that bonds between images are not permitted. Similarly hydrogen-bond
interactions described by an explicit hydrogen-bond function are also
forbidden. The only forces that can be calculated between primary and image
atoms are non-bonded ones.

        The Lattice Coordinate System.

WARNING: If your system is not properly rotated, there will usually be bad
contacts. If you have bad contacts, check the alignment.

The convention used by CHARMM for orientating the crystal in real space
involves the use of a symmetric transformation (h) matrix. For
non-orthorhombic systems, these coordinates are different (rotated) from the
aligned convention used by PDB and others. The conversion is performed by the
COOR CONVert command, see *note convert:(chmdoc/corman.doc)Syntax.

        The Structure of the Crystal File.

The crystal file is divided into three parts.

   A standard CHARMM title.

   A symmetry operation declaration section headed by the word Symmetry and
   terminated by an End. The transformations are written in the same way as
   for the Crystal Build command except that the identity transformation has
   to be explicitly listed.

   An image section headed by Images and terminated by an End.
   Here the images are defined in terms of the symmetry transformations and
   the lattice translations A, B and C. The comment line shows the column
   labelling.

Sometimes it is useful to write one's own crystal files without recourse to
the Crystal Build option. In this case the symmetry and image blocks can be
put in any order (although only one of each is allowed) and there is no
restriction on the positioning of blank and comment lines.

Two examples of a crystal file are:

    * Crystal file for a P1bar crystal.
    *
    Symmetry
    (X,Y,Z)
    (-X,-Y,-Z)
    End

    Images
    ! Operation   a   b   c
          1       0   0  -1
          1       0   0   1
          2       0   0   0
    End

    * Crystal file for a P212121 crystal.
    *
    Symmetry
    (X,Y,Z)
    (X+1/2,-Y+1/2,-Z)
    (-X,Y+1/2,-Z+1/2)
    (-X+1/2,-Y,Z+1/2)
    End

    Images
    ! Operation   a   b   c
          2       0   0   0
          3       0   0   0
          4       0   0   0
          2      -1   0   0
          3       0  -1   0
          4       0   0  -1
    End

        Second Derivative Calculations and the Use of Symmetry.

Consider a crystal with a unit cell in which there is more than one
asymmetric unit (i.e. all space groups other than P1). The dynamical matrix
then takes a blocked form, with Z**2 blocks if Z is the number of asymmetric
units. Each block is of dimension 3N x 3N and contains the sum over all unit
cells of the second derivative interaction elements between the Mth and Nth
asymmetric units. It is possible to calculate only the Z blocks (11), (12),
..., (1M), ..., (1Z) and then transform them to produce the full matrix. In
the present program, however, it is necessary to perform vibration
calculations on entire unit cells.

It should be emphasised that while this symmetry transformation can be used
for calculations of the normal mode eigenvectors and frequencies for the zero
wavevector it does not hold at other values of the wavevector. Therefore,
simple symmetry arguments such as these do not hold for phonon calculations.
Symmetry can also be used to block the dynamical matrix into several
smaller matrices, each corresponding to a different symmetry species,
thereby greatly reducing the time needed for diagonalisation and
automatically helping to identify the normal modes. Symmetry blocking is
not coded at the moment.

References.

1. "Lattice Dynamics of Molecular Crystals", Lecture Notes in Chemistry
   26, S.Califano, V.Schettino and N.Neto (1981), Springer-Verlag,
   Berlin, Heidelberg and New York. A comprehensive monograph with good
   sections on the theory of lattice vibrations and normal mode
   symmetries.

2. A.Warshel and S.Lifson, J.Chem.Phys. (1970), 53, 582. The original
   CFF paper on crystal calculations. It describes the theory behind
   crystal optimisations and vibrational calculations.

3. E.Huler and A.Warshel, Acta Cryst. (1974), B30, 1822. An extension of
   the work in reference 2.

4. "Infrared and Raman Spectra of Crystals", G.Turrell (1972), Academic
   Press, London and New York. A nice clear introduction to the subject.

CHARMM Element doc/developer.doc 1.1


File: Develop, Node: Top, Up: (chmdoc/charmm.doc), Previous: (chmdoc/testcase.doc), Next: (chmdoc/changelog.doc)

                        CHARMM Developer Guide

This guide is for anyone who wants to understand how CHARMM is
implemented, and it sets out a variety of rules that should be followed
by anyone who wishes to modify it. Anyone who wishes to modify CHARMM is
advised to read through everything in this document.

* Menu:

* Implement::    CHARMM Implementation and Management
* Directories::  What directories are used to store what information
* Standards::    Standards (rules) for writing CHARMM code
* Tools::        Tools for CHARMM developers
* Modify::       The procedure for modifying anything in CHARMM
* Document::     How to document CHARMM commands and features
* Checkin::      How to deposit your development version into the
                 central library


File: Develop, Node: Implement, Up: Top, Previous: Top, Next: Directories

                CHARMM Implementation and Management

CHARMM is implemented as a single program package, which is developed on
a variety of platforms. As a result, it includes some machine specific
implementations and makes heavy use of the virtual memory capabilities.
By placing everything together, the task of modifying the program is
made more reliable because errors in modifying the program are more
likely to be noticed. The single source package concept helps us to
maintain integrity of CHARMM as the paradigmatic macromolecular research
software system running on a variety of platforms.

CHARMM was originally written in FLECS, FORTRAN77 and C languages. In
the past, before FORTRAN77, FLECS allowed us to use a variety of control
constructs, e.g., WHEN-ELSE, WHILE, UNLESS, etc. A FLECS to FORTRAN
translator was used to process FLECS source code to produce FORTRAN
source. FORTRAN77 provides some structured language constructs, so we
began to program directly in FORTRAN77. The initial project in CHARMM23
development was to convert all FLECS source code into FORTRAN. CHARMM
23f2 and later versions are fully in FORTRAN except for some machine
specific code written in the C language. All new code should be written
in FORTRAN77.

Since CHARMM version 22, all files are maintained by utilizing software
engineering tools. We use the CVS (Concurrent Versions System) utility
to maintain the CHARMM source code, the documentation and other
supporting files.
The CVS repository resides on tammy.harvard.edu (Convex C220 running
under UNIX). CVS is a superset of RCS (Revision Control System); file
and code management is carried out with CVS and RCS commands. The CHARMM
manager will be the sole owner of all CHARMM files and he will
schedule/checkin/merge contributions from both in-house and remote
developers.


File: Develop, Node: Directories, Up: Top, Previous: Implement, Next: Standards

                     CHARMM Directory Structure

CHARMM files are organized in the following directories. We use UNIX
pathnames throughout the document. ~/ is the parent directory that
contains the CHARMM main directory, ~/cnnXm. nn is the version number, X
is the version trunk designator (a for alpha or developmental, b for
beta release and c for gamma or general release) and m is the revision
number. For example, c24b1 is CHARMM version 24 beta release revision 1.

Directory            Purpose
------------------   ---------------------------------------------------
~/cnnXm              The main directory of the current CHARMM version.
                     install.com procedure runs in this directory.
~/cnnXm/source       Source and include files.
~/cnnXm/doc          Documentation
~/cnnXm/test         Testcases
~/cnnXm/toppar       Standard topology and parameter files.
~/cnnXm/support      Holds various support programs and data files for
                     CHARMM. See *note Support: (chmdoc/support.doc).
~/cnnXm/tool         Contains the preprocessor, prefx, and other CHARMM
                     processing/management tools.
~/cnnXm/build        Contains Makefile, module makefiles and the log
                     file of the install make command for each machine
                     in the subdirectory named after the machine type.
~/cnnXm/lib          Contains library files
~/cnnXm/exec         Will hold executables


File: Develop, Node: Standards, Up: Top, Previous: Directories, Next: Tools

              Standards (rules) for writing CHARMM code

Because CHARMM is implemented by a group, there are a number of
conventions which must be observed in order for the program to remain
modifiable, usable, and transferable.
The rules which have been established towards this end are listed below.

1) Gross subroutine organization:

   All INCLUDE statements are processed by the preprocessor to handle
   machine dependent INCLUDE'ing. ##INCLUDE is the preprocessor keyword.
   nn as in charmm_nn denotes the version number.

   Each subroutine should have the following structure. Note that any
   data statements come after all declarations and parameter statements,
   but before the first line of executable code.

      SUBROUTINE DOTHIS(ARG1,ARG2,....
C
C     A comment which describes the purpose of this subroutine.
C     This may include important variables and what their use is
C     to aid in understanding and modifying the routine.
C
C     A description of all passed arrays and arguments if
C     users need to call this routine.
C
##INCLUDE '~/charmm_fcm/impnon.fcm'  (required)
##INCLUDE '~/charmm_fcm/dimens.fcm'  (if dimensioned common blocks are included)
##INCLUDE '~/charmm_fcm/exfunc.fcm'  (if external functions are called)
##INCLUDE '~/charmm_fcm/number.fcm'  (if commonly used real*8 numbers are used)
C     declare all passed variables here
C
##INCLUDE '~/charmm_fcm/what_i_need.fcm'
##INCLUDE '~/charmm_fcm/more_i_need.fcm'
C
C     local
      Declarations of ALL local variables and parameters.
      .
      data statements at end of declarations.
C
C     begin
      .
      .
      Code (liberally documented through comments)
      .
      .
      END

2) All code should be written clearly. Since the code must be largely
   self-documenting, clarity should not be sacrificed for insignificant
   gains in efficiency. Variable names should be chosen with care so as
   to illustrate their purpose. Avoid using one or two letter variable
   names except for scratch variables. Comments should be used where the
   function of code is not obvious.

3) Input/Output

   a) The RDCMND routine should be used to read lines from the command
      stream. XTRANE should be called to be sure that the entire command
      line is parsed.
   b) Short outputs, messages, warnings, and errors should be sent to
      unit OUTU (accessed by ##INCLUDE '~/charmm_fcm/stream.fcm') for
      output.

   c) All non-fatal messages should state which subroutine generated
      them.

   d) PSF and parameter unformatted I/O file formats must remain upward
      compatible. Use an ICNTRL array element to indicate which version
      of CHARMM wrote the file. Such upward compatibility must be
      maintained only across production versions of CHARMM. In other
      words, a file format for the developmental version may be freely
      changed until a new version is generated, at which point all
      future versions must be able to read it.

   e) I/O of files should be possible in both card and binary format,
      and routines should exist to interconvert between the two.

   f) Use as many significant digits as needed but not more; in
      particular WRITE(OUTU,*) X should be avoided. It makes output
      unreadable and makes testing on different machines difficult.

   g) All output must be performed based on the PRNLEV value. This
      ensures that only node_0 carries out I/O on parallel platforms.
      For example,

            WRITE (OUTU,'(FORMAT)') ITEMs

      should be coded as

            IF (PRNLEV.GT.N) WRITE (OUTU,'(FORMAT)') ITEMs

      where N is an appropriate print level.

   h) PRINT, especially PRINT,* should NOT be used.

4) All error conditions must terminate with a CALL WRNDIE(...); direct
   calls to DIE should not be used; subroutine DIEWRN should be phased
   out.

5) Large or variable storage requirements must be met on the stack or
   heap. When allocating space for the stack or heap, the appropriate
   space allocation subroutine MUST be called. For example, to allocate
   J integer words off the stack, POINTER=ALLSTK(J) is not sufficient.
   One must use POINTER=ALLSTK(INTEG4(J)) to ensure proper performance
   across different machines. This also applies when freeing the space.
   The amount of space required for any purpose should NEVER be assumed.
   This is essential for the portability of CHARMM.

6) Array overflows should be checked for.
   Error checking in general should be as complete as feasible. Consider
   checking for overflows (reciprocals of very small numbers,
   exponentials of very large numbers, etc.), square roots of negative
   numbers, arccosine or arcsine of numbers of absolute value greater
   than one, etc.

7) The code should use a minimum of non-standard Fortran-77 features.
   Such features MUST be restricted to the machine dependent modules, or
   encapsulated in ##IF - ##ELSE - ##ENDIF preprocessor constructs. The
   only non-Fortran-77 features we use are the INCLUDE statement and the
   REAL*8 (and INTEGER*2) designators.

8) All common blocks are to be placed in files and INCLUDE'd into the
   program. The common blocks should have comments describing each
   variable in the common block so that new users will know what's
   there. The comments should also give clear relationships between the
   variables, so that redimensioning the common block is
   straightforward. The device on which the file resides must be given
   as ~/charmm_fcm/ where fcm stands for FORTRAN COMMON file. The common
   block files should be named with lower case and have the extension
   .fcm. Every variable in every common block must be declared within
   the FCM (INCLUDE'd) file. No Data statements should appear in FCM
   files, and variables declared in FCM's should not then be initialized
   in any other Data statement within subroutines. Moreover, a variable
   should not appear in more than one FCM if there is a possibility that
   both FCM's will be used in the same subroutine. The multiple
   declaration will result in an error.

9) Functions should NEVER be called with a CALL statement. Note that
   ENTRY points of functions are also functions. Moreover, avoid the use
   of ENTRY points.

10) The generic form of a function should be used unless there is a good
    reason not to. For example, use SQRT(DP) rather than DSQRT(DP).

11) Real constants should be defined in PARAMETER statements. The
    statement above the PARAMETER statement should declare the
    parameter.
    Only parameters defined in the following line should be declared in
    such a position. Double-precision (REAL*8) constants should be
    PARAMETERized with a D. For example:

          REAL*8 ONED, THREE, FIVE, SEVEN
          PARAMETER (ONED=1.0D0, THREE=3.0D0, FIVE=5.0D0, SEVEN=7.0D0)
          INTEGER MAXATM
          PARAMETER (MAXATM=99999)
          REAL ONES
          PARAMETER (ONES=1.0)

    Declaration and Parameter statements should not use continuation
    cards. See ~/charmm_fcm/number.fcm for frequently used numbers.

12) All routines should be up to the IMPLICIT NONE standard. This means
    that all variables and arrays, whether passed or not, must be
    declared. This is accomplished by inserting
    "##INCLUDE '~/charmm_fcm/impnon.fcm'" in each routine. The file
    ~/cnnXm/source/fcm/impnon.fcm may then be modified for testing
    purposes, but should contain only comments for normal usage or for
    machines without an IMPLICIT NONE statement. (Here ~/charmm_fcm/ is
    logically bound to the directory ~/cnnXm/source/fcm) All elements of
    common blocks MUST be declared in the appropriate common file.

13) All programming should be done in capital letters. Only comments and
    character strings may use lower case. No tabs should appear in code
    or documentation.

14) All strings must be stored in CHARACTER variables. Although integer
    and real variables will serve on some machines, this is non-standard
    and eventually causes problems in transportation.

15) For routine command parsing, the keyword parsing functions INDXA,
    GTRMA, GTRMF, GTRMI, and NEXTA4 should be used.

16) The DIMENSION statement should not be used. Neither should the PRINT
    statement.

17) Variable names longer than 10 characters should not be used. Also, 1
    and 2 letter variables should be avoided in large routines (except
    for loop count variables).

18) Precision variables should be REAL*8 (rather than DOUBLE PRECISION).

19) All variables must be initialized before first use. This may best be
    done in the routine INIALL, which is called very early in every
    CHARMM run.
20) Other coding conventions make it easier to search through text for
    particular strings using the SEARCH, fpat, or grep commands. Poorly
    placed spaces can make it very difficult to maintain code. Never put
    a space within a variable name. Here are some other examples:

           Good                 Please Avoid
     -----------------       ------------------
     GOTO                    GO TO
     CALL DOSOME(...         CALL  DOSOME(...
     ARRAY(5) = 20           CALL DOSOME (...
     ARRAY(5)=20             ARRAY (5) = 20


File: Develop, Node: Tools, Up: Top, Previous: Standards, Next: Modify

                       CHARMM Developer Tools

CHARMM is available on a variety of computational devices and we
strongly support multiplatform development efforts. CHARMM tools are
utility programs/procedures for installation, modification,
optimization, etc. In ~/cnnXm/tool, we include the preprocessor PREFX
and utility procedures for module makefile generation. The FLECS to
FORTRAN translator FLEXFORT has not been needed since CHARMM c23f2 and
has been removed from this and later distribution versions.

* Menu:

* prefx::    CHARMM Source Code Preprocessor
* makemod::  Module Makefiles and Optimization Procedure


File: Develop, Node: prefx, Up: Tools, Previous: Tools, Next: makemod

                        CHARMM Preprocessing

There is a CHARMM preprocessor, PREFX (formerly PREFLX), which reads
source files as input and produces fortran files for subsequent
compilation. The main purpose of this preprocessor is to allow a single
version of the source code to work with all platforms and compile
options.

A summary of PREFX capabilities:

   1. Allows selective compile of machine specific code
   2. Allows selected features to be not compiled (to reduce memory
      needs)
   3. Supports a size directive to allow larger (and smaller) versions.
   4. Handles the inclusion of .fcm files in a general manner
   5. Allows alternate include file directory to be specified
   6. Allows code expansion for alternate compiles (can move IFs from a
      DO loop).
   7. Allows comments on source lines following a "!"
   8. Handles the conversion to single precision (CRAY, DEC alpha,...)
   9. Identifies unwanted tabs in the source code
  10. Checks for line lengths exceeding 72 for non-comments
  11. Allows processing multiple files from a list (Macintosh version).
  12. Allows the removal of "IMPLICIT NONE" from source files.

The source files have the extension ".src" and the include files have
".fcm". These files are processed by the preprocessor (PREFX).

------------------------------------------------------------------------------
SELECTIVE COMPILATION

Conditional compilation is controlled by simple directives. The
directives all start with "##" in the first column. Global keywords
should be in upper case and have multiple letters (local keywords use a
single character or lower case). ##IF constructs can be nested (up to 40
levels).

##IF keyword(s) (match-token)    ! process code if any keyword is active.
##ELIF keyword(s) (match-token)  ! after an IF or IFN, alternate processing.
##ELSE (match-token)             ! after an IF, ELIF, or IFN, process the rest.
##ERROR 'message'                ! indicates an error within a ##IF construct
                                 ! (usually after a ##ELSE condition).
##ENDIF (match-token)            ! terminates IF, IFN, ELSE, or ELIF constructs.
##IFN keyword(s) (match-token)   ! process code if no keyword is active.

keywords::     A set of one or more keywords that may be specified in
               prefx.dat or in an ##EXPAND construct (see below).
match-token::  unique text string in parentheses; must be the same for
               each use in an ##IF ... ##ENDIF block

Example (from fcm/dimens.fcm):

      INTEGER MAXVEC
##IFN VECTOR PARVECT (maxvec_spec)
      PARAMETER (MAXVEC = 10)
##ELIF LARGE XLARGE (maxvec_spec)
      PARAMETER (MAXVEC = 4000)
##ELIF MEDIUM (maxvec_spec)
      PARAMETER (MAXVEC = 2000)
##ELIF SMALL (maxvec_spec)
      PARAMETER (MAXVEC = 2000)
##ELIF XSMALL (maxvec_spec)
      PARAMETER (MAXVEC = 1000)
##ELSE (maxvec_spec)
##ERROR 'Unrecognized size directive in DIMENS.FCM.'
##ENDIF (maxvec_spec)

When multiple keywords are specified, an "OR" condition is implied. If
an "AND" condition is required, use a nested ##IF construct.
In the example above, MAXVEC will not be 10 if either VECTOR or PARVECT
is specified.

The text ".not." may be added before a keyname to test for its inverse.
For example, the following constructs are equivalent:

##IFN BLOCK
##IF .not.BLOCK

but these are not equivalent:

##IFN BLOCK TSM
##IF .not.BLOCK .not.TSM

This is because the first will select when both are false but the second
will select when either is false.

Selective compilation may also be done on a single line using a "!##"
construct. The syntax is:

      standard-fortran-line   !## keyword(s)   ! comments

A space is not required between the "!##" and the keyword list. For
example, the following constructs are equivalent:

Standard format:

##IF LONGLINE
      QLONGL=.TRUE.
##ELSE
      QLONGL=.FALSE.
##ENDIF

Compact format (with comments):

      QLONGL=.TRUE.    !##LONGLINE       ! specify the QLONGL common flag
      QLONGL=.FALSE.   !##.not.LONGLINE  ! based on compilation options

Both "and" and "or" conditions can be used for one line processing:

      !##PERT !##PARALLEL   - An "AND" conditional compile
      !##PERT PARALLEL      - An "OR" conditional compile

------------------------------------------------------------------------------
INCLUDE FILES

Common files may be included with the ##INCLUDE directive. The filename
must follow in single quotes. A directory may precede the filename in
the UNIX format. The keyword PUTFCM causes the contents of the included
file to be copied and processed as well. This is necessary if ##
constructs are present in the included file. The keyword FCMDIR may
override the specified directory in the include directive. The VMS
keyword will convert the directory name to VMS format. There is also
special directory name conversion for the Macintosh version. An included
file may invoke another include file (up to 20 levels).
Example:

##INCLUDE '~/charmm_fcm/impnon.fcm'

------------------------------------------------------------------------------
CODE EXPANSION

For computationally intensive routines which are not too large, code
expansion may be used to increase efficiency. This is achieved by moving
constant IF conditions to the outside of major loops. Code expansion is
optional and (if done properly) the code should function in both
expanded and unexpanded forms. This means that the code should be
written and tested in an unexpanded form and then retested with
expansion enabled.

##EXPAND  local-flag(s)  .when.  conditional-flag(s)  (identifier)

Expand subcommands control section (immediately following the ##EXPAND):

##PASS1  flag1 flag2 ...
##PASS2  flag1 flag2 ...
##PASS3  ...               - code sections and conditions for each pass
##PASS[n] ...
##EXFIN                    - code section for the termination of the
                             expand section
##EXEND                    - end of expansion specification

##ENDEX  (identifier)      (the identifier is required and must match
                            the corresponding ##EXPAND).

For each pass, the specified flags are temporarily set (or .not. set) as
requested. If the conditions for the code expansion (the flags specified
after the .when. construct) are not all set, then all flags from the
##EXPAND line (before the .when.) are temporarily set and no code
expansion is processed.

Example (from nbonds/enbfs8.src):

      ... ... ...
C Do block expansion of code
##EXPAND  B  forces  .when.  BLOCK EXPAND  (expand_block)
##PASS1  .not.forces
      IF(QBLOCK .AND. NOFORC) THEN
##PASS2  forces
      ELSE IF(QBLOCK) THEN
##PASS3  .not.BLOCK  forces
      ELSE
##EXFIN
      ENDIF
##EXEND
C
      DO I=1,NATOMX                 ! Begin of main loop
      ... ... ...
        IF (.NOT. NOFORC) THEN      !##B
##IF forces
          DX(I)=DX(I)+DTX
          DY(I)=DY(I)+DTY
          DZ(I)=DZ(I)+DTZ
##ENDIF
        ENDIF                       !##B
      ... ... ...
      ENDDO                         ! End of main loop
##ENDEX  (expand_block)
      RETURN
      END

This example will do a multi pass compilation when BOTH the "EXPAND" and
the "BLOCK" keywords are set.
If they are not both set, then the local flags "B" and "forces" will be
set until the corresponding ##ENDEX is reached. If the "EXPAND" and
"BLOCK" conditions are met, then the body of the expanded section will
be compiled three times.

   PASS1 - additional active flag:           disabled flag: forces
   PASS2 - additional active flag: forces    disabled flag:
   PASS3 - additional active flag: forces    disabled flag: BLOCK

------------------------------------------------------------------------------
RESERVED KEYWORDS

The following keywords are reserved:

   END           - The end of keywords in prefx.dat (END is not a keyword)
   SINGLE        - Conversion to single precision (SINGLE is a keyword)
   PUTFCM        - Include files are to be copied into fortran files
   VMS           - Use VMS directory names (from DEC's DCL)
   REMIMPNON     - Remove any "IMPLICIT NONE" lines found in the source
   FCMDIR        - Specification of include file directory
   UPPERCASE     - Convert all non-text code to uppercase Fortran
   LONGLINE      - Allows a longer line output format (>80 characters).
   SAVEFCM       - Include all SAVE statements
   EXPAND        - Do semi-automatic code expansion
   single-letter - reserved for unexpanded compile conditionals
   lower-case    - reserved for local compile flags (within a routine)

preflx.dat or pref.dat are the preprocessor instruction data files.
Create a file preflx.dat or pref.dat (with UNIX) that contains one or
more of the keywords specified below. On UNIX platforms, install.com
generates the default pref.dat file in build/{machine_type} directory.
"END" keyword stops parsing keywords.

The use of a (Match-Token) can help to identify the components of ##IF
blocks in source files that make heavy use of ## directives; it should
follow any keywords, and must be appended to all components of a given
##IF block (if it is used). See the code for more examples.
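The keyword tests described under SELECTIVE COMPILATION can be
summarised in a short Python sketch. This is not PREFX code: the function
names are hypothetical, and exact, case-sensitive matching of keyword
names against a set of active keywords is an assumption made for the
example. It reproduces the implied "OR" of a keyword list, the ".not."
prefix, the ##IFN rule, the "AND" of several "!##" groups on one line,
and the MAXVEC selection from fcm/dimens.fcm.

```python
def keyword_test(name, active):
    """One keyword test; a leading '.not.' inverts it."""
    if name.startswith(".not."):
        return name[len(".not."):] not in active
    return name in active


def if_matches(keywords, active):
    """##IF / ##ELIF: true if ANY listed test is true (implied OR)."""
    return any(keyword_test(k, active) for k in keywords)


def ifn_matches(keywords, active):
    """##IFN: true only if NO listed test is true."""
    return not if_matches(keywords, active)


def inline_matches(groups, active):
    """'!##' suffixes: each group is an OR list; the groups are ANDed."""
    return all(if_matches(group, active) for group in groups)


def select_maxvec(active):
    """Walk the ##IFN / ##ELIF chain of the MAXVEC example in order."""
    branches = [
        (ifn_matches, ["VECTOR", "PARVECT"], 10),
        (if_matches, ["LARGE", "XLARGE"], 4000),
        (if_matches, ["MEDIUM"], 2000),
        (if_matches, ["SMALL"], 2000),
        (if_matches, ["XSMALL"], 1000),
    ]
    for test, keywords, value in branches:
        if test(keywords, active):
            return value
    # Corresponds to the ##ELSE / ##ERROR branch.
    raise ValueError("Unrecognized size directive in DIMENS.FCM.")


print(select_maxvec({"LARGE"}))            # neither VECTOR nor PARVECT set
print(select_maxvec({"VECTOR", "LARGE"}))  # first branch skipped
```

With only BLOCK active, ifn_matches(["BLOCK", "TSM"], ...) is false while
if_matches([".not.BLOCK", ".not.TSM"], ...) is true, which is the
non-equivalence noted for ##IFN BLOCK TSM and ##IF .not.BLOCK .not.TSM.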
------------------------------------------------------------------------------
LIST OF ALL KEYWORDS IN CHARMM

A complete list of all compile flags and options in CHARMM (version 25a2)

[1] Include File Directory

    FCMDIR=directory_name  ! point to a particular directory
    FCMDIR=CURRENT         ! use what is specified in the include line.
    FCMDIR=LOCAL           ! use the local directory.

[2] Machine Type (choose exactly one)

    ALLIANT   = Alliant
    ALPHA     = DEC alpha workstation
    APOLLO    = HP-Apollo, both AEGIS and UNIX
    ARDENT    = Stardent, Titan series
    CONVEX    = Convex Computer
    CRAY      = Cray Research Inc.
    DEC       = DEC ULTRIX
    FUJITSU
    GWS
    HPUX      = Hewlett-Packard series 700.
    IBM       = IBM-3090 running AIX
    IBMRS     = IBM-RS
    IRIS      = Silicon Graphics
    MACINTOSH = Apple Macintosh computers (system 7)
    SUN       = Sun Microsystems
    VAX       = Digital Equipment Corp. VAX VMS.

    Other machine descriptors

    IBMMVS    = IBM's MVS platform
    IBMVM     = IBM's VM platform
    CMEM        ???? (included with some CONVEX code)
    GNU       = using GNU Fortran compiler
    CMEM      = A convex option?

    Parallel machine types

    ALPHAMP   ! DEC Alpha Multi Processor machines
    CM5       ! Machine type = TMC's CM-5 machine
    CSPP      ! Convex PA-RISC parallel system (HP chip)
    CSPPMPI   ! Convex SPP using proprietary MPI library
    DELTA     ! machine type = Intel delta (Caltech) machine
    IBMSP     ! machine type = IBM's SPn cluster machines
    INTEL     ! machine type = Intel iPSC Hypercube
    PARAGON   ! machine type = Intel Paragon machine
    SGIMP     ! machine type = SGI Power Challenge
    T3D       ! Cray massively parallel (DEC Alpha chip)
    T3E       ! Cray massively parallel (DEC Alpha chip)
    TERRA     ! multiprocessor DEC Alpha chip system

[3] Operating system (choose at most one)

    AIX370    = IBM UNIX
    UNIX      = UNIX
    UNICOS    = Cray UNIX
    OS2       = IBM pre-emptive multitasking

[4] Size directive (must choose exactly one)

    XLARGE    = 240480 atom limit
    LARGE     =  60120 atom limit
    MEDIUM    =  25140 atom limit
    SMALL     =   6120 atom limit
    XSMALL    =   2040 atom limit

[5] Machine Architecture (may choose several)

    SCALAR    ! machine characteristics = default for scalar machines
    VECTOR    ! feature directive * = Vectorized routines
    PARVECT   ! Parallel vector code (multi processor vector machines)
    CRAYVEC   ! Fast vector code (standard vector code)
    SINGLE    ! specifies single precision version (primarily used for
              ! CRAY)

[6] Parallel CHARMM descriptors (see parallel.doc)
    (all require the PARALLEL keyword)

    COMMEASURE ! enable parallel communications timing code
    GENCOMM    ! Use general communications scheme
    MANYNODES  ! use options that are more efficient with many nodes
    MPI        ! Using MPI communication primitives
    PARAFULL   ! Full communication parallel scheme.
    PARALLEL   ! Multi-machine (Intel, workstation clusters,...)
    PARASCAL   ! Scalable method (coordinates and forces not global)
    PVM        ! use PVM parallel communications library
    PVMC       ! use PVM parallel communications library; alt. method
    SHMEM      ! Shared memory put & get
    SOCKET     ! Use socket calls for communication
    SYNCHRON   ! Specify synchronized communication (not bidirectional)

[7] Feature directives

    ASPENER    = Atomic Solvation Parameter energy term
    BLOCK      = Energy partition and free energy code
    DIMB       = Iterative diagonalization, reduced basis (normal modes)
    DMCONS     = Contact map umbrella potential routine
    DOCK       = modification of block to include asymmetric matrix
    EISPACK    = Use the EISPACK code for diagonalization
    FMA        = Fast Multipole method
    FOURD      = minimization and dynamics in 4 dimensions
    GAMESS     = Include the GAMESS QM package
    LATTICE    = Module to read/write Skolnick lattice files
    LDM        = Lambda-dynamics module
    MCSS       = Multiple Copy Simultaneous Search
    MMFF       = Merck's Molecular Force Field
    MOLVIB     = MOLVIB vibrational analysis code
    MTS        = Multiple time step code
    NIH        = NIH default specs code
    NOPARASWAP = inhibit ASP parameter swap method (requires ASPENER)
    OLDDYN     = Old dynamics integrator
    PBEQ       = Poisson Boltzmann equation solver
    PBOUND     = Simple periodic boundary method
    PBOUNDC    = Additional keyword for pbound in cray vector code
    PERT       = NIH free energy code
    PM1        = PM1 polarization water model
    POLAR      = Feynman path integral simulations and PM6 or PM1
    PRIMSH     = Shell option in MMFP?
    QUANTA     = Quanta interface code
    QUANTUM    = AM1 QM/MM method using MopacXX (not with GAMESS or
                 CADPAC)
    REPLICA    = Replica code (requires BLOCK)
    RGYCONS    = Umbrella potential in radius of gyration
    RISM       = RISM solvation code
    RXNCOR     = RXNCOR code
    SHAPES     = NIH shape descriptor code (under development)
    TNPACK     = truncated Newton minimization
    TRAVEL     = PATH and TRAVEL code
    TSM        = TSM and ICPERT code

[8] Graphics keywords; choose only one (except on Apollo)

    GLDISPLAY  = use the GL display code for the graphics window (*)
    NODISPLAY  = no graphics window; PostScript, other files produced
    NOGRAPHICS = graphics code not compiled
    XDISPLAY   = use the X11 display code for the graphics window

    (*) the GL code is relatively untested, and may have problems

[9] Keywords Not for Normal Use

    JUNK       = Code with problems or unused (but not ready to be
                 discarded)
    DEBUG      = Extra print statements.
    IPRESS     = Pressure code in suspended development (for PBOUND)
    REPDEB     = debug replica code.
    UNUSED     = isolate code apparently not used

[10] Major Blocks that can be Removed, but normally are not

    NOCORREL   ! removes time series analysis
    NODISPLAY  ! graphics w/o screen display (HP stick plot, PLUTO,
               ! LIGHT, ...)
    NOGRAPHICS ! removes all graphics code
    NOIMAGES   ! removes image and crystal facility
    NOST2      ! removes ST2 water model routines
    NOVIBRAN   ! removes vibrational analysis section
    NOMISC     ! removes miscellaneous stuff:
               ! BARR, DRAWSP, HBUILD, PATH, QUICKA, SBOUND, SURFAC,
               ! XRAY, TESTCH, RXNCOR

[11] Other Control Directives

    EXPAND     ! Do semi-automatic code expansion
    LONGLINE   ! Allows a longer line output format (>80 characters).
    SAVEFCM    ! Include all SAVE statements in .fcm files
    SINGLE     ! Conversion to single precision (SINGLE is a keyword)
    PUTFCM     ! Include files are to be copied into fortran files
    VMS        ! Use VMS directory names (from DEC's DCL)
    REMIMPNON  ! Remove any "IMPLICIT NONE" lines found in the source
    UPPERCASE  ! Convert all non-text code to uppercase Fortran

By employing appropriate preprocessor keys, one can generate a variant
of CHARMM. CHARMM22 is defined by the feature directives OLDDYN, BLOCK,
PERT, TSM, MOLVIB, RXNCOR and TRAVEL.


File: Develop, Node: makemod, Up: Tools, Previous: prefx, Next: Tools

                  Module Makefiles and Optimization

The installation script install.com works with a set of makefiles in
~/cnnXm/build/{machine_type}. These makefiles play the key role in
developing, optimizing and porting CHARMM code on the machine you are
working with.

[1] Porting to Other Machines

You may begin with the given set of makefiles for a machine close in
architecture to the one to which you intend to port CHARMM. First you
have to decide on a name for the machine platform. For example, we have
chosen IBMRS for the IBM RS/6000 series.

    cp -r ~/cnnXm/build/{closely_related_machine_type} \
          ~/cnnXm/build/{your_chosen_machine_type}

Then delete Makefile in the new build directory and rename
Makefile_{closely_related_machine_type} to Makefile_{your_machine_type}.
You may have to modify compile commands and compiler flags in the
Makefile template. Study ~/cnnXm/install.com carefully and modify it if
necessary. In most cases, you just need to correct echo messages to
address your machine properly. Then issue the install.com command.

[2] Optimization

Once the makefiles are working properly, you can carry out a compiler
level optimization for the CHARMM version. FORTRAN compile macros are
defined in Makefile_{machine_type}, e.g., $(FC1), $(FC2), $(FC3), etc.
Compiler options are bound to these compile macros. You may inspect each
module makefile and set a proper compile command for a given FORTRAN
source. For example, the following are the default optimization flags
for the c24b1 release.
Most of the source files are compiled by $(FC2) except

    build/convex/energy.mk
        $(FC0)  ehbond.f
        $(FCR)  enefst2.f
        $(FCR)  enefst2q.f
        $(FC3)  enefvect.f
    build/convex/image.mk
        $(FCR)  imnbf2p.f
        $(FC3)  imnbfp.f
        $(FC0)  nbondm.f
    build/convex/manip.mk
        $(FC0)  corman.f
        $(FC3)  fshake.f
        $(FCR)  fshake2.f
    build/convex/nbonds.mk
        $(FCR)  enbf2.f
        $(FCR)  enbf3.f
        $(FCR)  enbf4.f
        $(FCR)  enbf5.f
        $(FC3)  ewaldf.f
        $(FCR)  ewaldf2.f
        $(FCR)  nbndf2p.f
        $(FC3)  nbndfp.f
    build/convex/quantum.mk
        $(FC0)  qmdata.f
        $(FC0)  qmene.f
        $(FC0)  qmjunc.f
        $(FC0)  qmpac.f
        $(FC0)  qmset.f

[3] Generating Module Makefiles

We have included the makemod script that finds all include file
dependencies. The makemod script is used for all source modules except
main, for which we use mainmake instead.

    makemod [-v] [-n] module_name actual_path sourcetree_path \
            makefile_name [definition_file_name]

where -v means verbose and -n means the include files don't have
includes (this saves time). The module name is generally the module
directory, e.g., dynamc. The actual path is generally
~/cnnXm/source/{module} and the source tree path is ~/cnnXm/source or
the like. We generally use module_name.mk for the makefile name. The
definition file, if specified (we generally don't), will prepend a file
of definitions to the makefile. For example, the way to generate the
makefiles is to:

    cd ~/cnnXm/source/{module}
    makemod -vn {module} `pwd` `cd ..;pwd` {module}.mk

where {module} is the sub-directory in source.

When you want to create the full set of module makefiles, you may use
setmk.com in ~/cnnXm/tool.

    setmk.com your_machine_type

[4] Usage Note on makemod

When you generate module.mk files from scratch, the FORTRAN compile
macro $(FC2) is used for all source files. In order to set the compiler
option for further optimization, you have to modify the module makefiles
to set the macro manually.


File: Develop, Node: Modify, Up: Top, Previous: Tools, Next: Document

           The procedure for modifying anything in CHARMM

This procedure describes the steps which should be taken when modifying a
source file in CHARMM. When you are developing CHARMM source code, always
maintain close contact with the CHARMM manager and other developers.
Inform them of your development plan and of which files you are working
on. See *note Checkin:: for the checkin procedure needed when you deposit
your code in the CHARMM central library.

1) Get a copy of the current release package. If you are a CHARMM
   developer and plan to integrate your program into CHARMM in the future,
   make sure that you obtain the most current version. Check with the
   CHARMM manager.

2) Once you obtain the package, you are branching out from the main CHARMM
   source code control system. You should record the details of your
   modifications so that you may REDO them when you check your files into
   the central CHARMM library.

3) While you make modifications and debug them, follow the guidelines in
   *note Standards::, so that CHARMM code will be consistent. If your
   modification does not involve any changes in the source file directory
   structure and makes no changes in INCLUDE statements, you may use the
   module makefiles supplied (with the extension .mk) in
   ~/cnnXm/build/UNX. If you add/remove any source files, reorganize them,
   modify any INCLUDE statements or are porting to a machine other than
   those already supported, you have to build the relevant module
   makefiles. See *note Tools:: for more information on makemod.

4) In your local ~/cnnXm directory, you may issue the install.com command
   to build the library and the executable. See *note Install:
   (chmdoc/install.doc)Install. Your library is built in
   ~/cnnXm/lib/{machine_type} and the executable will be in
   ~/cnnXm/exec/{machine_type}. You may find the log file
   {machine_type}.log in ~/cnnXm/build/{machine_type}.
5) If your modification involves a new feature, you should either modify
   an existing test or make a new test to demonstrate and check its
   operation. See *note testing: (chmdoc/testcase.doc), for a description
   of the tests currently available. If you add a new test, update the
   ~charmm/doc/testcase.doc file.

6) If your change involves adding or modifying a command or a feature,
   modify the existing documentation or, if none is available, write new
   documentation. Make sure that the emacs info program can read the
   document and that the format of your documentation is consistent with
   other documents.


File: Develop, Node: Document, Up: Top, Previous: Modify, Next: Checkin

           How to Document CHARMM Commands and Features

Documentation is an integral part of CHARMM development. In order to
document commands and features under development in a consistent manner,
we recommend the following documentation format. All documentation should
be accessible (readable) through the emacs info facility. If you do not
know how to put in the info directives, ask the CHARMM manager for
assistance.

Each documentation file, with the extension .doc, should contain

1) One brief paragraph of motivation, theory, procedure or whatever is
   necessary for a particular feature. Here, some references can be given.

2) A table of contents of the documentation (to serve as the info menu)

3) The command syntax

4) Complete description of all the commands and sub-commands. Here the
   syntax, defaults and file names involved would be described. A brief
   account of what the command accomplishes would also be given. The order
   in which various commands should be invoked would be described.
   Relevant commands and subcommands can be cross-referenced with a key.

5) One or two examples involving concepts and commands described (no
   output listing)

The same notation should be followed throughout the documentation.

        [...]              optional, can be present only once, if at all.
        {...}              can be repeated any number of times, must be
                           present at least once.
        [{...}] or [{...}] can either be missing or be present any number
                           of times.
        n{...}             must be present exactly n times.
        either A or B must be present

Syntax definitions will use literal keywords such as VIBRan, READ, MINI,
VERLet, etc. These are to be typed as such. Syntax definitions can also
use dummy keywords such as atom_name, atom_index and atom_type. The
meaning and variable type can be listed just after the syntax notation.
For literal keywords the documentation and examples will use uppercase
characters immediately followed by zero or more lower case characters.
Dummy keywords will be written in all lower case.


File: Develop, Node: Checkin, Up: Top, Previous: Document, Next: Top

           Checkin Procedure in CHARMM Management System

We maintain CHARMM as an integrated single source package. The following
rules have been established to keep CHARMM as such and to minimize
time-consuming problems that result from carelessness and conflicts
between developers.

1) It is always wise to inform the CHARMM manager about your development
   plan and timetable so that he may arrange the CHARMM management
   schedule and prevent you from duplicating work done by others. The list
   of files you are working on and the nature of the modifications should
   be reported in advance. You may use the template form for such a
   report, found in support/form/project.form.

2) When you finish your code development project, make an appointment with
   the CHARMM manager and get the most current version of the distribution
   package. Then, he will lock the central CHARMM library and allow you to
   work on it. Normally you are given a couple of weeks to incorporate
   your contributions into the main source. It is very important to plan
   ahead for the appointment.

3) Follow the steps in *note Modify:: to integrate your modifications with
   the current source.
   When you finish incorporating modifications, debugging, testing and
   documenting, prepare to send your final version to the CHARMM
   management system. (a) Modified/added files, including source, testcase
   input, documentation and others, and (b) the change-log file describing
   the modifications in detail are required for your checkin.

4) You and the manager will work together to check the modified files into
   the central CHARMM system. After the successful incorporation, the
   change-log file will be mailed to all CHARMM developers
   (charmm-bugs@tammy.harvard.edu) and the new developmental version will
   be established.

CHARMM Element doc/dynamc.doc 1.1


File: Dynamc, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax

              Dynamics: Description and Discussion

There are four separate dynamics integrators available in CHARMM. (This
discussion does not apply to multi-body dynamics, which has a separate set
of integrators. See *note Mbond:(chmdoc/mbond.doc).)

     Name               Keyword    Module
     Original Verlet    ORIG       dynamcv.src
     Leapfrog Verlet    LEAP       dynamc.src     (default)
     Velocity Verlet    VVER       dynamvv.src
     4-D L-F Verlet     VER4       dynam4.src

All methods are based on the Verlet scheme and, when used without any
special features, provide identical trajectories for short simulations.
All methods allow SHAKE.

The ORIG integrator is a standard 3-step Verlet integrator with few
frills. It allows:
     Langevin Dynamics (LANG)
     Thermodynamic Simulation Method (TSM)

The LEAP integrator is similar to the ORIG integrator, but provides
increased accuracy (especially for the single precision version of
CHARMM). It allows:
     Langevin dynamics (LANG) (with accurate temperatures printed)
     Constant Temperature and Pressure (CPT) (based on Berendsen's method)
     Accurate pressures with SHAKE
     High frequency correction to the total energy
     Parallel code
     Free energy equilibration indicator (deltaF*V) (with PERT)
     Thermodynamic Simulation Method (TSM)

The VVER integrator also provides increased accuracy.
It allows:
     Constant Temperature (NOSE) (Nose-Hoover method)
     Multiple Time Step (MTS)

The VER4 integrator enables the energy embedding technique, which entails
placing a molecule into a higher spatial dimension [Crippen, G. M. &
Havel, T. F. (1990) J. Chem. Inf. Comput. Sci. Vol 30, 222-227]. The
possibility of surmounting energy barriers with these added degrees of
freedom may lead to lower energy minima. Here, this is accomplished by
molecular dynamics in four dimensions. Specifically, another Cartesian
coordinate was added to the usual X, Y, and Z coordinates in the LEAPfrog
VERLet algorithm.

In order to generate a dynamics trajectory, all requirements for
evaluating the energy must be met. See *note
Energy:(chmdoc/energy.doc)Needs.

* Menu:

* Syntax::              Syntax of the dynamics command
* Description::         Description of the keywords and options
* Recommended::         Recommended input options and values
* Discussion::          Running dynamics
* Output::              Output from a dynamics run
* Trajectory::          Trajectory manipulation and I/O
* Merge::               Merging or breaking up trajectory files into
                        different size pieces. Resampling at a larger
                        interval. Least squares fit to a reference.
                        Recentering, or undoing image operations.
* Reorient::            Reorienting a coordinate trajectory
* RMSDyn::              Computes the RMS difference between two trajectories
* Format::              Formatting and unformatting a dynamics trajectory
* Monitor:(chmdoc/monitor.doc).            Monitor dihedral transitions
* CPT dynamics:(chmdoc/pressure.doc).      CPT dynamics
* 4-D dynamics:(chmdoc/fourd.doc).         4-D dynamics
* Pressure:(chmdoc/pressure.doc)Pressure.  The pressure command
* MTS:(chmdoc/mts.doc).                    Multiple Time Scales Method
* Nose:(chmdoc/nose.doc).                  Nose-Hoover Dynamics
* MBOND:(chmdoc/mbond.doc)Dynamic.         Multi-body Dynamics


File: Dynamc, Node: Syntax, Up: Top, Previous: Top, Next: Description

                Syntax for the Dynamics Command

DYNAmics { [ LEAP ] [ VERLet ] } !
Dynamics with the leap-frog integrator
         { [ NEW ] [ LANGevin ] }
         { [ CPT ] [ NOLAngevin ] }  other-specs cpt-spec
         { [ EULEr ] }
         { ORIG }                    ! Dynamics with the verlet integrator

DYNAmics { VVERlet [ NOSE ] }        ! Dynamics with the velocity verlet integrator
         { }  other-specs

DYNAmics { LEAPfrog VER4 four-dim-spec }  ! Four dimensional dynamics
         { }  other-specs

DYNAmics { MBONd mbond-spec } other-specs  ! Multibody dynamics

other-specs::= [NSTEp integer] [nonbond-spec] [hbond-spec] [frequency-spec]
               {[TIMEstp real]} {STARt } [unit-spec] [temperature-spec] [options-spec]
               { AKMA real }    {RESTart}

hbond-spec::=     see *note Hbonds:(chmdoc/hbonds.doc).
nonbond-spec::=   See *note Nbonds:(chmdoc/nbonds.doc).
mbond-spec::=     See *note Mbond:(chmdoc/mbond.doc)DynDesc.

frequency-spec::= [INBFrq integer] [IEQFrq integer] [IHBFrq integer]
                  [IHTFrq integer] [IPRFrq integer] [NPRInt integer]
                  [NSAVC integer] [NSAVV integer] [NTRFrq integer]
                  [ILBFrq integer] [ISVFRQ integer] [NSAVL integer]
                  [IMGFrq integer] [IXTFrq integer]

unit-spec::=      [IUNCrd integer] [IUNRea integer] [IUNVel integer]
                  [IUNWri integer] [KUNIt integer] [CRAShu integer]
                  [BACKup integer] [IUNLdm integer]

temperature-spec::= [FINAlt real] [FIRStt real] [TEMInc real] [TSTRuc real]
                  [TWINDH real] [TWINDL real] [TBATh real]

options-spec::=   [IASOrs integer] [IASVel integer] [ICHEcw integer]
                  [ISCAle integer] [ISCVel integer] [ISEEd integer]
                  [SCALe real] [NDEGf integer] [RBUFfer real]
                  [AVERage] [ECHEck real] [TOL real]

cpt-spec::=       See *note cpt:(chmdoc/pressure.doc)

four-dim-spec::=  [FIL4dimension] [SKBOnd] [SKANgle] [SKDIhedral]
                  [SKVDerWaals] [SKELectrostatics]
                  [K4DInitial real] [INC4Dforce integer] [DEC4Dforce integer]
                  [MULTK4di real] [E4FILLcoordinates real]
                  [FNLT4 real] [FSTT4 real] [TIN4 real]
                  [IHT4 integer] [IEQ4 integer] [ICH4 integer]
                  [TWH4 real] [TWL4 real]


File: Dynamc, Node: Description, Previous: Syntax, Up: Top, Next: Recommended

           Options common to minimization and dynamics

The following table describes the keywords which apply to
both minimization and dynamics.

Keyword   Default   Purpose

NSTEP     100       The number of steps to be taken. This is the number of
                    dynamics steps, which is also equal to the number of
                    energy evaluations.

INBFRQ    50        The frequency of regenerating the non-bonded list. The
                    list is regenerated if the current step number modulo
                    INBFRQ is zero and if INBFRQ is non-zero. Specifying
                    zero prevents the non-bonded list from being
                    regenerated at all.
                    INBFRQ = -1 --> all lists are updated when necessary
                    (heuristic test).

IHBFRQ    50        The frequency of regenerating the hydrogen bond list.
                    Analogous to INBFRQ.

ILBFRQ    50        The frequency for checking whether or not an atom is in
                    the Langevin region defined by RBUF.

IMGFrq    0         The frequency for the image update (only used if IMAGES
                    or CRYSTAL is in use). The image update creates the
                    image atoms needed for the energy computation from the
                    list of allowed symmetry transformations.
                    Recommended value: 50, if a 2A buffer is used between
                    CUTIM and CUTNB.

IXTFrq    0         The frequency for the crystal update (only used if
                    CRYSTAL is in use). The crystal update generates a new
                    list of allowed symmetry transformations. This option
                    is only required if the size or shape of the periodic
                    box can change during a simulation or minimization
                    (i.e. CPT).
                    Recommended value: 1000 (if running CPT dynamics).

non-bond-spec       The specifications for generating the non-bonded list.
                    See *note Nbonds:(chmdoc/nbonds.doc).

hbond-spec          The specifications for generating the hydrogen bond
                    list. See *note Hbonds:(chmdoc/hbonds.doc).

[ STRT ]  STRT      The dynamics is assumed to start from the input
                    coordinates using an assignment of velocities given by
                    IASVEL. No restart file is read.
[ REST ]            The dynamics is restarted by reading the restart file
                    from unit IUNREA.

TIMESTP   0.001     Time step for dynamics in picoseconds. The default
                    value is 0.001 picoseconds.

IUNREA    -1        Fortran unit from which the dynamics restart file
                    should be read.
                    A value of -1 means don't read any file.

IUNWRI    -1        Fortran unit on which the dynamics restart file for the
                    present run is to be written. A value of -1 means don't
                    write any file. Formatted output.

IUNCRD    -1        Fortran unit on which the coordinates of the dynamics
                    run are to be saved. A value of -1 means no coordinates
                    should be written. Unformatted output.

IUNLDM    -1        Fortran unit on which the biasing potentials and the
                    histograms of the lambda variables of the dynamics run
                    are to be saved. A value of -1 means no histograms
                    should be written. Unformatted output (for details see
                    node: output).

IUNVEL    -1        Fortran unit on which the velocities of the dynamics
                    run are to be saved. -1 means don't write. Unformatted
                    output.

KUNIT     -1        Fortran unit on which the total energy and some of its
                    components, along with the temperature during the run,
                    are written using formatted output.

CRASHU    -1        Fortran unit where a single DCL command file will be
                    written. If the machine crashes before a restart file
                    is written, this file won't be touched. If the crash
                    occurs after a restart is written but before the run
                    completes, this file will contain the line,
                    "$ @CRASH". If the run completes, the file will contain
                    the line, "$ @COMPLET". This allows for an automatic
                    recovery system after crashes.

NSAVC     10        The step frequency for writing coordinates.

NSAVL     0         The step frequency for writing lambda histograms.

NSAVV     10        The step frequency for writing velocities.

NPRINT    10        The step frequency for storing on KUNIT, as well as
                    printing on unit 6, the energy data of the dynamics
                    run.

IPRFRQ    100       The step frequency for calculating averages and rms
                    fluctuations of the major energy values. If this number
                    is less than NTRFRQ and NTRFRQ is not equal to 0,
                    square root of negative number errors will occur.

ISVFRQ    NSTEP     The step frequency for writing a restart file.

IHTFRQ    0         The step frequency for heating the molecule in
                    increments of TEMINC degrees in the heating portion of
                    a dynamics run. Zero means do no heating.
IEQFRQ    0         The step frequency for assigning or scaling velocities
                    to the FINALT temperature during the equilibration
                    stage of the dynamics run.

NTRFRQ    0         The step frequency for stopping the rotation and
                    translation of the molecule during dynamics. This
                    operation is done automatically after any heating.

FIRSTT    0.0       The initial temperature at which the velocities are
                    assigned to begin the dynamics run. Important only for
                    the initial stage of a dynamics run.

FINALT    298.0     The desired final (equilibrium) temperature for the
                    system. Important for all stages except initiation.

TEMINC    5.0       The temperature increment to be given to the system
                    every IHTFRQ steps. Important in the heating stage.

TSTRUC    -999.     The temperature at which the starting structure has
                    been equilibrated. Used to assign velocities so that
                    equal partition of energy will yield the correct
                    equilibrated temperature. -999. is a default which
                    causes the program to assign velocities at
                    T=1.25*FIRSTT.

TWINDH    10.0      The temperature deviation from FINALT to be allowed on
                    the high temperature side (positive), i.e. the high
                    side of the temperature window. Useful during
                    equilibration.

TWINDL    -10.0     The temperature deviation from FINALT to be allowed on
                    the low temperature side (negative), i.e. the low side
                    of the temperature window. Useful during equilibration.

TBATH     FINALT    The temperature of the heatbath in Langevin dynamics.
                    When set to zero it allows one to do purely dissipative
                    (quenched) dynamics.

RBUF      0.0       Inner radius of the buffer, or Langevin, region sphere.
                    All atoms with radial positions greater than RBUF
                    angstroms are propagated by Langevin dynamics, if the
                    dynamics keyword LANGevin has been specified.

IASORS    0         The option for scaling or assigning of velocities
                    during heating (every IHTFRQ steps) or equilibration
                    (every IEQFRQ steps). This keyword does not control the
                    initial assignment of velocities.
                    .eq. 0 - scale velocities. (use ISCVEL option)
                    .ne. 0 - assign velocities.
                             (use IASVEL option)

IASVEL    1         The option for the choice of method for the assignment
                    of velocities during heating and equilibration when
                    IASORS is nonzero. This option also controls the
                    initial assignment of velocities (when not RESTart)
                    regardless of the IASORS value.
                    .eq. 0 - Use the comparison coordinate values in AKMA
                             units (sorry) with the STRT option. If NTRFRQ
                             is positive, then net trans/rot will be
                             removed first. This option suppresses other
                             assignments of velocity.
                    .gt. 0 - Gaussian distribution of velocities
                             (positive value).
                    .lt. 0 - uniform distribution of velocities (negative
                             value); the kinetic energy of each of the 3N
                             velocity components is the same.

ISEED     314159    The seed for the random number generator used for
                    assigning velocities.

ISCVEL    0         The option for two ways of scaling velocities.
                    .eq. 0 - single scale factor for all atoms
                    .ne. 0 - a scale factor for each atom proportional to
                             the kinetic energy average ratio between the
                             system and along every degree of freedom for
                             that atom.

ICHECW    1         The option for checking to see if the average
                    temperature of the system lies within the allotted
                    temperature window (between FINALT+TWINDH and
                    FINALT+TWINDL) every IEQFRQ steps.
                    .eq. 0 - do not check, i.e. always assign or scale
                             velocities.
                    .ne. 0 - check window, i.e. assign or scale velocities
                             only if the average temperature lies outside
                             the window.

ISCALE    0         This option allows the user to scale the velocities by
                    a factor SCALE at the beginning of a restart run. This
                    may be useful in changing the desired temperature.
                    .eq. 0 - no scaling done (usual input value)
                    .ne. 0 - scale velocities by SCALE.
                    WARNING: Please use this option only when you are
                    changing the temperature of the run.

SCALE     1.        Scale factor for the previous option.

NDEGF     computed  Number of degrees of freedom to use in computing the
                    temperature. If not specified on any call, the value is
                    computed. This specification is not remembered between
                    successive calls to dynamics.
AVERAGE   no        When saving coordinates every NSAVC steps, this option
                    will cause the average structure of the last NSAVC
                    dynamics steps to be written instead of the final
                    snapshot coordinate set. This option is primarily used
                    for making smooth movies.

ECHECK    20.0      The maximum amount the total energy may change on any
                    step.

TOL       1.0E-10   The shake tolerance (if SHAKE is in use).

PCONst    false     Flag to indicate that constant pressure code will be
                    used.

PINTernal true      Flag to indicate that the internal pressure will be
                    coupled to the reference pressure.

PEXTernal false     Flag to indicate that the external pressure will be
                    coupled to the reference pressure.

PCOUpling 0.0       The coupling decay time in picoseconds for the
                    pressure. A good value for this is 5 ps.

COMPress  0.0       The compressibility in atm**-1. A good value for
                    proteins is 4.63e-5.

PREFerence 1.0      The reference pressure in atmospheres.

VOLUme    computed  The volume in Angstroms**3 to use for the pressure
                    calculation denominator. This value is calculated if
                    the CRYStal feature is in use.

TCONst    false     Flag to indicate that constant temperature code will be
                    used.

TCOUpling 0.0       The coupling decay time in picoseconds for the
                    temperature. A good value for this is 5 ps.

TREFerence FINALT   The reference temperature for constant temperature
                    simulations.

MBOND               Signifies that the dynamics run will be based on a
                    multi-body simulation. If no bodies have been defined,
                    this produces an error. Many of the standard dynamics
                    options retain their meaning in this case, but the
                    following options are NOT supported: SHAKE, CONSTANT
                    PRESSURE, NOSE, LEAPFROG, VER4, LANGEVIN. See *note
                    Mbond:(chmdoc/mbond.doc)DynDesc for a description of
                    the MBOND dynamics options.


File: Dynamc, Node: Recommended, Previous: Description, Up: Top, Next: Discussion

             Recommended CHARMM input for dynamics.

This section is intended only as a guide in setting up a dynamics
simulation input file.
Changes should be made as necessary according to personal tastes and
project requirements.

1) For heating and early equilibration:

   DYNAMICS LEAP VERLET RESTART(*) NSTEP 20000 TIMESTEP 0.001(+) -
       IPRFRQ 1000 IHTFRQ 1000 IEQFRQ 5000 NTRFRQ 5000 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 100 NSAVC 100 NSAVV 0 INBFRQ 25 -
       hbond-spec nonbond-spec -
       FIRSTT 100.0 FINALT 300.0 TEMINC 100.0 -
       IASORS 1 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 10.0 TWINDL -10.0

   (*) Use STRT for the first run and RESTART for all later runs.
   (+) If bonds to hydrogen atoms are SHAKEd

2) For late equilibration and analysis runs:

   DYNAMICS LEAP VERLET RESTART NSTEP 20000 TIMESTEP 0.001 -
       IPRFRQ 1000 IHTFRQ 2000 IEQFRQ 5000(*) NTRFRQ 5000 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
       hbond-spec nonbond-spec -
       FIRSTT 100.0 FINALT 300.0 TEMINC 100.0 -
       IASORS 0 IASVEL 1 ISCVEL 0 ICHECW 1 TWINDH 10.0 TWINDL -10.0

   (*) Window checking should be disabled for the analysis run (i.e.
       IEQFRQ=0) if you want a real microcanonical ensemble.

3) For heating, equilibration and analysis runs using Langevin
   dynamics:(+)

   DYNA LEAP LANGEVIN STRT(*) NSTEP 20000 TIMESTEP 0.001 -
       IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 0 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
       ILBFRQ 1000 RBUFFER 0.0 TBATH 300.0 -
       hbond-spec nonbond-spec -
       FIRSTT 300.0 FINALT 300.0 -
       IASORS 0 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 0.0 TWINDL 0.0

   (+) Note that the friction coefficients, in units of 1/ps, must first
       be initialized by filling the array FBETA with the SCALAR command
           SCALAR FBETA SET
   (*) Use STRT for the first run and RESTART for all later runs.

4) For quenched molecular dynamics:

   For the first run (STRT), read velocities into the comparison
   coordinate set, or this should directly follow a former dynamics
   command.
   DYNA VERLET STRT(*) NSTEP 10000 TIMESTEP 0.001 -
       IPRFRQ 1000 IHTFRQ 200 IEQFRQ 200 NTRFRQ 400 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 50 NSAVC 50 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
       hbond-spec nonbond-spec -
       TSTRUC 300.0 FIRSTT 300.0 FINALT 0.0 TEMINC -30.0 -
       IASORS 0 IASVEL 0 ISCVEL 0 ICHECW 1 TWINDH 0.0

   or equivalently with Langevin (dissipative) dynamics

   DYNA LANGEVIN STRT(*) NSTEP 10000 TIMESTEP 0.001 -
       IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 4000 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 50 NSAVC 50 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
       hbond-spec nonbond-spec -
       TSTRUC 300.0 FIRSTT 300.0 FINALT 300.0 -
       ILBFRQ 1000 RBUFFER 0.0 TBATH 0.0 -
       IASORS 1 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 0.0

   (*) STRT is for the first run; use RESTART otherwise.

   The IASVEL 0 option causes the comparison coordinates to be used for
   the initial velocities (AKMA units). For subsequent runs the options
   IASORS 1 and IASVEL 1 may be used if random velocities are to be
   periodically assigned.

5) For constant temperature and/or pressure dynamics

   DYNA LEAP VERLET STRT(*) NSTEP 20000 TIMESTEP 0.001 -
       IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 0 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
       PCONst PINTernal COMPress 4.63e-5 PCOUpling 5.0 PREFerence 1.0 -
       TCONst TCOUpling 5.0 TREFerence 300.0 -
       hbond-spec nonbond-spec -
       FIRSTT 300.0 FINALT 300.0 -
       IASORS 0 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 0.0 TWINDL 0.0

6) For multi-body dynamics (assumes an atomistic equilibration has already
   been performed, substructures defined and modes generated):

   DYNA MBOND LOBA RESTART NSTEP 20000 TIMESTEP 0.001 -
       IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 0 -
       IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
       NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
       TCONst TCOUpling 5.0 TREFerence 300.0 -
       hbond-spec nonbond-spec -
       FIRSTT 300.0 FINALT 300.0 -
       MBPRlev -1


File: Dynamc, Node: Discussion, Previous: Recommended, Up: Top, Next: Output

                  Running Molecular Dynamics

The
theoretical basis for dynamical simulations is elementary physics. The
force on a particle is equal to the negative gradient of the potential
energy of the particle. CHARMM can solve this equation numerically for all
atoms in the molecule. A simple second order predictor two step method due
to Verlet is used for integration.

The choice of the integration step size is very important. One must weigh
the increased accuracy of using a small step size against the longer real
time that can be simulated with a given amount of execution time when a
larger step size is used. The time step may be entered in picoseconds
(using the TIMESTP keyword).

CHARMM provides information on the accuracy of the numerical solution.
Since the system has no external forces, the total energy should be
conserved. Numerical errors will result in some fluctuations in the total
energy, so a good test is to compare the fluctuations in total energy to
the fluctuations in kinetic energy, as these fluctuations are proportional
to the heat capacity of the system. See the next node for a description of
dynamics output.

Because the force constants for the bonds and bond angles are fairly
large, it is reasonable under certain circumstances to constrain their
values during dynamics. Such constraints are applicable if the harmonic
motions are weakly coupled to other motions. The advantage of such
constraints is that the step size of the numerical integration may be
increased without sacrificing accuracy, as these terms have the largest
gradients in macromolecules simulated at physiological temperatures. We
use the SHAKE algorithm for applying the constraints, see *note
shake:(chmdoc/cons.doc)SHAKE. SHAKE can be applied to just the bonds
involved with hydrogens, all bonds, all bonds and the angles involving
hydrogens, or all bonds and angles.

A dynamics run has basically four parts: initialization, heating,
equilibration, and the simulation itself.
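As a brief aside on the integration scheme and the accuracy test described
above, the following is a minimal Python sketch, not CHARMM code. It
integrates a one-dimensional harmonic oscillator with the basic position
Verlet recurrence and compares the RMS fluctuation of the total energy with
that of the kinetic energy. The function names, the unit-free parameters
and the choice of a harmonic potential are all assumptions made for
illustration.

```python
import math

def verlet_harmonic(k=1.0, m=1.0, x0=1.0, v0=0.0, dt=0.01, nstep=2000):
    """Position-Verlet trajectory of a 1-D harmonic oscillator (illustrative).

    Returns lists with the total and kinetic energy at each step.
    """
    force = lambda x: -k * x          # F = -dU/dx for U = k*x**2/2
    x_prev = x0
    # Taylor step to obtain the second starting position.
    x_curr = x0 + v0 * dt + 0.5 * force(x0) / m * dt * dt
    total, kinetic = [], []
    for _ in range(nstep):
        x_next = 2.0 * x_curr - x_prev + force(x_curr) / m * dt * dt
        v_curr = (x_next - x_prev) / (2.0 * dt)   # central-difference velocity
        ke = 0.5 * m * v_curr * v_curr
        pe = 0.5 * k * x_curr * x_curr
        kinetic.append(ke)
        total.append(ke + pe)
        x_prev, x_curr = x_curr, x_next
    return total, kinetic

def rms_fluctuation(values):
    """Root-mean-square deviation of the values from their mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

total, kinetic = verlet_harmonic()
ratio = rms_fluctuation(total) / rms_fluctuation(kinetic)
print(f"RMS(E_total) / RMS(E_kinetic) = {ratio:.2e}")
```

A small ratio indicates that the total energy is well conserved relative to
the natural exchange between kinetic and potential energy; increasing dt in
this sketch makes the ratio grow.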
Initialization means providing an initial position and velocity for all
the atoms. Heating is the process of increasing the kinetic energy of the
system up to a final temperature at which the simulation will be
conducted. Equilibration is the process where the kinetic energy and the
potential energy of the system evenly distribute themselves throughout the
system. Only when the average temperature of the system stabilizes can one
collect the trajectory information for analysis.

The initial coordinates of a simulation are obtained after applying the
minimization algorithm to a complete coordinate set. One cannot start with
a system with a large potential energy as it will quickly heat up to
unreasonable temperatures. For initializing the velocities, the user can
specify velocities from the comparison coordinates (IASVEL 0), a uniform
distribution of kinetic energy along each coordinate with random sign of
the motion along each axis (IASVEL -1), or a Gaussian distribution of
velocities (IASVEL 1, the default). The temperature at which velocities
are assigned is determined by FIRSTT and TSTRUC by the algorithm:

      Tassign = 2*(FIRSTT-TSTRUC) + TSTRUC.

For a harmonic system equilibrated to TSTRUC, equal partition of the
energy will result in an equilibrated temperature of roughly FIRSTT. If
TSTRUC is not specified, 1.25*FIRSTT will be used for assignment.
Velocities may also be passed to dynamics in the comparison coordinate set
(as opposed to assignment). This allows the user considerable flexibility
in setting up the initial conditions.

The heating of the system is performed gently by increasing the kinetic
energy by a small amount periodically. The number of integration steps
between heating applications, the final temperature, and the kinetic
energy increment are all user specified. In addition, there is a choice in
the method of increasing the kinetic energy of the system. One may scale
existing velocities or reassign them.
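The velocity-assignment rule stated above (Tassign from FIRSTT and TSTRUC)
can be checked with a few lines of arithmetic. The Python sketch below is
only an illustration of that formula; the function names are hypothetical,
and the equipartition estimate is a simplified reading of the harmonic
argument in the text, not a description of CHARMM internals.

```python
def assignment_temperature(firstt, tstruc=-999.0):
    """Temperature at which initial velocities are assigned.

    Uses the rule from the text: Tassign = 2*(FIRSTT - TSTRUC) + TSTRUC,
    or 1.25*FIRSTT when TSTRUC is left at its -999. default.
    """
    if tstruc == -999.0:
        return 1.25 * firstt
    return 2.0 * (firstt - tstruc) + tstruc

def harmonic_equilibrated_temperature(t_assign, tstruc):
    """Simplified equipartition estimate (an assumption for illustration):
    kinetic energy assigned at t_assign is shared equally with potential
    energy that corresponds to a structure equilibrated at tstruc."""
    return 0.5 * (t_assign + tstruc)

t_assign = assignment_temperature(300.0, tstruc=280.0)
print(t_assign, harmonic_equilibrated_temperature(t_assign, 280.0))
print(assignment_temperature(300.0))   # TSTRUC left at its default
```

With FIRSTT 300.0 and TSTRUC 280.0 the velocities are assigned at 320 K,
and the equipartition estimate returns to 300 K, i.e. roughly FIRSTT as
the text states.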
The velocities can be scaled either by one scale factor calculated for the
kinetic energy of the system averaged over many time steps, or by scale
factors established for each atom based on the ratio of its time averaged
kinetic energy to that of the system. If reassignment is chosen, the
velocities can have either a uniform or a Gaussian distribution.

To equilibrate the structure, one can specify a window around the final
temperature where velocity adjustments will be made. The choice of
velocity adjustments is the same as described above for heating.

For the actual run, CHARMM will output the position and velocities of all
atoms at intervals specified by the user. The temperature window can be
set larger so that any gross conformational changes which result in a
different potential energy will cause the temperature to be maintained.

Any time energy is added to the system, the angular momentum of the system
will be reduced to zero and translational motion will be stopped. One can
also request that these operations be performed at any time during the
dynamics run.

The use of a restart file is essential for running dynamics. The restart
file contains information about the most recent coordinate sets necessary
for the VERLET algorithm. In addition, the values of the energy
accumulators are stored. All other information (such as SHAKE, fixed
atoms, harmonic constraints, friction coefficients) has to be regenerated
before invoking a dynamics restart. When the run is initiated, a restart
file must be written using the IUNWRI keyword. Each time the dynamics
routine completes NCYCLE steps of dynamics (see *note Output::), the
Fortran unit specified by IUNWRI will be rewound and a restart file will
be written. In case of crashes, one has restart files corresponding to
various points in the run. The CRASHU variable may prove valuable.
Successive runs of CHARMM to continue the dynamics run must read the
previous restart file using the IUNREA keyword and write it out for the
next part of the run. Restarts may be done to reset various options, or to
break up a long run into several shorter runs. Restart files will only run
with the version of CHARMM they are created with.

There are many numbers giving the frequency of actions to be taken during
dynamics, such as updating the non-bonded list, heating the molecule, etc.
Some of these numbers are adjusted along with the number of steps to run
so that the numbers all have a common divisor. At the present time, there
are combinations which result in errors. At some point an attempt may be
made to catalog all the actions, and check for erroneous processing.

If one is interested in simulating the motion of part of the system with
the rest of the system remaining fixed, it is possible to fix atoms in
place, see *note fix:(chmdoc/cons.doc)fixed atom. If this is done, there
are several effects on the dynamics. First, since the system is now
anchored in space, the center of mass motion and total angular velocity
are never stopped. Second, the number of degrees of freedom used for
calculating the temperature is set to the number of free atoms times 3
minus 6. Third, the coordinate and velocity trajectory files will contain
the position of the fixed atoms only once, and all other records will hold
just the moving atoms. This saves a great deal of disk space.

Trajectory files can be merged, broken into smaller pieces, and sampled at
different intervals. Likewise, these operations can be performed on
coordinate trajectories while rotating the coordinates to match a given
coordinate set.

When the DYNAmics command exits, the main coordinate set contains the
final coordinate positions from the last energy evaluation and the
comparison coordinates will contain the final velocities in AKMA units.

Finally, a brief discussion of the Langevin dynamics algorithm is
presented.
The Langevin dynamics algorithm presently in CHARMM was intended to be used primarily with the "Stochastic Boundary Molecular Dynamics" method and consequently has been restricted to an algorithm which is valid only for the case FBETA*TIMESTEP<<1.0, that is, for cases where relatively small friction coefficients are used. Typically, values of FBETA*TIMESTEP up to about 0.3 still produce stable dynamics which also satisfy the fluctuation-dissipation theorem. The algorithm itself reduces to the Verlet algorithm when FBETA is zero and consequently may be used to do regular dynamics; in fact, the same routine does both kinds of dynamics. In using Langevin dynamics, care must be taken to first initialize the array FBETA by using the scalar commands, e.g.,

        CHARMM >SCALAR FBETA SET

Failure to do this just means you are doing regular dynamics, so no warning is given by CHARMM.

File: Dynamc, Node: Output, Previous: Discussion, Up: Top, Next: Trajectory

                Contents of a dynamics output

Note: This description of the output of a command is not normally going to be part of the documentation of commands. The dynamics output is sufficiently confusing that this description is necessary.

The first part of CHARMM's output after a dynamics command lists all of the options that apply to that part of the run. Then, any information about velocity assignments (temperature changes) follows. Any time the velocities are changed in an anisotropic way, the motion of and about the center of mass will be stopped. This results in a printout both before and after this operation of the "DETAILS ABOUT CENTRE OF MASS". Its position and velocity are output, followed by the components of the angular momentum. The last line gives the translational kinetic energy of the system, and thus one should expect a drop in the total energy and temperature of the system afterwards. Non-bonded interaction and hydrogen bond updates will appear intermittently and are clearly labeled.
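
The quantities reported under "DETAILS ABOUT CENTRE OF MASS" can be illustrated with a minimal Python sketch. This is not CHARMM's implementation: the function names, the choice of taking the angular momentum about the centre of mass, and the use of arbitrary but consistent units are all assumptions made for illustration. The sketch shows why stopping the centre-of-mass motion lowers the total energy by exactly the translational kinetic energy.

```python
from math import fsum


def centre_of_mass_details(masses, positions, velocities):
    """Return (com, vcom, angular_momentum, translational_ke).

    Hypothetical helper: the angular momentum is taken about the centre
    of mass, and all inputs are assumed to be in consistent units.
    """
    total_mass = fsum(masses)
    com = [fsum(m * r[k] for m, r in zip(masses, positions)) / total_mass
           for k in range(3)]
    vcom = [fsum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
            for k in range(3)]
    ang = [0.0, 0.0, 0.0]
    for m, r, v in zip(masses, positions, velocities):
        dx, dy, dz = (r[k] - com[k] for k in range(3))
        ux, uy, uz = (v[k] - vcom[k] for k in range(3))
        # Accumulate m * (r - com) x (v - vcom)
        ang[0] += m * (dy * uz - dz * uy)
        ang[1] += m * (dz * ux - dx * uz)
        ang[2] += m * (dx * uy - dy * ux)
    # Kinetic energy carried by the overall translation: 0.5 * M * |Vcom|^2
    ke_trans = 0.5 * total_mass * fsum(c * c for c in vcom)
    return com, vcom, ang, ke_trans


def stop_translation(masses, velocities):
    """Subtract the centre-of-mass velocity from every atom."""
    total_mass = fsum(masses)
    vcom = [fsum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
            for k in range(3)]
    return [tuple(v[k] - vcom[k] for k in range(3)) for v in velocities]


masses = [1.0, 3.0]
positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
velocities = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
com, vcom, ang, ke_trans = centre_of_mass_details(masses, positions, velocities)
stopped = stop_translation(masses, velocities)
```

In this toy system both atoms drift together, so all of the kinetic energy is translational and it vanishes once the drift is removed. Stopping the rotation as well would additionally require the inertia tensor, which is left out of the sketch.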
Every NPRINT steps, the total energy and various contributions will be printed. This output is preceded by a title which gives the correspondence of numbers to energy names. After IPRFRQ steps, the averages and RMS fluctuations will appear. After the second such printout of averages and RMS fluctuations, the averages and RMS fluctuations for the run up to the last turning of the molecule will be given. This gives you longer range statistics. Such a calculation will not be done if IPRFRQ equals NTRFRQ. The ratio of total energy to kinetic energy fluctuations is an excellent measure of the accuracy of the run.

After the averages are printed, a least squares fit of the total energy against the step number will be made to look for drift in the energy. Two such values are printed, one for the last IPRFRQ steps, and one to the previous turn. Next, the initial energies for the statistics, both short range and long, are printed. Finally, the correlation coefficient of the energy versus step is given for both ranges. A value close to zero indicates no systematic drift; a magnitude near 1 means you have a real problem with the dynamics.

This process of printout continues until the end of the run is reached. Just before the last energy is printed, a message about the writing of coordinates and velocities to their respective files will appear.

        Output of the lambda dynamics and post-processing

(a) Output

The output of the lambda dynamics, i.e. the histograms and the biasing potentials on the lambda variables, is written in a separate file from the coordinate file. Like a regular coordinate file of the dynamics run, the lambda dynamics output file will automatically include a header and an integer array. They are used to provide the information on the values of NSTEP, NSAVL, NPRIV etc. To give the output file a title (in compliance with the CHARMM file requirement), the command LDTItle (similar to the TITLE command) can be used. E.g.
LDTItle * mte: Methanol to ethane * output for lambda dynamics * will write out a title before any other output data. The information on biasing potentials will also be written out. It takes a similar form as they were read in (see BLOCK.DOC), i.e. INTEGER : total No. of biasing potentials. I J CLASS REF CFORCE NPOWER : the format for individual one. INTEGER : the total no. of blocks. To specify the output fortran unit and the writing frequency, keywords IUNLDM and NSAVL are used. They are treated in the same fashion as IUNCRD and NSAVC. There is no separate restart file for the lambda dynamics. The information necessary for restarting a lambda dynamics is included at the end of a regular dynamic restart file. Thus, to restart the lambda dynamics is exactly same as restarting a regular dynamics run except you have to specify IUNLDM and NSAVL. E.g !input title for lambda i/o LDTITLE * This is a test * output for lambdas * open unit 11 writ form name output_file open unit 12 read form name input_file open unit 15 writ file name histogram dyna rest leap time 0.001 - nstep 10 nprint 1 iprfrq 10 - iunrea 12 iunwri 11 iuncrd -1 nsavc 1 IUNLdm 15 NSAVL 5 - first 300. - inbfrq 40 nbxmod 5 atom cdie shif vatom vdist vshif - cutnb 8. ctofnb 7.5 ctonnb 6.5 eps 1. e14fac 0.4 wmin 1.5 - cutim 8. imgfrq 40 Caution: the file is an unformatted output. However, the order of the output is very similar to a regular output file: (1) header, icntrl (automatically written) (2) title (3) total no. of biasing potentials (4) form of each biasing potential ( total = Nbias) (5) total no. of blocks (6) lambda**2 ( total = No. of blocks) (b) Post-processing WHAM ?. 
END  File: Dynamc, Node: Trajectory, Previous: Output, Up: Top, Next: Merge Reading and writing trajectory frames with direct commands This facility allows the creation or manipulation of trajectory files The main uses of this facility are; 1) creating artificial trajectory files from coordinate frames 2) reading an existing trajectory frame by frame for analysis that requires access to a variety of CHARMM commands 3) modifying an existing trajectory (copy with changes) such as minimizing each frame or other operations. [Syntax TRAJectory command] =================================================================== There are four commands that comprise this facility. 1) Initializing trajectory I/O TRAJectory {read-spec} {write-spec} read-spec:== [IREAd unit] [NREAd int] [SKIP int] [BEGIN INT] [STOP INT] write-spec:== [IWRIte unit] [NWRIte int] [NFILE int] [EXPAnd] [NOTHer int] [DELTa real] [SKIP int] IREAd - first unit to read from (default: do not read) NREAd - number of units to read from (default:1) SKIP - skip value for both reading and writing (default:1) IWRIte - first unit to write to (default: do not write) NWRIte - number of units to write to (default:1) NFILe - number of frames on each output file (default: total) EXPAnd - flag to free fixed atoms in copying (only if reading) NOTHer - number of frames in previous files (if not reading) (d:0) DELTa - output delta value (if not reading) (default:0.001) 2) Reading a frame TRAJectory READ [COMP] The reading command does not have any specifiers other than whether the comparison or main coordinates will be used. 3) Writing a frame TRAJectory WRITe [COMP] The writing command does not have any specifiers other than whether the comparison or main coordinates will be used. 4) Query a trajectory file TRAJectory QUERy UNIT integer The query command rewinds an open trajectory file and then reads the header information from this trajectory file. 
It prints a summary and sets the following command line substitution parameters:

   'NFILE' - Number of frames in the trajectory file
   'START' - Step number for the first frame
   'SKIP'  - Frequency at which frames were saved
             (NSTEP=NFILE*SKIP when not using restart files)
   'NSTEP' - Total number of steps from the simulation
   'NDEGF' - Number of degrees of freedom from the simulation
             (can be used to get the temperature with velocity files).
   'DELTA' - The dynamics step length (in picoseconds).

This command, again, rewinds the trajectory file upon completion.

===================================================================
There are three modes of operation;

1) Create a new trajectory.

The IWRIte and NFILe keywords must be used. The default values for the others are listed above. If several files will be made in different CHARMM runs that will be linked together later, the NOTHer keyword value should be increased by NFILe on each subsequent run.

EXAMPLE: Create a "movie" trajectory that involves the rotation of a single sidechain (residue 21).

        COOR AXIS SELE ATOM * 21 CA END SELE ATOM * 21 CB END
        OPEN WRITE UNIT 22 FILE NAME TYR21.ROT
        TRAJECTORY IWRITE 22 NWRITE 1 NFILE 360 SKIP 1
        * trajectory showing the rotation of sidechain 21
        *
        SET 1 1
        LABEL LOOP
        COOR ROTATE AXIS PHI 1.0 SELE ATOM * 21 * .AND. .NOT. ( TYPE C -
                 .OR. TYPE N .OR. TYPE H ) END
        TRAJ WRITE
        INCR 1 BY 1
        IF 1 LT 360.5 GOTO LOOP
        STOP

===================================================================
2) Reading an existing trajectory

The IREAD keyword must be used. The default NFILe value is 1 and the remaining values, if not specified, will be read from the trajectory file.

EXAMPLE: find the structure with the lowest energy and save it.

        UPDATE ...
        OPEN READ UNIT 22 FILE NAME DYN1.TRJ
        OPEN READ UNIT 23 FILE NAME DYN2.TRJ
        TRAJECTORY IREAD 22 NREAD 2 SKIP 100
        SET 1 1
        SET 9 9999.0
        CALC NTOT = ?NFILE * 2
        LABEL LOOP
        TRAJ READ
        GETE
        IF 9 LT ?ENER GOTO NEXT
        SET 8 @1
        COOR COPY
        SET 9 ?ENER
        LABEL NEXT
        INCR 1 BY 1
        IF 1 LT @NTOT GOTO LOOP
        OPEN WRITE CARD UNIT 12 NAME LOWE.CRD
        WRITE COOR COMP CARD UNIT 12
        * structure with the lowest energy
        * frame number @8 with energy @9
        *
        STOP

===================================================================
3) Copying from one trajectory to another.

This mode works in the same way as the MERGE command, except that a variety of CHARMM commands can be executed between the reading and writing of frames.

EXAMPLE: Create a new trajectory where every frame is minimized for 200 steps.

        OPEN READ UNIT 22 FILE NAME DYN.TRJ
        OPEN WRITE UNIT 32 FILE NAME DYN.MIN
        TRAJECTORY IREAD 22 SKIP 100 IWRITE 32
        * minimized trajectory
        *
        SET 1 1
        LABEL LOOP
        TRAJ READ
        MINI ABNR NSTEP 200
        TRAJ WRITE
        INCR 1 BY 1
        IF 1 LT ?NFILE GOTO LOOP
        STOP

File: Dynamc, Node: Merge, Previous: Trajectory, Up: Top, Next: Reorient

        Merges or breaks up a trajectory into different numbers of files

Frequently, one generates a trajectory as many small files to minimize the CPU time of any one job. However, so many files are usually hard to manage, so it is desirable to merge them into larger units. This command provides that capacity. In addition, it is possible to break up the trajectory into smaller pieces and to sample the trajectory less frequently than originally generated. Another option is to rotate the structure at each frame to least squares fit a reference structure.
[Syntax MERGE dynamics trajectories] MERGE [ COOR ] [FIRSTU unit-number] [NUNIT integer] [SKIP integer] [ VEL ] [OUTPutu unit-number] [NFILE integer] [ DRAW ] [BEGIN integer] [STOP integer] [first-atom-selection] [ XFLUct ] [ UNFOld ] [ ORIEnt [MASS] [WEIGht] [NOROt] [PRINT] second-atom-selection ] [ RECEnter second-atom-selection] Keyword table Option Default Purpose [COOR] COOR Specification of the type of trajectory file. COOR is [VEL ] coordinates; VEL is velocities. [DRAW] Make a CHARMM movie (do not write any files, just display) FIRSTU 51 The first unit of the trajectory to be read. NUNIT 1 The number of units to be read starting with FIRSTU SKIP 1 Only those coordinate whose dynamics step number modulo SKIP will be reoriented and written out. OUTPUTU 61 The first unit number of the output trajectory NFILE The number of coordinate sets written to each output merged file. If left out, this will be set to the number of coordinates in the first input file times the number of input files. WARNING: This default will generate a bad trajectory file if SKIP is not set to the interval actually present in the trajectories. Further, if you set its value to be larger than the number of coordinates that are actually written in any output file, you will have problems. The error that is generated results from the control array in the beginning specifying that there are more coordinates than actually exist in the file. EOF errors will result when the trajectory is read. BEGIN First step number to start reading from STOP Last step number to read first-atom-sel Selection of atoms to include in the output file. RECEnter Will re-center atoms based on the existing IMAGE transformations ("coor ... rece ..") thus HAS to be preceded by a normal image setup (read image, image byresidue ..) for the atoms (usually solvent) that are to be transformed as if the center of the primary box coincided with the center of geometry of the atoms in the second selection. 
In short: The second selection defines the origin of your lattice and the solvent molecules are put as close as possible to the solute, even if things drifted slightly out of the box during the simulation. Useful for calculation of solvation properties. Does not work with XFLUct or UNFOld. The possibly large amounts of output reporting on all transformation operations being performed may be suppressed by setting PRNLev 4, or PRNLev 1 to get rid of the SELECTE IMAGES BEING CENTERED messages as well. NOTE: Uses SAME second-selection as ORIENt, so if both RECEnter and ORIEnt are specified there should only be one second selection, which will first be used to define the recentering, then for the orienting. UNFOld removes the effects of image centering (not the same as RECEnter), ie will let a particle continue out to the right if that is what it was doing when PBC moved it back into the primary box during the simulation. XFLUct removes the effects of the box size/shape changes from constant pressure simulations. This allows an accurate calculation of transport properties (diffusion constants,..) from CPT trajectories. ORIEnt Flag to specify best fit rotation and translations. MASS Use mass weighting in best fit. WEIGht Use weighting array for best fit weights. NOROt Only translate in the best fit. PRINT Print the details of best fit second-atom-sel Selection of atoms to use in the best fit. NOTE: Uses SAME second-selection as RECEnter (see above) The title of the output trajectory will be copied from the input trajectory. NOTE: If the input trajectory is from a CRYStal simulation, the CRYStal setup has to be invoked also before the MERGe operation if CRYStal data is to be written out to the resulting trajectory.  
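
The bookkeeping implied by the SKIP, BEGIN, STOP and NFILE options of MERGE can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `plan_merge` is hypothetical, frames are represented by their dynamics step numbers alone, and a frame is taken to be kept when its step number modulo SKIP is zero.

```python
def plan_merge(steps, skip, nfile, begin=None, stop=None):
    """Group the step numbers of the kept frames into output files.

    Hypothetical helper, not CHARMM code. Assumes a frame is kept when
    begin <= step <= stop and step % skip == 0, and that each output
    file receives at most nfile frames.
    """
    kept = [s for s in steps
            if (begin is None or s >= begin)
            and (stop is None or s <= stop)
            and s % skip == 0]
    return [kept[i:i + nfile] for i in range(0, len(kept), nfile)]


# Ten frames saved every 100 steps, resampled every 200 steps,
# two frames per output file.
input_steps = list(range(100, 1100, 100))
files = plan_merge(input_steps, skip=200, nfile=2)
windowed = plan_merge(input_steps, skip=100, nfile=3, begin=300, stop=800)
```

In the first call the last output file holds a single frame. This is the situation the NFILE warning describes: a file whose header promises more coordinate sets than it contains will produce EOF errors when read.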
File: Dynamc, Node: Reorient, Previous: Merge, Up: Top, Next: RMSDyn Reorienting a coordinate trajectory If one is interested in reorienting every set of coordinates found in a dynamics trajectory with respect to some reference structure, one can use the ORIEnt option in conjunction with the MERGe command. The process of reorienting a coordinate trajectory works as follows: A series of files containing the trajectory are assigned to successive units prior to a CHARMM run. The coordinates stored therein are presumed to have been written every NSAVC steps. CHARMM will read each coordinate, select some periodically, reorient them, and write them to successive units where each output file will have a user specified number of coordinates. The following table lists the options involved: Option Default Purpose ORIE .false. Specify that a least squares RMS fit will be done. MASS .false. Use a mass weighting in the fit WEIGH .false. Use the weighting array (wmain) in the fit NOROt .false. Just shift the centers to best fit. PRINt .false. Print what happened to each coordinate set. atom-selection all Select which atom to use in the fit. If atoms were fixed during the dynamics, the new trajectory produced will not have fixed atoms because the rotations applied to each coordinate set will be different thereby yielding different coordinates for the fixed atoms. Fixing the coordinates leads to a large space reductions, so the reorientation process will therefore result in potentially much larger trajectory files. See *note fix: (chmdoc/cons.doc)Fixed Atom.  File: Dynamc, Node: RMSDyn, Previous: Reorient, Up: Top, Next: Format Computes the RMS difference between two trajectory files and make a matrix of results. Large files should be reduced with the MERGe command before processing this command. 
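
The matrix produced by this command can be pictured with a minimal Python sketch. It is not the CHARMM algorithm: no best-fit superposition (ORIEnt) is performed, frames are plain lists of (x, y, z) tuples, and the layout (rows for the first trajectory, columns for the second) is an assumption made for illustration.

```python
from math import sqrt


def rms_difference(frame_a, frame_b):
    """RMS coordinate difference between two frames with the same atoms."""
    total = sum((a[k] - b[k]) ** 2
                for a, b in zip(frame_a, frame_b)
                for k in range(3))
    return sqrt(total / len(frame_a))


def rms_matrix(traj_i, traj_j):
    """Entry [i][j] compares frame i of traj_i with frame j of traj_j."""
    return [[rms_difference(fi, fj) for fj in traj_j] for fi in traj_i]


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # same frame moved 2 units along z
matrix = rms_matrix([reference], [reference, shifted])
```

The cost grows with the product of the two frame counts, which is why large trajectories are best reduced with MERGe first.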
RMSDynamics  ORIEnt [MASS] [WEIGht] [NOROt] [RMS] atom-selection
             [IREAd unit-number] [JREAd unit-number] [IWRIte unit-number]
             [BEGIn integer] [STOP integer] [IMAGes]

IREAd int      - Unit number of the first trajectory file.
JREAd int      - Unit number of the second trajectory file.
IWRIte int     - Unit for the output matrix.
BEGIn int      - Starting step number (default: first)
STOP int       - Ending step number (default: last)
IMAGes         - Use image atoms for the analysis
ORIEnt         - Do best fit of structures
MASS           - Use a mass weighting in best fit.
WEIGht         - Use the weighting array in best fit.
NOROt          - Best fit without letting the structures rotate.
RMS            - Do RMS fit between structures; otherwise, align structures with the axis.
atom-selection - Atoms to use in the fitting procedure.

File: Dynamc, Node: Format, Previous: RMSDyn, Up: Top

        Format or unformat a dynamics trajectory

DYNAmics FORMat FIRStunit NUNIt BEGIn SKIP STOP OUTPut OFFSet SCALe MODE
DYNAmics UNFOrmat INPUt OUTPut

These commands convert binary trajectory files into a machine independent yet compact format and convert them back into binary files. The defaults for OFFSet, SCALe and MODE are: OFFSet=600, SCALE=10000, and MODE=12Z6. The trajectory is converted into positive integers according to the formula integer = INT((coordinate + OFFSET) * SCALE). The user has to make sure that all coordinates of the trajectory are within OFFSET angstroms of the origin. The precision may be increased by choosing a larger SCALE and FORTRAN-format, e.g. MODE=11Z7 SCALE=100000. ("Z" is the hexadecimal format and is available on most machines.)

------------------
CONSTANT VELOCITY
------------------

A constant velocity method has been developed for use with DYNA (right now, it only works with the LEAP [in CHARMM] and LOBATTO [in MBO(N)D] integrators). The main purpose of this facility is to run simulations similar to atomic force microscopy. The constant velocity method, therefore, is used in conjunction with the NOE facility, which is used to apply a 'spring' between two atoms.
A constant velocity for an atom is entered via CVEL in CHARMM syntax: CVELocity where = constant velocity in Angst/ps; the constant velocity vector and direction is defined from to . the position of the , typically a dummy atom, is moved to the position of + 0.0001 Angstr. along the vector (because charmm does not like duplicate coordinates); then traverses along the vector at the constant velocity rate. The second atom is not really needed, but it is helpful in analyzing the vector visually before running dynamics. Note: If you want to apply a spring between the constant velocity atom and the first atom in the vector, you must use (currently) the NOE facility in charmm. --- Here are the relavent syntax from a sample input file (typical usage). --- *afm.inp *Simulated Atomic Force Microscopy *Continually loops over 10ps segments of dynamics (NVT'ish) * ...lots of typical charmm stufff... !--------DEFINITIONS !Two atoms, one is the near the end of myosin, the other is a dummy atom ! to be cvel'ed define tip SELE atom dumm 1 dumm END define pp SELE atom hc 835 ca END !Actin binding region define actb sele segid hc .and. (resi 405:415 .or. resi 529:550 .or. - resi 626:647) end !----FORCES set f 4. !spring constant; See Grubmueller Science 1996, 271, 997 set com 100 !force used to pin actin binding site set max 80 !tot number of dyn runs--arbitrary right now !##CVEL !## These two atoms define the pulling vector; the first selection !## is the pull point, and the second selection is the atom that moves at !## constant velocity along the pull point. Currently, the 'spring' !## between these two atoms is defined using the NOE facility below. cvel @{cv} SELE pp END SELE tip END !set up spring between atoms in cvel noe reset assign SELE pp END SELE tip END - kmin 0.0 rmin 0.0 kmax @f rmax .00001 fmax 1000 PRINT ANAL end label skip !----Pin protein cons harm sele actb end force @{com} DYNAMICS MBOND (or LEAP) (re)START - dynamics equilibration or constant temperature method. 
!lots of loops over the above stop References: 1. Grubmueller Science 1996, 271, 997. 2. "The Evaluation Of Multi-Body Dynamics For Studying Ligand-Protein Interactions. Using MBO(N)D To Probe The Unbinding Pathways Of Cbz-Val-Phe-Phe-Val-Cbz From The Active Site Of Hiv-1 Protease" Chin, D. N.; Haney, D. N.; Delak, K.; Chun, H. M.; Padilla, C, In Rational Drug Design; Parrill, A., Reddy, R. Eds.; ACS Washington, 1998, in press. MODIFIED CODE IN CHARMM 26?? ------------------ a build/sgi/newmk/charmm.mk 9 blocks a build/sgi/newmk/dynamc.mk 25 blocks a build/sgi/newmk/mbond.mk 17 blocks a source/fcm/newfcm/cveloci.fcm 2 blocks a source/dynamc/newsrc/cveloci.src 9 blocks a source/dynamc/newsrc/dcntrl.src 110 blocks a source/dynamc/newsrc/dynamc.src 104 blocks a source/charmm/newsrc/charmm_main.src 47 blocks a source/charmm/newsrc/iniall.src 57 blocks a source/moldyn/newsrc/compin.f 24 blocks a source/moldyn/newsrc/delta_v.f 19 blocks a source/moldyn/newsrc/engmom.f 21 blocks a source/moldyn/newsrc/engmom_ke.f 9 blocks a source/moldyn/newsrc/mbdyna.f 58 blocks a source/moldyn/newsrc/ydot.f 74 blocks a source/moldyn/newsrc/CHARMM.INC a source/mbond/newsrc/mbback.src 52 blocks a source/mbond/newsrc/mbdyn.src 40 blocks CHARMM Element doc/eef1.doc $Revision: 1.3 $  File: EEF1, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax Effective Energy Function 1 EEF1 is an effective energy function combining the CHARMM 19 polar hydrogen energy function (with certain modifications, see below) with an excluded volume implicit solvation model. The solvation model is similar in spirit to the Atomic Solvation Parameter approach, but does not use surface areas and is therefore much faster. Latest benchmarks say that simulations with EEF1 take about 50% longer than the corresponding vacuum simulation. 
The solvation model assumes that the solvation free energy of each group is equal to the solvation free energy of that group in a small model compound less the amount of solvation it loses due to solvent exclusion by other atoms of the macromolecule around it. The exclusion effects of nearest and next-nearest neighbors (1-2 and 1-3 interactions) are neglected because such neighbors also exist in small model compounds. The CHARMM nonbonded atom and exclusion lists are used for the solvation calculation.

Because not only DG but also DH and DCp data are available, we can calculate the solvation free energy at different temperatures. This calculation assumes a DCp independent of temperature. Therefore extrapolation to temperatures very different from 300 K is not reliable.

EEF1 refers not only to the implicit solvation model but also to the specific modifications and nonbonded options used in CHARMM. The nonbonded options must be: ctonnb 7. ctofnb 9. cutnb 10. group rdie (see example file below).

Three files are needed to use EEF1:

toph19_eef1.inp : This is a modification of toph19.inp where ionic sidechains and termini are neutralized; it contains an extra parameter type (CR)
param19_eef1.inp: This is a modification of param19.inp which includes the extra parameter type (CR)
solvpar.inp     : This file contains the solvation parameters

When the INTE command is used with EEF1, the number listed under ASP is the amount of solvation free energy that is excluded between the two atom selections. For example, the INTE between atom A and atom B will give the amount of solvation A loses due to B plus the amount B loses due to A. The command "INTE sele all end" will give the amount of solvation free energy excluded, not the total solvation free energy of all atoms. That is, it is not equivalent to "ENERGY".

EEF1 can be used with images. In that case the ASP energy term refers to the solvation free energy of the primary atoms.
This is usually less negative than when images are not present, because image atoms exclude some solvation free energy from the primary atoms. * Menu: * Syntax:: Syntax of the EEF1 commands * References:: Useful references * Example:: Input file  File: EEF1, Node: Syntax, Up: Top, Next: References, Previous: Top Syntax for EEF1 There are only two EEF1 commands: EEF1 SETUP [TEMP real] UNIT int NAME solv_param_file EEF1 PRINT The first sets up the solvation calculation by giving TEMP and reading in the solvation parameters. And the second prints out the solvation of each group. The solvation energy is stored in ETERM(ASP) and reported under the name "ASP". Obviously, it makes no sense to use both ASP and EEF1. If one wants to skip the solvation term after one has set it up, one can issue the command SKIP ASP. TEMP is the temperature to which the solvation parameters refer to (default is 298.15). Note that this is unrelated to the temperature at which one runs dynamics. It just determines the solvation free energy parameter values. PRINT prints out the solvation free energy of each atom/group as well as the solvation enthalpy and heat capacity  File: EEF1, Node: References, Up: Top, Next: Example, Previous: Syntax References [1] T. Lazaridis and M. Karplus, Effective energy function for proteins in solution, Proteins, 35:133-152 (1999) [2] T. Lazaridis and M. Karplus, Discrimination of the native from misfolded protein models with an energy function including implicit solvation, J. Mol. Biol., 288:477-487 (1999) [3] T. Lazaridis and M. 
Karplus, "New View of Protein Folding reconciled with the Old through Multiple Unfolding Simulations", Science, 278:1928 (1997)  File: EEF1, Node: Example, Up: Top, Next: Top, Previous: References --------------------------------------------------------------------- * Example file for EEF1 * open read card unit 3 name toph19_eef1.inp read rtf unit 3 card close unit 3 open read card unit 3 name param19_eef1.inp read para unit 3 card close unit 3 open read unit 3 card name filename.crd read seque coor unit 3 close unit 3 generate main setup open read unit 2 card name filename.crd read coor card unit 2 close unit 2 ! IMPLICIT SOLVATION SETUP COMMAND ! The nonbonded options below are part of the model eef1 setup temp 298.15 unit 93 name solvpar.inp update ctonnb 7. ctofnb 9. cutnb 10. group rdie mini abnr nstep 300 !This command prints out solvation free energy for each atom eef1 print dynamics verlet timestep 0.002 nstep 1000 nprint 100 iprfrq 100 - firstt 240 finalt 300 twindh 10.0 ieqfrq 200 ichecw 1 - iasors 0 iasvel 1 inbfrq 20 inte sele resid 2 end sele resid 19 end !the command below is not equivalent to energy inte sele all end energy skip asp energy stop CHARMM Element doc/energy.doc 1.1  File: Energy, Node: Top, Up: (chmdoc/commands.doc), Next: Description Energy Manipulations: Minimization and Dynamics The main purpose of CHARMM is the evaluation and manipulation of the potential energy of a macromolecular system. In order to compute the energy, several conditions must be met. There are also several support commands which directly relate to energy evaluation. * Menu: * Description:: Description of the energy commands * Skipe:: Selection of particular energy terms * Interaction:: Computation of interaction energies and forces. 
* Fast::        Requirements for using the fast routines
* Needs::       Requirements for all energy evaluations
* Optional::    Optional actions to be taken beforehand

File: Energy, Node: Description, Up: Top, Next: Skipe, Previous: Top

        Syntax for Energy Commands

There are two direct energy evaluation commands. One is parsed through the minimization parser and the other involves a direct call to GETE. See *note Minimiz:(chmdoc/minimiz.doc), and *note Gete:(chmdoc/usage.doc)interface. In addition to getting the energy, the forces are also obtained.

The ENERgy command. (processed through the minimization parser)

[SYNTAX ENERgy]

ENERgy [ nonbond-spec ] [ hbond-spec ] [ image-spec ] [ print-spec ] [ COMP ]
       [ INBFrq 0 ] [ IHBFrq 0 ] [ IMGFrq 0 ] [NOUPdate]

hbond-spec     *note Hbonds:(chmdoc/hbonds.doc).
nonbond-spec   *note Nbonds:(chmdoc/nbonds.doc).
image-spec     *note Images:(chmdoc/images.doc)Update.

If the COMP keyword is specified, then the comparison coordinate set is used, but this disables the use of the fast routines. The keyword NOUPdate turns off all update routines, and thus requires all lists to be present already.

The GETE command. (a direct call to GETE)

[SYNTAX GETEnergy]

GETE [ COMP ] [ PRINt [ UNIT int ] ]
              [ NOPRint ]

For this command to work, all lists must be set up. This is best done through the UPDAte command. The COMP keyword will cause the comparison coordinate set to be used. The PRINt keyword will result in a subsequent call to PRINTE in order to print the energy. If the PRINt keyword is not specified, then NO indication that the energy has been evaluated will be given.

The UPDAte command (sets up required lists for GETE)

[SYNTAX UPDAte lists]

UPDAte [ nonbond-spec ] [ hbond-spec ] [ image-spec ] [ COMP ]
       [ INBFrq 0 ] [ IHBFrq 0 ] [ IMGfrq 0 ]
       [ EXSG {list-of-segment-names} | EXOF ]

The update command will set up the codes lists and also create a nonbond list (unless INBFrq is 0) and a new hbond list (unless IHBFrq is 0).
If the COMP keyword is specified, then the comparison coordinates will be used in setting up the nonbond and hbond lists. The EXSG keyword, with an optional following list of segment names, allows some nonbonded interactions (ELEC & VDW) to be excluded. If the list of names is empty, ALL INTERsegment nonbonded interactions will be excluded. If the list is not empty, all INTER and INTRA segment nonbonded interactions for the listed segments will be excluded. EXOF turns off this option. H-bond energies (HBON) are not affected at the moment (Dec 3, 1991).

File: Energy, Node: Skipe, Up: Top, Next: Interaction, Previous: Description

        Skipping selected energy terms

There is a facility to skip any desired energy terms during energy evaluation. For each energy term there is an associated logical flag determining whether that energy term is to be computed. Specifications are processed sequentially. The default operation is INCLude, which implies that subsequent energy terms are to be removed from the energy calculation. NOTE: EXCLude implies that the energy term is to be computed. If for some reason the list presented here is out of date, the data in SKIPE(energy.src) and in ENER.FCM of the source should be consulted.
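
The sequential processing described above can be modelled with a small Python sketch. It is an illustration only, not the SKIPE source: the function name `apply_skip` is hypothetical, only a handful of term names are included, each term must be spelled out exactly, and the NONE keyword is deliberately left out because its behaviour is not described here.

```python
# A small subset of energy terms, chosen for illustration only.
TERMS = ("BOND", "ANGL", "DIHE", "IMPR", "VDW", "ELEC", "HBON")


def apply_skip(spec, skipped=None):
    """Return the set of skipped terms after processing a SKIP specification.

    Words are processed left to right. INCLude (the default) adds the
    following terms to the skip list; EXCLude removes them from it, so
    that they are computed again.
    """
    skipped = set() if skipped is None else set(skipped)
    include_mode = True  # default operation is INCLude
    for word in spec.upper().split():
        if word[:4] == "INCL":
            include_mode = True
        elif word[:4] == "EXCL":
            include_mode = False
        elif word == "ALL" or word in TERMS:
            targets = TERMS if word == "ALL" else (word,)
            if include_mode:
                skipped.update(targets)
            else:
                skipped.difference_update(targets)
        else:
            raise ValueError("unknown SKIP item: " + word)
    return skipped


only_bond = apply_skip("ALL EXCL BOND")           # everything skipped except BOND
reset = apply_skip("EXCL ALL", skipped={"ELEC"})  # nothing skipped any more
no_nonbond = apply_skip("ELEC VDW")               # ELEC and VDW skipped
```

The point of the sketch is that the order of the words matters: the same term list gives opposite results depending on whether it follows INCLude or EXCLude.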
Syntax:

[SYNTAX SKIP energy terms]

                [ INCLude ]
                [ EXCLude ]
SKIPe  repeat(  [ ALL     ]  )
                [ NONE    ]
                [ item    ]

item::= [ BOND ] [ ANGL ] [ UREY ] [ DIHE ] [ IMPR ] [ VDW ] [ ELEC ] [ HBON ]
        [ USER ] [ HARM ] [ CDIH ] [ CIC ] [ CDRO ] [ NOE ] [ SBOU ] [ IMNB ]
        [ IMEL ] [ IMHB ] [ XTLV ] [ XTLE ] [ EXTE ] [ RXNF ] [ ST2 ] [ IMST ]
        [ TSM ] [ QMEL ] [ QMVDW] [ ASP ] [ EHARM] [ GEO ] [ MDIP ] [ STRB ]
        [ VATT ] [ VREP ] [ IMVREP ] [IMVATT] [ OOPL ]

description:
   BOND   - bond energy
   ANGL   - angle energy
   UREY   - Urey-Bradley energy term
   DIHE   - dihedral energy
   IMPR   - improper dihedral energy
   VDW    - van der Waals energy
   ELEC   - electrostatic energy
   HBON   - hydrogen bond energy
   USER   - user supplied energy (USERLINK)
   HARM   - harmonic positional constraint energy
   CDIH   - constrained dihedral energy
   CIC    - internal coordinate constraint energy
   CDRO   - quartic droplet potential energy
   NOE    - NOE general distance restraints
   SBOU   - solvent boundary energy
   IMNB   - image van der Waals energy
   IMEL   - image electrostatic energy
   IMHB   - image hydrogen bond energy
   XTLV   - crystal van der Waals energy
   XTLE   - crystal electrostatic energy
   EXTE   - extended electrostatic energy
   RXNF   - reaction field energy
   ST2    - ST2 water-water energy
   IMST   - image ST2 water-water energy
   TSM    - TSM free energy term.
   QMEL   - energy for the quantum mechanical atoms and their electrostatic interactions with the MM atoms using the AM1 or MNDO semi-empirical approximations
   QMVDW  - van der Waals energy between the quantum mechanical and molecular mechanical atoms
   ASP    - solvation free energy term based on Wesson and Eisenberg surface area method
   EHARM  - second harmonic restraint term (for implicit Euler integration)
   GEO    - Mean-Field-Potential energy
   MDIP   - MDIPole mean fields constraints
   STRB   - stretch-bend interaction (MMFF)
   VATT   - VdW attraction (MMFF)
   VREP   - VdW repulsion (MMFF)
   IMVREP - image VdW repulsion (MMFF)
   IMVATT - image VdW attraction (MMFF)
   OOPL   - out-of-plane (MMFF)

Examples;

   SKIP ALL EXCL BOND   - do just bond energy
   SKIP EXCL ALL        - return flags to default state
   SKIP ELEC VDW        - throw out electrostatics and van der Waals energy

File: Energy, Node: Interaction, Up: Top, Next: Fast, Previous: Skipe

        Interaction energies and forces

The INTEraction command computes the energy and forces between any two selections of atoms.

[SYNTAX INTEraction energy]

INTEraction [ COMP ] [ NOPRint ] 2x(atom-selection) [UNIT int]

If only one atom selection is given, then a self energy will be computed.

This routine is quite efficient and may be used within a CHARMM loop without too much overhead, though there are some restrictions. The COMP keyword causes the comparison coordinates to be used. The NOPRint keyword will prevent the results from being printed. This routine works in the same manner as the GETE command in that all of the lists (CODES, nonbond, and Hbond) must be specified before invoking this command. One difference is that SHAKE will not be respected with this command (i.e. if the coordinates don't satisfy the constraints, neither will the energy).

The following energy terms may be computed by this routine (unless suppressed with the SKIP command);

   Bond   - Energy defined by the two atoms involved.
   Angles - Energy allocated to the central atom (auto energy only).
Dihedral - Energy defined between the central two atoms. Improper - Energy defined by the first atom (auto energy only). van der Waals - ATOM option only. Energy defined by the two atoms involved. Electrostatic - ATOM option only. Energy defined by the two atoms involved. Hbond - Energy defined by heavy atom donor and acceptor atom. Harmonic cons - Energy allocated to central atom (auto energy only). Dihedral cons - Energy defined by central two atoms. User energy - Atom selections may be passed to USERE in the selection common (DEFIne command). Fill forces and energies as desired. All other energy terms will be zeroed. For terms listed "auto energy only", the corresponding atom must be present in both atom selections. For the remaining terms, one atom of the pair must be present in each of the atom selections. The energy division matches the method used in the analysis facility. This command will not work with the selection of image atoms, or the selection of ST2 waters. All energy terms not listed above will not be computed. The nonbond list must be generated with the ATOM and VATOM options. [T.Lazaridis, July 1999: Now INTE can work with the GROUP option] The individual energy terms are stored in the energy common and are available in commands and titles via the "?energy-term" substitution. The forces for all kept energy terms will be returned in the force arrays. Note that atoms that were not selected in either selection specification may still receive a force. This may happen for angle or dihedral terms on the first and last atoms. It may also happen in a similar manner for improper dihedrals, hydrogen bonding terms, and dihedral constraints.  File: Energy, Node: Fast, Up: Top, Next: Needs, Previous: Interaction [SYNTAX FASTer ] FASTer {integer} {OFF } {ON } {DEFAult} {SCALar } ! for testing only {VECTor } ! for testing only {CRAYvec } ! Use parallel code designed for a CRAY {PARVec } !
Use parallel/vector code best for SMP machines and Convex Instead of using an integer value, the FASTer command can be issued with one of the following keywords. Keyword Equivalent integer ---------------- ---------- FASTer OFF -1 DEFAult 0 ON 1 SCALar 2 VECTor 3 The FASTer keyword or integer defines which versions of the energy routines are used. FASTer -1 : Always use slow routines FASTer 0 : Use fast routine if possible, no error if cannot (default) FASTer 1 : Use best optimized routine for the current machine (Error message if cannot) FASTer 2 : Use fast scalar routine (Error message if cannot) FASTer 3 : Use fast vector routine (Error message if cannot) There exist a general and a fast version of the internal energy routines (bond, angle, dihedral, and improper dihedral). There is also a fast version of the nonbond energy evaluation (roughly 30-50% faster). These routines were designed for long minimization or dynamics calculations. To request the FAST routine, the FASTer command should be used with a positive integer or an appropriate keyword. A negative integer will disable the fast energy routines. If the fast routines are requested and it is not possible to use the fast routines, a warning will be issued, and the general routines will be used in their place. The fast routines are more efficient in several ways: (1) arrays are included in common files rather than passed (2) second derivatives have been removed (3) analysis and print options have been removed The restrictions are that: (1) the MAIN coordinate set must be used in the energy evaluations (2) second derivatives may not be requested (3) the PSF, parameter, and codes arrays must be used (from the common files) (4) a limited set of nonbond options must be used. The current nonbond options supported by the fast nonbond routine are as follows.
ATOM [CDIE] [SHIFt ] VATOM [VSHIft ] [RDIE] [SWITch ] [VSWItch ] [FSWItch] [VFSWitch] [FSHIft ] GROUP [CDIE] [SWITch ] VGROUP [VSWItch ] [RDIE] [FSWItch]  File: Energy, Node: Needs, Up: Top, Next: Optional, Previous: Fast Requirements before energy manipulations can take place Before the energy of a system can be evaluated and manipulated, a number of data structures must be present. First, a PSF must be present. Second, a parameter set must be present. It must contain all parameters which are required by the PSF being used. Third, coordinates must be defined for every atom in the system. An undefined coordinate has a particular value, and if two coordinates have the same value, division by zero will occur in the evaluation of the energy. If the positions of hydrogens are required, the hydrogen bond generation routine, see *note Hbond: (chmdoc/hbonds.doc), must be called before the energy is evaluated. Fourth, provisions must be made for having a hydrogen bond list and a non-bonded interaction list. Having non-zero frequencies for updating these lists is one way; one can also read these lists in, see *note read:(chmdoc/io.doc)read, or generate them with separate commands, see *note HBgen:(chmdoc/hbonds.doc), or *note NBgen:(chmdoc/nbonds.doc).  File: Energy, Node: Optional, Up: Top, Previous: Needs, Next: Substitution Optional actions you can take to modify the energy manipulations There exist several commands which can modify the way the potential energy is calculated or can affect the way energy manipulations are performed. The Constraint command, see *note Cons:(chmdoc/cons.doc), can be used to set constraints of various kinds. First, it can be used to set flags for particular atoms which will prevent them from being moved during minimization or dynamics. Second, it can be used to add a positional constraint term to the potential energy. This term will be harmonic about some reference position. The user is free to set the force constant.
Third, the user can place a harmonic constraint on the value of particular torsion angles in an attempt to force the geometry of a molecule. Other constraints are also available. The SHAKe command, see *note shake:(chmdoc/cons.doc)SHAKE, is used to set constraints on bond lengths and also bond angles during dynamics. It is very valuable in that it permits a larger step size to be used during dynamics. This is vital for dynamics where hydrogens are explicitly represented as the low mass and high force constant of bonds involving hydrogen require a ridiculously small step size. The user interface commands can be used to modify the calculation of the potential and to add another term to the potential energy. See *note Modify:(chmdoc/usage.doc)interface for details.  File: Energy, Node: Substitution, Up: Top, Previous: Optional, Next: Top The following command line substitution values may be included in any command or title. To get the total energy, the syntax; ...... ?TOTE ..... should be used. Energy related properties: 'TOTE' - total energy 'TOTK' - total kinetic energy 'ENER' - total potential energy 'TEMP' - temperature (from KE) 'GRMS' - rms gradient 'BPRE' - boundary pressure applied 'VTOT' - total verlet energy (no HFC) 'VKIN' - total verlet kinetic energy (no HFC) 'EHFC' - high frequency correction energy 'EHYS' - slow growth hysteresis energy correction 'VOLU' - the volume of the primitive unit cell = A.(B x C)/XNSYMM. Defined only if images are present, or unless specified with the VOLUme keyword. 'PRSE' - the pressure calculated from the external virial. 'PRSI' - the pressure calculated from the internal virial. 'VIRE' - the external virial. 'VIRI' - the internal virial. 'VIRK' - the virial "kinetic energy". 
Energy term names: 'BOND' - bond (1-2) energy 'ANGL' - angle (1-3) energy 'UREY' - additional 1-3 Urey-Bradley energy 'DIHE' - dihedral 1-4 energy 'IMPR' - improper planar or chiral energy 'STRB' - Stretch-Bend coupling energy (MMFF) 'OOPL' - Out-of-plane energy (MMFF) 'VDW ' - van der Waals energy 'ELEC' - electrostatic energy 'HBON' - hydrogen bonding energy 'USER' - user supplied energy term 'HARM' - harmonic positional restraint energy 'CDIH' - dihedral restraint energy 'CIC ' - internal coordinate restraint energy 'CDRO' - droplet restraint energy (approx const press) 'NOE' - general distance restraint energy (for NOE) 'SBOU' - solvent boundary lookup table energy 'IMNB' - primary-image van der Waals energy 'IMEL' - primary-image electrostatic energy 'IMHB' - primary-image hydrogen bond energy 'EXTE' - extended electrostatic energy 'EWKS' - Ewald k-space sum energy term 'EWSE' - Ewald self energy term 'RXNF' - reaction field electrostatic energy 'ST2' - ST2 water-water energy 'IMST' - primary-image ST2 water-water energy 'TSM' - TSM free energy term 'QMEL' - Quantum (QM) energy with QM/MM electrostatics 'QMVD' - Quantum (QM/MM) van der Waals term 'ASP' - Atomic solvation parameter (surface) energy 'EHAR' - Restraint term for Implicit Euler integration 'GEO ' - Mean-Field-Potential energy term 'MDIP' - Dipole Mean-Field-Potential energy term 'PRMS' - Replica/Path RMS deviation energy 'PANG' - Replica/Path RMS angle deviation energy 'SSBP' - ??????? (undocumented) 'BK4D' - 4-D energy 'SHEL' - ??????? (undocumented) 'RESD' - Restrained Distance energy 'SHAP' - Shape restraint energy 'PULL' - Pulling force energy 'POLA' - Polarizable water energy 'DMC ' - Distance map restraint energy 'RGY ' - Radius of Gyration restraint energy 'EWEX' - Ewald exclusion correction energy 'EWQC' - Ewald total charge correction energy 'EWUT' - Ewald utility energy term (for misc.
corrections) Energy Pressure/Virial Terms: 'VEXX' - External Virial 'VEXY' - 'VEXZ' - 'VEYX' - 'VEYY' - 'VEYZ' - 'VEZX' - 'VEZY' - 'VEZZ' - 'VIXX' - Internal Virial 'VIXY' - 'VIXZ' - 'VIYX' - 'VIYY' - 'VIYZ' - 'VIZX' - 'VIZY' - 'VIZZ' - 'PEXX' - External Pressure 'PEXY' - 'PEXZ' - 'PEYX' - 'PEYY' - 'PEYZ' - 'PEZX' - 'PEZY' - 'PEZZ' - 'PIXX' - Internal Pressure 'PIXY' - 'PIXZ' - 'PIYX' - 'PIYY' - 'PIYZ' - 'PIZX' - 'PIZY' - 'PIZZ' - Examples: 1. Save the structure with a lower NOE restraint energy. READ COOR CARD UNIT 1 ! Read the first structure READ COOR CARD COMP UNIT 2 ! Read the second structure ENERGY ! Compute energy of first structure SET 1 ?NOE ! save the NOE energy value ENERGY COMP ! Compute the energy of the second structure IF ?NOE LT @1 COOR COPY ! replace first structure if second has ! a lower energy. 2. Write some energy values when saving coordinates .... COOR ORIE RMS MASS ENERGY OPEN WRITE CARD UNIT 22 NAME RESULT.CRD WRITE COOR CARD UNIT 22 * Final coordinates * energy=?ENER and electrostatic energy=?ELEC * mass weighted rms deviation from xray structure is ?RMS * CHARMM Element doc/ewald.doc 1.1  File: Ewald, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax, Previous: Top The Ewald Summation method Invoking the Ewald summation for calculating the electrostatic interactions can be specified any time the nbond specification parser is invoked. See the syntax section for a list of all commands that invoke this parser. 
Prerequisite reading: nbonds.doc * Menu: * Syntax:: Syntax of the Ewald summation specification * Defaults:: Defaults used in the specification * Function:: Description of the options * Discussion:: More general discussion of the algorithm  File: Ewald, Node: Syntax, Up: Top, Next: Defaults, Previous: Top [SYNTAX EWALD] { NBONds } { nonbond-spec } { UPDAte } { } { ENERgy } { } { MINImize } { } { DYNAmics } { } The keywords are: nonbond-spec::= [ method-spec ] { [ NOEWald ] } { } method-spec::= { EWALd [ewald-spec] { [ NOPMewald [std-ew-spec] ] } } { { PMEWald [pmesh-spec] } } ewald-spec::= KAPPa real [erfc-spec] std-ew-spec::= { [ KMAX integer ] } KSQMAX integer { KMXX integer KMXY integer KMXZ integer } pmesh-spec::= FFTX int FFTY int FFTZ int ORDEr integer [QCOR real (***) ] erfc-spec::= { SPLIne { [EWMIn real] [EWMAx real] [EWNPts int] } } { INTErpolate { } } { } { ABROmowitz } { CHEBychev } { EXACt_high_precision } { LOWPrecision_exact } { ERFMode int }  File: Ewald, Node: Defaults, Up: Top, Next: Function, Previous: Syntax The defaults for the Ewald summation are set internally and are currently set to NOEWald, KAPPa=1.0, KMAX=5, KSQMax=27, and NOPMewald, KAPPa=1.0, FFTX=FFTY=FFTZ=32, ORDEr=4, QCOR=1.0 Recommended values for Ewald are: EWALD PMEWald KAPPa 0.34 ORDEr 6 - FFTX intboxvx FFTY intboxvy FFTZ intboxvz - CTOFNB 12.0 CUTNB 14.0 QCOR 1.0(***) Where intboxv* is an integer value similar to or larger than the corresponding unit cell dimension that has prime factors of 2, 3, and 5 only (2 and 3 preferred). The grid point spacing should be between 0.8 and 1.2 Angstroms. These recommended values should give relative force errors of roughly 10**-5. To reduce the total PME cost at the expense of accuracy, the real space cost can be reduced by decreasing the cutoff distances while increasing KAPPa (keep the product KAPPa*CTOFNB near 4). To reduce the K-space cost, either reduce ORDEr from 6 to 4 or increase the grid spacing up to perhaps 1.5 Angstroms.
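The grid-size rule above (prime factors 2, 3, and 5 only, spacing near 1 Angstrom) and the KAPPa*CTOFNB product rule can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of CHARMM; the function names pick_fft_size, is_235_smooth and kappa_for_cutoff are hypothetical helpers introduced here for illustration, and the default target spacing of 1.0 Angstrom is an assumption taken from the middle of the recommended 0.8-1.2 range.

```python
import math


def is_235_smooth(n: int) -> bool:
    """Return True if n has no prime factors other than 2, 3 and 5."""
    if n < 1:
        return False
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def pick_fft_size(box_length: float, target_spacing: float = 1.0) -> int:
    """Smallest grid count >= box_length/target_spacing with factors 2, 3, 5 only.

    box_length and target_spacing are in Angstroms (illustrative helper,
    not a CHARMM routine).
    """
    n = max(1, math.ceil(box_length / target_spacing))
    while not is_235_smooth(n):
        n += 1
    return n


def kappa_for_cutoff(ctofnb: float, product: float = 4.0) -> float:
    """KAPPa chosen so that KAPPa * CTOFNB equals the requested product."""
    return product / ctofnb


if __name__ == "__main__":
    box = 62.0  # hypothetical cubic cell edge in Angstroms
    n = pick_fft_size(box)
    print("FFTX =", n, "spacing =", box / n)
    print("KAPPa for CTOFNB 12.0 =", kappa_for_cutoff(12.0))
```

For CTOFNB 12.0 the product rule gives a KAPPa of about 0.33, close to the recommended 0.34. The resulting integers would then be supplied as FFTX, FFTY and FFTZ.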
(***) The QCOR value should be 1.0 for vacuum, solid, or finite systems. For periodic systems in solution, it should be reduced (or set to zero) by an amount that depends on how the net charge is distributed and on the effective dielectric constant. For a treatise on this correction term, see: S. Bogusz, T. Cheatham, and B. Brooks, JCP (1998) 108, 7070-7084 and references contained therein (esp. Hummer and Levy).  File: Ewald, Node: Function, Up: Top, Previous: Defaults, Next: Discussion i) The EWALD keyword invokes the Ewald summation for calculation of electrostatic interactions in periodic, neutral systems. The formulation of the Ewald summation dictates that the primary system must be neutral. Otherwise, the summation is not formally correct and some convergence problems may result. The NOEWald keyword (default) suppresses the Ewald method for calculating electrostatic interactions. The van der Waals options VSHIFT and VSWITCH are supported with Ewald. The algorithm currently supports the atom and group nonbond lists and the CRYSTAL facility must be used. The PMEWald keyword invokes the Particle Mesh Ewald algorithm for the reciprocal space summation. For details on the PME method, see J. Chem. Phys. 103:8577 (1995). The EWALd algorithm is limited to CUBIC, TETRAGONAL, and ORTHORHOMBIC unit cells. The PMEWald algorithm supports all unit cells that are supported by the CRYSTAL facility. ii) The KAPPa keyword, followed by a real number, governs the width of the Gaussian distribution central to the Ewald method. An approximate value of kappa can be chosen by taking KAPPa=5/CTOFNB. This is fairly conservative. Values of 4/CTOFNB lead to small force errors (roughly 10**-5). See the discussion section for details on choosing an optimum value of KAPPa. iii) The KMAX keyword is the number of kvectors (or images of the primary unit cell) that will be summed in any direction. It is the radius of the Ewald summation.
For orthorhombic cells, the value of kmax may be independently specified in the x, y, and z directions with the keywords KMXX, KMXY, and KMXZ. In the PME version, the number of FFT grid points for the charge mesh is specified by FFTX, FFTY, and FFTZ. iv) The KSQMax keyword should be chosen between KMAX squared and 3 times KMAX squared. v) An appropriate, although not optimal, set of parameters can be chosen by taking KAPPA=5/CTOFNB and KMAX=KAPPa*boxlength. The actual values should then be optimized for performance on your particular system. For the PME method, FFTX should be approximately the box length in Angstroms. (For efficiency, FFTX should be a product of powers of 2, 3, and 5.) IMPORTANT NOTE: THE SUGGESTION THAT FFTX, FFTY, AND FFTZ HAVE NO PRIME FACTORS OTHER THAN 2, 3, AND 5 SEEMS TO BE A REQUIREMENT. LARGE ERRORS IN THE FORCE ARE OBSERVED WHEN THIS CONDITION IS NOT MET. FUTURE VERSIONS OF CHARMM WILL FLAG THIS AS AN ERROR CONDITION. ORDEr specifies the order of the B-spline interpolation, e.g. cubic is order 4 (default), fifth degree is ORDEr 6. The ORDEr must be an even number and at least 4. vi) EWALd runs in parallel on both shared (PARVECT) and distributed memory parallel computers. PME runs in parallel on distributed memory computers. vii) Several algorithms are available for the calculation of the complementary error function, erfc(x). EXACt and LOWPrecision use an iterative technique described in section 6.2 of Numerical Recipes. ABRO and CHEB are polynomial approximations. A lookup table (filled at the beginning of the simulation using the EXACt method) can be used with either a linear (INTE) or cubic spline (SPLIne) interpolation. SPLIne is recommended. viii) Ewald with MMFF A version of EWALD was developed for MMFF.
The usual MMFF electrostatic term: qq/(r+d) is split into two terms: qq/r - qq*d/(r*(r+d)) The first term is handled by the Ewald method in the usual manner (real-space and k-space parts) and the second term is truncated at the cutoff distance using a switching function (from CTONNB to CTOFNB). Since the second term is quite small at the cutoff distance, the use of a switching function should not introduce significant artificial forces.  File: Ewald, Node: Discussion, Up: Top, Previous: Function, Next: Top The Ewald Summation in Molecular Dynamics Simulation The electrostatic energy of a periodic system can be expressed by a lattice sum over all pair interactions and over all lattice vectors excluding the i=j term in the primary box. Summations carried out in this simple way have been shown to be conditionally convergent. The method developed by Ewald, in essence, mathematically transforms this fairly straightforward summation to two more complicated but rapidly convergent sums. One summation is carried out in reciprocal space while the other is carried out in real space. Based on the formulation by Ewald, the simple lattice sum can be reformulated to give absolutely convergent summations which define the principal value of the electrostatic potential, called the intrinsic potential. Given the periodicity present in both crystal calculations and in dynamics simulations using periodic boundary conditions, the Ewald formulation becomes well suited for the calculation of the electrostatic energy and force. If we consider a system of point charges in the unit or primary cell, we can specify its charge density by ro(r) = sum_i [ q_i * delta(r-r_i) ] In the Ewald method this distribution is replaced by two other distributions ro_1(r) = sum_i [ q_i * ( delta(r-r_i) - f(r-r_i) ) ] and ro_2(r) = sum_i [ q_i * f(r-r_i) ] such that the sum of the two recovers the original.
The distribution, f(r), is a spherical distribution generally taken to be Gaussian, the width of the Gaussian dictated by the parameter KAPPa. The charge distributions are situated on the ion lattice positions, but integrate to zero. The potential from the distribution ro_1(r) is a short range potential evaluated in a direct real space summation (truncated at CTOFNB). The diffuse charge distribution placed on the lattice sites reduces to the potential of the corresponding point charge at large r. ro_2(r), being a continuous distribution of Gaussians situated on the periodic lattice positions, is a smoothly varying function of r and thus is well approximated by a superposition of continuous functions. This distribution is, therefore, expanded in a Fourier series and the potential is obtained by solving the Poisson equation. The point of splitting the problem into two parts is that by a suitable choice of the parameter KAPPa we can get very good convergence of both parts of the summation. For the real space part of the energy, we choose kappa so that the complementary error function term, erfc(kappa*r), decreases rapidly enough with r to make it a good approximation to take only nearest images in the sum and neglect the terms for which r > CTOFNB. The reciprocal space sums are rapidly convergent and a spherical cutoff in k space is applied so that the sum over k becomes a sum over {l,m,n}, with (l**2+m**2+n**2) <= KSQMAX. A large value of KAPPa means that the real space sum is more rapidly convergent but the reciprocal space sum is less rapid. In practice one chooses KAPPa to give good convergence at the cutoff radius, CTOFNB. KMAX is then chosen such that the reciprocal space calculation converges. The equation KMAX/(box length) = KAPPa may be used as a rough guide. Optimization with respect to the timing trade-offs, i.e., how much time is spent in real space vs k-space, should be performed before a lengthy production run.
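The rough guides quoted in this document (KAPPa = 5/CTOFNB, KMAX = KAPPa * box length, KSQMAX between KMAX**2 and 3*KMAX**2) can be put together as a starting point for the optimization. The following Python sketch is illustrative only and is not CHARMM code; the function names are hypothetical, and the k-vector count simply enumerates the full {l,m,n} cube, which need not match how CHARMM loops over k-space internally.

```python
import math


def ewald_rough_guide(ctofnb: float, box_length: float,
                      kappa_times_cutoff: float = 5.0) -> dict:
    """Starting values from the rules of thumb quoted in the text."""
    kappa = kappa_times_cutoff / ctofnb
    kmax = math.ceil(kappa * box_length)
    return {
        "kappa": kappa,
        "kmax": kmax,
        # KSQMAX should lie between KMAX**2 and 3*KMAX**2
        "ksqmax_range": (kmax ** 2, 3 * kmax ** 2),
        # size of the neglected real-space factor erfc(kappa*r) at r = CTOFNB
        "erfc_at_cutoff": math.erfc(kappa * ctofnb),
    }


def count_kvectors(kmax: int, ksqmax: int) -> int:
    """Number of non-zero {l,m,n} with |l|,|m|,|n| <= kmax inside the sphere."""
    count = 0
    for l in range(-kmax, kmax + 1):
        for m in range(-kmax, kmax + 1):
            for n in range(-kmax, kmax + 1):
                k2 = l * l + m * m + n * n
                if 0 < k2 <= ksqmax:
                    count += 1
    return count


if __name__ == "__main__":
    guide = ewald_rough_guide(ctofnb=10.0, box_length=30.0)
    print(guide)
    print(count_kvectors(5, 27))
```

Raising kappa_times_cutoff shrinks erfc_at_cutoff (a better converged real-space sum) but raises KMAX, which is the trade-off described above.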
Several articles in the CCP5 notes in 1993 cover some possible optimization strategies and criteria, although a simple line search will suffice. Complete optimization of the Ewald method for a particular application requires optimizing CTOFNB, KAPPa, and KMAX. A discussion of optimization and error analysis can be found in Kolafa and Perram, Molecular Simulation, 9, 351 (1992). For PME, see Feller, Pastor, Rojnuckarin, Bogusz, and Brooks. J. Phys. Chem., 100, 42, 17011 (1996) and some of Tom Darden's published work. CHARMM Element doc/fourd.doc 1.1  File: Fourd, Node: Top, Up: (chmdoc/commands.doc), Next: Syntax 4 Dimension dynamics: Description and Discussion The energy embedding technique entails placing a molecule into a higher spatial dimension {Crippen,G.M. & Havel,T.F. (1990) J.Chem.Inf.Comput.Sci. Vol 30, 222-227}. The possibility of surmounting energy barriers with these added degrees of freedom may lead to lower energy minima. Here, this is accomplished by molecular dynamics in four dimensions. Specifically, another Cartesian coordinate was added to the usual X, Y, and Z coordinates in the LEAPfrog VERLet algorithm. To employ 4D energy embedding, the energy function and force field in CHARMM were modified to include fourth dimension coordinates. An additional harmonic energy function has been included to control the extent to which a molecule is embedded. This is done quantitatively by altering the value of its force constant, initially given by the parameter K4DI. The 4D energy embedding procedure can be broken down into three parts: 4D coordinate generation, relaxation, and back projection. Fourth dimensional coordinates can be generated in several ways. An energy, E4FILL, in the fourth dimension can be specified, with random coordinates generated so as to sum up to the 4D harmonic energy that a user specifies (i.e., E4FILL 50.0 will give coordinates such that the total sums to approximately 50.0 kcal).
This method may seem a bit abrupt since a molecule is suddenly "thrown" into a higher dimension; hence, molecular dynamics can be used to allow a molecule to obtain fourth dimension coordinates more slowly. This is done by specifying an initial 4D temperature, FSTT4, with subsequent velocities assigned accordingly. Finally, both these methods may be applied simultaneously. Relaxation involves allowing the molecule to explore the potential energy surface and is essentially equilibration. Alternatively, minimization in 4D can be done with the steepest descent algorithm followed by 4D dynamics. Now all that remains is to project this structure back into three dimensions. This last step is thus termed the back projection and is achieved by increasing the fourth dimensional force constant linearly from its initial value of K4DI to MULTK4*K4DI step-wise over the period INC4D to DEC4D. This results in a stronger force, confining the 4th dimension coordinates to smaller values (i.e., eventually back to 3D). A problem inherent in the final step of 4D energy embedding is that "sometimes all projections lead to a bad final conformation" {Crippen,G.M & Havel,T.F.(1990)J.Chem.Inf.Comput.Sci.Vol 30,222-227}. Thus, the structure is rotated into its principal axes of inertia (center of mass) both before and after its back projection. When this step is applied the message ROTATION APPLIED TO PRINCIPAL AXES will appear. Dynamc4.src is essentially dynamc.src in 4 dimensions. Note that even though qeuler still exists in dynamc4.src it has not yet been tested. Also, the usual shake algorithm will only be applied to 3-dimensional space.
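The back projection schedule described above can be written down in a few lines. The following Python sketch is illustrative only and is not taken from dynamc4.src; the function names are hypothetical, the ramp is assumed to be a plain linear interpolation evaluated at every step, and the harmonic form (1/2)*K*W**2 summed over atoms (with a zero equilibrium value) is assumed for the fourth dimension energy.

```python
from typing import Iterable


def k4d_at_step(step: int, k4di: float, multk4: float,
                inc4d: int, dec4d: int) -> float:
    """Force constant at a given step, ramped from K4DI to MULTK4*K4DI.

    Before INC4D the initial value is used; after DEC4D the final value.
    If INC4D == DEC4D (e.g. both left at NSTEP) no ramp takes place
    within the run.
    """
    if step <= inc4d:
        return k4di
    if step >= dec4d:
        return multk4 * k4di
    fraction = (step - inc4d) / (dec4d - inc4d)
    return k4di * (1.0 + (multk4 - 1.0) * fraction)


def harmonic_4d_energy(k4d: float, w_values: Iterable[float]) -> float:
    """Assumed harmonic 4th dimension energy: sum over atoms of (1/2)*K*W**2."""
    return 0.5 * k4d * sum(w * w for w in w_values)


if __name__ == "__main__":
    # Values from a restart run with K4DI 50.0 INC4D 0 DEC4D 15000 MULTK4 10.0
    for step in (0, 7500, 15000):
        print(step, k4d_at_step(step, 50.0, 10.0, 0, 15000))
```

As the force constant grows, the same set of W values costs more energy, which is what drives the fourth dimension coordinates back toward zero.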
* Menu: * Syntax:: Syntax of the 4 dimension dynamics command * Description:: Description of the keywords and options * Recommended:: Recommended input options and values * Discussion:: Running 4 dimension dynamics * Output:: Output from a 4 dimension dynamics run  File: Fourd, Node: Syntax, Up: Top, Previous: Top, Next: Description Syntax for the Dynamics Command DYNAmics { [LEAPfrog] } VER4 {STRT } {[TIMEstp real]} [NSTEp integer] - { [LANGevin] } {STARt } {[FIL4dimension]} {RESTart} {[SKBOnd]} {[SKANgle]} {[SKDIhedral]} {[SKVDerWaals]} {[SKELectrostatics]} four dimension-spec nonbond-spec hbond-spec frequency-spec - unit-spec temperature-spec options-spec hbond-spec::= updated as in normal LEAPfrog VERLet. nonbond-spec::= updated in 4 dimensions. four dimension-spec::= [K4DInitial real] [INC4Dforce integer] [DEC4Dforce integer] [MULTK4di real] [E4FILLcoordinates real] frequency-spec::= [INBFrq integer] [IEQFrq integer] [IHBFrq integer] [IHTFrq integer] [IPRFrq integer] [NPRInt integer] [NSAVC integer] [NSAVV integer] [NTRFrq integer] [ILBFrq integer] [ISVFRQ integer] [IEQ4 integer] [IHT4 integer] unit-spec::= [IUNCrd integer] [IUNRea integer] [IUNVel integer] [IUNWri integer] [KUNIt integer] [CRAShu integer] [BACKup integer] temperature-spec::= [FINAlt real] [FIRStt real] [TEMInc real] [TSTRuc real] [TWINDH real] [TWINDL real] [FNLT4 real] [FSTT4 real] [TIN4 real] [TWH4 real] [TWL4 real] options-spec::= [IASOrs integer] [IASVel integer] [ICHEcw integer] [ISCAle integer] [ISCVel integer] [ISEEd integer] [SCALe real] [NDEGg integer] [RBUFfer real] [AVERage] [ECHEck real] [TOL real] [ICH4 integer]  File: Fourd, Node: Description, Previous: Syntax, Up: Top, Next: Recommended Options common 4D dynamics & minimization The following table describes the keywords which apply to only four dimension dynamics & minimization. The remaining parameters are described in dynamc.doc and minimiz.doc. 
FOURdimensions [INC4d int] [DEC4d int] [K4DI real] [MULTK4 real] - [ SKBO ] [ SKAN ] [ SKDI ] [ SKVD ] [ SKEL ] [ SKCO ] - [FIL4 [E4FILL real ] ] [ SHAKe ] Keyword Default Purpose INC4D NSTEP The step number (specifically, the time in a dynamics run) at which the back projection from 4 to 3 dimensions will begin. Note the default value of NSTEP will result in no back projection. DEC4D NSTEP The step number at which the back projection from 4 to 3 dimensions will end. K4DI 50.0 The initial force constant for the 4th dimensional harmonic energy term. MULTK4 1.0 The factor by which K4DI will increase linearly from INC4D to DEC4D. FSTT4 FIRSTT The initial temperature, in the 4th dimension, at which the velocities have to be assigned to begin the dynamics run. If an equal amount of kinetic energy is needed in all 4 dimensions, the default value should be used. This is because the velocities are all assigned independently in accordance to the initial temperature. FNLT4 FINALT The desired final (equilibrium) temperature, in the 4th dimension, for the system. A final temperature of zero degrees is recommended during a back projection (from INC4D to DEC4D). IEQ4 IEQFRQ The step frequency for assigning or scaling the 4th dimension velocities to FNLT4 temperature during the equilibration stage of the dynamics run. IHT4 IHTFRQ The step frequency for heating the molecule in the 4th dimension, in increments of TIN4 degrees in the heating portion of a dynamcis run. TIN4 TEMINC The temperature increment to be given to the system every IHT4 steps. Important in the 4th dimension heating stage. TWH4 TWINDH The temperature deviation from FNLT4 to be allowed on the high temperature side. Used only during 4th dimension equilibration. TWL4 TWINDL The temperature deviation from FNLT4 to be allowed on the low temperature side. Used only during 4th dimension equilibration. 
ICH4 ICHECW The option for checking to see if the average 4th dimension temperature of the system lies within the allotted temperature window (between FNLT4+TWH4 and FNLT4-TWL4) every IEQ4 steps. FIL4 The flag to fill the 4th dimension coordinates. The harmonic energy potential of these coordinates will sum to E4FILL. If not present (recommended), the 4th dimension coordinates are set to zero and the system will 'go into the 4th dimension' as a result of their initial velocities. E4FILL 0.0 The total harmonic potential energy from which the initial 4th dimension coordinates will be calculated. Only used when the flag FIL4 is present. SKBO Flag to skip 4th dimension bond energies (i.e.only compute bond energies in 3 dimensions). SKAN Flag to skip 4th dimension angle energies. SKDI Flag to skip 4th dimension proper dihedral energies. SKVD Flag to skip 4th dimension Van der Waals energies. SKEL Flag to skip 4th dimension electrostatic energies. SKCO Flag to skip 4th dimension restraint (so restraining Forces are calculated in 3D only). SHAKe Command to place all 4D W's into same W every iteration (NOTE:energy not conserved). The 4D forces are not normally mass weighted, but if SHA4 is used then they are. Maybe it should be a 4D option in the future. Other Commands: CONS FIX4 ... Used in analogy to the FIX command to FIX 4th D coordinates with CONS (meaning one can FIX something in 3D only). SCALar FDEQ (0.0) The equilibrium value(s) that the 4th D function will use as the center of the harmonic. Used for restraining the 4th D to non zero values (i.e. forcing a system into the 4th Di). It should be set with the SCALAR option for individual atoms (if one wants to set different atoms into different 4th D coordinate minima). (1/2)*K4d*W**2, where W=FDIM(I)-FDEQ(I) SCALar FDIM (0.0) The coordinate(s) (in analogy to X,Y, & Z) of the 4th D. It should be set with the SCALAR option for individual atoms (if one wants to set different atoms into different 4th D coordinates).  
File: Fourd, Node: Recommended, Previous: Description, Next: Top, Up: Top Recommended CHARMM input for 4d dynamics 1) Beginning with a 3d structure and no 4d coordinates, a structure is equilibrated in 4d and then back projected (forced back) to 3d. DYNAMCS LEAP VER4 START K4DI 50.0 NSTEP 20000 - TIMESTEP .001 FSTT4 300.0 FNLT4 300.0 CUTBN 8.0 - IHTFRQ 0 IEQFRQ 100 IEQ4 100 NPRINT 10 - IUNREA -1 IUNWRI 16 - IHBFRQ 25 FIRSTT 1000.0 FINALT 1000.0 TEMINC 0.0 TIN4 0.0 DYNAMCS LEAP VER4 RESTART NPRE 0 NSTEP 15000 - K4DI 50.0 INC4D 0 DEC4D 15000 MULTK4 10.0 - TIMESTEP .001 FSTT4 300.0 FNLT4 300.0 CUTBN 8.0 - IHTFRQ 0 IEQFRQ 100 IEQ4 100 NPRINT 10 - IUNREA 16 IUNWRI 17 - IHBFRQ 25 FIRSTT 1000.0 FINALT 100.0 TEMINC 3.0 TIN4 1.0 2) Beginning with a 4d structure with 10.0 Kcal initially in the 4th dimension. DYNAMCS LEAP VER4 START K4DI 50.0 NSTEP 20000 - FIL4 E4FILL 10.0 - TIMESTEP .001 FSTT4 300.0 FNLT4 300.0 CUTBN 8.0 - IHTFRQ 0 IEQFRQ 100 IEQ4 100 NPRINT 10 - IUNREA -1 IUNWRI 16 - IHBFRQ 25 FIRSTT 1000.0 FINALT 1000.0 TEMINC 0.0 TIN4 0.0 3) Fixing the 4th D coordinates of some bulk solvent and setting the solute coordinates "out" in 4D space and along with its equilibrium value. Following this the energy is determined. CONS FIX4 SELE SEGID BULK END SCALAR FDIM SET 10.0 SELE SEGID SOLV END FOUR K4DI 50.0 SKBO SKAN SKDI SKCO ENERGY CHARMM Element doc/galgor.doc $Revision: 1.1 $  File: Galgor, Node: Top, Up: (chmdoc/commands.doc), Next: Implementation Galgor: Commands which deal with Genetic Algorithm and Monte Carlo. # Michal Vieth,H. Daigler, C.L. Brooks III -Dec-15-1997 Initial release. The commands described in this node are associated with genetic algorithm module for conformational searches and docking of small ligands to rigid proteins. The full description of the GA features is presented in the paper "Rational approach to docking. 
Optimizing the search algorithm" * Menu: * Implementation:: A brief description of the anatomy of GA * Syntax:: Syntax of the replication commands * Description:: Description of key words and commands usage * Restrictions:: Restrictions on usage * Examples:: Supplementary examples of the use of GA  File: Galgor, Node: Implementation, Up: (chmdoc/commands.doc), Next: Syntax Genetic Algorithm and Monte Carlo: Description and Discussion Name Keyword Module GA setup GALGOR SETUP genetic.src Genetic algorithm GALGOR EVOLVE genetic.src, genetic2.src Monte Carlo GALGOR EVOLVE MCARLO genetic.src, genetic2.src This code was created by Michal Vieth, Heidi Daigler and Charles Brooks III at The Scripps Research Institute during the summer/fall of 1997 based on the code provided by Charles Brooks and Heidi Daigler, Department of Chemistry, Carnegie Mellon University developed during the summer of 1994. Its purpose is to enable Monte Carlo and genetic algorithm based conformational searches to be performed on peptides/proteins, small organic molecules and docking of (small) ligands to their receptors. It builds upon the replica ideas of Leo Caves to make multiple copies of the system, i.e., the chromosomes. These chromosomes make up a population of molecular conformations which are crossed and mutated according to the specific genetic algorithm used. It is recommended to use a generational update version of the genetic algorithm with elitism 1 or 2, and island/niching ideas to evolve subpopulations independently. Each chromosome is represented by the set of internal coordinates (and rigid body degrees of freedom if desired) deemed to "evolve", i.e., the genes. The genes are represented as real numbers. The chromosomes are crossed at a randomly chosen gene or random genes are mutated with a given rate of mutation. In addition, migration of individuals between subpopulations is used if the island model is employed.
From each population, the members suitable for crossing, the
"parents", are chosen based upon their ranking, according to a
preference, chosen a priori, for selecting higher ranked individuals.
An evolutionary pressure of 1.2, based on scaled fitness following
Jones, JMB 245, 43-53, is used for parent selection.

The potential energy function used can be all of CHARMM's energy
terms, including constraints, or any subset of them. The user energy
term can be utilized to include any special functional form which is
particularly amenable to the GA search. The current implementation
uses the standard CHARMM energy functions. For docking, the essential
part of the energy function is the use of a soft core VDW potential
that permits ligands to diffuse into the protein. The protein in
docking is rigid, and at this time it is not possible to include
partial flexibility.

The Monte Carlo scheme uses the same engine; the only difference is
that the members of the entire population become completely separate
entities which do not exchange any information. Their evolution by
simulated annealing is similar to MCSS molecular dynamics. The entire
approach to docking and a comparison of the performance of GA/MD/MC
are described in detail in two back to back papers submitted to
J.Comp.Chem., "Rational approach to docking I and II."

In the initial set-up stage, the GALGorithm keyword is parsed and the
main subroutine for the genetic algorithm is called from CHARMM
(GENETIC). The keyword SETUp causes a number of things to occur.
First, the number of chromosomes (replicas) and the atoms whose
internal coordinates are to be sampled and replicated are parsed.
Following this, the specific IC variables which are to be "evolved"
are selected by the VARIable_ICs sub-parser. This is carried out in a
separate routine which is modeled after the routine used for
processing the IC edit command.
Finally, the replica subroutine from Leo Caves is called to replicate
the system, and the original subsystem atoms are deleted by a call
equivalent to delete atoms select segid orig end. Each new segment
generated by the replica command is given the prefix C (for
chromosome) followed by the number of that replica (chromosome). The
appropriate pointer structures for inclusion/exclusion of dihedral,
angle and bond ICs are then created and control is passed back to the
CHARMM level, where specific manipulations can be performed on the
individual chromosomes, e.g., to vary initial conformations around the
initial progenerator of all chromosomes.

Evolution of the population of chromosomes is controlled by calling
GALGorithm with the EVOLve keyword from CHARMM. This uses a parser
which sets up the control variables for the genetic algorithm
evolution and then runs the genetic evolution. The specific functions
performed by this portion of the code are:

1. Parse parameters controlling the GA evolution.

GA follows the scheme:

2. Evolve the population of chromosomes. This involves the following
   steps:
   a) Evaluate the energy of the population at step 0.
   b) Rank the population members in accordance with their viability.
   c) Choose parents to develop the next generation based on roulette
      wheel selection according to scaled fitness.
   d) Mate parents by crossover or random mutations in genes.
   e) Reconstruct (via an IC build-like procedure) the cartesian
      positions of the new generation from the modified ICs and, if
      applicable, from the modified values of the rigid degrees of
      freedom.
   f) Evaluate energies of the new generation.
   g) If the elitist model is being used, transfer the fittest parent
      into the new generation. If steady state/generational update is
      used, replace the least fit parents by children. If elitism is
      used, retain some of the most fit parents.
      If the evolutionary strategy is used, pick the best members to
      form a new population.
   h) Begin again at step b), cycle through to convergence or
      MAXGenerations or NEVALuations.

Monte Carlo proceeds via the following scheme:

2. Evolve the population of chromosomes. This involves the following
   steps:
   a) Evaluate the energies of each replica.
   b) Generate new replicas by mutations.
   c) Reconstruct (via an IC build-like procedure) the cartesian
      positions of the new generation from the modified ICs and, if
      applicable, from the modified values of the rigid degrees of
      freedom.
   d) Evaluate energies of the new replicas.
   e) Accept new replicas with Boltzmann probability.
   f) Begin again at step b), cycle through MAXGenerations or
      NEVALuations.


File: Galgor, Node: Syntax, Up: Top, Previous: Top, Next: Description

            Syntax for the Genetic ALGORithm command

GALGOR setup:

{[SETUP] [CHROmosomes int] [atom selection] -
    {[VARIable_IC] -
        {[DIHEdral] [IMPRO] [INCLude] [4x

ALL     Within VARI specifies the selection of dihedral angles to be
        used as active variables. May be followed by DIHE DEPE.

DIHE DEPE OFFS value
        Within VARI specifies the dihedral that can be computed from
        the value of the dihedral defined immediately before by adding
        the OFFSet value. This specifies dihedral dependency if two or
        more quartets of atoms describe rotation about the same