DelPhi V3.0 -1- 15555552DelPhi- A Macromolecular Electrostatics Modelling Package DelPhi is a software package which calculates electrostatic potentials in and around macromolecules. It incorporates the effects of ionic strength through the (non)linear Poisson- Boltzmann equation. Any value for the dielectric constant of the molecule and solvent may be specified. Periodic boundary conditions can be used to model long periodic molecules such as DNA. The output from the programs can be used to calculate interactions, changes in pKa, solvation energies and many other properties of interest. Potentials may be displayed as 3-Dimensional isopotential contours, 2- Dimensional contour slices or as color coded molecular surfaces DelPhi V3.0 -2- Version DelPhi, Version 3.0, May 1989 Authors: Kim A. Sharp, Anthony Nicholls Dept. of Biochemistry and Molecular Biophysics, Columbia University, 630 W168th St. New York, NY 10032 Phone: (212) 305-6884 Bitnet: SHARP@CUMBG Internet: sharp@128.59.98.1 nicholls@128.59.98.1 Copyright This software forms version 3 of a package called DELPHI, which is copyrighted by COLUMBIA UNIVERSITY. INSIGHT is a tradename of Biosym Corp. Permission to use this software is given on the understanding that it will not be distributed without consulting Professor Barry Honig, in the Dept. of Biochemistry at Columbia, (212)-305-7970. Other copyright laws may also apply. For further information contact Barry Honig. References The following references may be useful for any use of this software in research that will be published. In particular references 1, 3 and 4 if read in detail will provide a basic tutorial in the way we do electrostatics, and provide a guide to how various electrostatic quantities can be calculated. 1) Klapper, I., Hagstrom, R., Fine, R., Sharp, K., Honig, B. (1986). Focusing of electric fields in the active site of Cu-Zn Superoxide Dismutase: Effects of ionic strength and amino-acid modification. Proteins 1, p47. 2) K.A. Sharp, M.K.Gilson, R.M.Fine and B.H. Honig. (1987). Electrostatic interactions in proteins. UCLA Symposium on Molecular and Cell Biology, Vol 69: Protein Structure, Folding and Design, Ed. D.L. Oxender, p235. 3) Gilson, M., Sharp, K., Honig, B. Calculating electrostatic interactions in bio-molecules: Method and error assessment. J. Computational Chem. 9, pp327-335. 4) Gilson, M., Honig, B. Total Electrostatic Energy of DelPhi V3.0 -3- a Protein. Proteins, 4, p7 (1988). 5) B. Jayaram, K.A.Sharp and B.H.Honig. The electrostatic potential of B-DNA. Biopolymers, 28, p975 (1989). 6) K. Sharp, and B. Honig. Lattice Models of Electrostatic Interactions: The Finite Difference Poisson-Boltzmann Method. Chemica Scripta, In Press. 7) K. Sharp, and B. Honig. Electrostatic Interactions in Macromolecules: Theory and Applications. Ann. Rev. Biophys. Biopys. Chem. 19:301-32 (1990). This review has references to all known applications of the finite-difference Poisson- Boltzmann method to macromolecules at the time of writing. The original reference to the use of the finite difference method for macromolecular electrostatics is: J. Warwicker and H.C. Watson, J. Mol. Biol., 157, p671 (1982). DelPhi V3.0 -4- _1. INTRODUCTION This is a collection of programs to take a brookhaven data base format coordinate file of a molecule and calculate the potential in and around the molecule, using a finite differ- ence solution to the non-linear POISSON-BOLTZMANN equation for any given ionic strength, solvent and molecule dielec- trics. Briefly, it consists of the following steps, explained in more detail below: _1._1. Prepare three files containing: a) run parameters b) atomic radii c) atomic charges. _1._2. Modify the brookhaven format atomic coordinate file as necessary: eg. if a full partial charge set on all atoms is required it will be necessary to build in hydrogens using a suitable modelling program. Also if the output of potentials at various specified atoms are required, a second atomic coordinate file may be needed. _1._3. Assign logical units to files. _1._4. Run the program QDIFF (finite difference algorithm), in batch or interactively directing the output to unit 6 into a log file if required. QDIFFX is a much faster algo- rithm which supersedes QDIFF, except when the non-linear option is required. QDIFFXS is the latest version of QDIFFX, which provides a more accurate representation of the surface, hence gives better solvation energies, particularly at small scales (less than 2 grids/angstrom). If QDIFFXS is not included it should be available soon (as of 2 july 1990). _1._5. Analyse the results. Two things that are commonly required are to read out and display the potentials. Several programs are provided to do this: A conversion utility PHITOINS will convert the poten- tial map output into a format that can be read by INSIGHT, Biosym's molecular modelling package. Alter- natively, QDIFF can be made to output INSIGHT compati- ble files directly. The program PHITOMBK will convert the potential map output into Peter Goodford's brickmap format which can be read for example by QUANTA or FRODO to produce DelPhi V3.0 -5- 3D contour displays of the potential. The program PHITOPDB will write the potentials from the DelPhi output into the B factor, or temperature factor field of a brookhaven data base format coordinate file. The atoms may then be displayed and colored by poten- tial by any display program that can color atoms by B, or Temperature factor, eg QUANTA. _1._6. Transferring files. The primary output file from the program is a three dimen- sional array of potentials calculated at the lattice points. This is a large file (65**3) and is written in binary to save space and time. In some cases it will be necessary to transfer files between different machines. This can be done in two ways: The program ASCIIPHI converts a binary potential map file to ascii or vice-versa. The ascii form may be reconverted to the binary form on the receiving machine. Some truncation of potentials and loss of pre- cision occurs during the conversion so these files should only be used for display purposes. The program MBKBIN converts a binary brickmap file to ascii or vice-versa. The ascii form may be reconverted to the binary form on the receiving machine. Some trun- cation of potentials and loss of precision occurs dur- ing the conversion so these files should only be used for display purposes. Note that the real world enters the picture through three data files: 1) atom coodinates, in file.pdb; 2) atom radii, in file.siz; 3) atom charges, in file.crg. _1._7. DIRECTORIES Upon installation, a directory structure should be created in order to group files in some kind of order. The direc- tories should be arranged as: parent_directory/ bin/ all executables and script files data/ atomic radii and charge data files docs/ documentation examples/ ex1/ protein potentials: SUPEROXIDE DISMUTASE ex2/ test results: SPHERE ex3/ solvation energy: ACETATE ex4/ interaction energy: pK change in SUBTILISIN ex5/ non-linear case: DNA potentials DelPhi V3.0 -6- ex6/ intrinsic pKa calculation on rat trypsin ex7/ binding/packing energy on haemerythrin source/ gcs/ Graphics library cont2d/ 2-D contour program qdiff_v2/ original finite difference program (does non-linear) qdiffx_v3/ faster version of qdiff (~60 times). may not do all cases of non-linear, ie high charge, small scale utility/ Conversion programs, etc DelPhi V3.0 -7- _2. INSTALLATION Details will vary from system to system. This procedure is based on that for a CONVEX C1-XP machine running Unix. _2._1. TAPE/DIRECTORY STRUCTURE Copy all the files on the tape to a parent directory: Unix: % tar -xv Vax/VMS $ MOUNT/RECORDSIZE=80/BLOCKSIZE=1600 MUA0: ANSTAR (for Ansi tape) $ COPY/LOG MUA0:*.* *.* Vax/VMS $ MOUNT/FOREIGN MUA0: (for VMS ver- sion) $ BACKUP/LOG MUA0:*.* *.* _2._2. COMPILING EXECUTABLES In each source directory a Makefile exists (for VAX/VMS: *.make) which shows the dependencies and compilations neces- sary to produce the executables. All the makefiles are written for the convex c2. Equivalent makefiles for the Sil- icon Graphics Iris are in makefile.iris. Note also each Makefile has a copy in a unique named file suffix .cnvx in case any Makfile gets accidently overwritten by another dur- ing any file movement. Small changes to these files will be necessary for different systems. These differences include: - Any full path names referenced in the makefiles will have to be changed to those of your local system. - The name of the fortran compiler (eg Convex: fc, Cray: cft77, Iris: f77). - The -Olimit option (eg for the SGI Iris) may have to be changed depending on the memory size of your machine). - Flags for the fortran compiler (eg Convex optimization flags -O1, -O2 may not exist on non-vector machines); - Remove link to the Convex vectorization routines in /lib/libveclib, which of course won't exist on a non- vector machine; - Existence of various intrinsic functions such as cpu- time, which returns elapsed cpu time, and lnblnk, which returns the position of the last non-blank character in a string. A fortran version of some of these functions DelPhi V3.0 -8- is given in source/utility. The file nmptch.f provides intrinsic function renaming for different systems by calling dummy functions. This is handled by linking the programs to the library source/utility/kimlib.a. In particular the bit diddling routines iibset and ibtest, in routines wrteps.f, lookeps.f and reform.f may have to be renamed for your particular system. In each source directory compile the executable by typing: make Start with the utility directory, and MAKE the library kimlib.a. Next, if you want to use the 2-D contour program CONTOUR2D, make in the directory gcs as the graphics library is needed to link to CONTOUR2D. Note that all executables are moved to parent_directory/bin. To run these, this directory should be included in your path in the .login file, or the files moved to another bin directory. Note that the makefile for the utilities makes more than one execut- able, so type: make to make the utility library first, then: make program_name for each program _2._3. SCRIPT FILES The directory bin contains various Cshell scripts. assqdiffx generates a parameter file for QDIFFX, and makes logical unit assignments for QDIFFX. The script qdefault takes a *.pdb file name, uses default values for everything and runs QDIFFX, then invokes 2 other script files, mkcont and sur- face. QDEFAULT is really aimed at getting you up and running quickly by providing a complete run through using defaults for everything- qdefault, mkcont and surface in effect pro- vide template scripts which you can use to modify and learn how the programs work. The path name in qdefault will have to be changed to that on your system before running it. The script may need extensive modification (for other shells like Bourne, other versions of Unix), and only run under Unix of course. Other script files are: SURFACE, which start with an *.eps map, reformats it with REFORM, and con- tours it with BIOSYM's CONTOUR program (provided with INSIGHT) to produce a *.grd file. When displayed in INSIGHT this provides a nice representation of the molecule's sol- vent accessible surface; MKCONT starts with a *.phi file, and generates potential contours at +/- 0.25, 0.5, 1., 2., 4 kT/e (levels read in from file continp1) using CONTOUR. DelPhi V3.0 -9- Output is in a *.grd file which can be displayed in INSIGHT; _2._4. EXAMPLES The best way to check the installation is by working through the examples (see note on script file QDEFAULT above). In particular, it is recommended that example 2, on charges in a sphere be worked through carefully, and the results com- pared to those in section 9.2: The logfiles and *.frc output files should be compared with those provided from the test run on the convex (*cvx.log, *cvx.frc). This will provide a good check that the program is working, and also is a tutorial on how delphi can be used to calculate the various components of an electrostatic system. It is no exaggeration to say that if you understand the case of two charges in a sphere then that is 90% of continuum electrostatics! The input files, and the output files that are not in binary form (ie not *.phi, *.eps) are provided in examples/ex1...ex7. Explanation and analysis is also pro- vided in Section 9. Plot files that were generated from some of the *.phi potential map files are given in *.plt. Type (or cat) these out to a Hewlett Packard HP7475A or similar plotter to give plots that can be compared to those gen- erated by CONTOUR2D. These examples are taken from pub- lished papers that used this or similar forms of DelPhi, and these should be referred to for more details. The calcula- tions are typical of those that can be made with this pack- age, although for the purpose of illustration, several shortcuts were taken. Also read section 10, Hints. _2._5. PORTABILITY This package is written in Fortran 77 compatible code, except for the Graphics programs: gcshp.f, contour2d.f which use VAX/VMS extensions to Fortran 77 (notably enddo con- structions), as implemented on the Convex C-1 running Unix (CU). Differences between this and VAX/VMS (VV) include: - the bit diddling intrinsic functions. eg: VV: btest, CU: bitest A fortran version of the bit test function, bitest, and the bit set function, kibset, are given in nmptch.f in the utility directory. - the include statement: VV: include 'qdiffpar.h', CU: #include "qdiffpar.h" - the intrinsic function: VV: secnds, CU: cputime - source lines may extend past column 72, so compile option /EXTEND should be used on VV DelPhi V3.0 -10- - hash functions ichash.f and irhash.h allow a possible integer overflow on the hash variable n rather than explicit truncation, so should be compiled with /NOOVERFLOW option on VV. DelPhi V3.0 -11- _3. MAKEFILES These example makefiles illustrate the dependencies of vari- ous files that need to be compiled to produce the executable programs QDIFF, CONTOUR2D, and the the utility programs, and provide a guide to installation on Unix and other systems. This section presumes a familiarity with the Unix utility make. Actual makefiles used on your system will differ in details such as directory path names, compiler names, com- piler options etc. It should be straight forward to write analogous command files for other sytems, eg VMS. ********************** # Example Makefile for contour2d # Places output in directory /honig1/poisson/bin. This # destination directory name will vary from machine to machine # Source is compiled at first level of optimization (on a # Convex. Optimization and compiler options will vary from # machine to machine). It uses gcs library, which must have # been previously compiled (see following makefile) ********************** FFLAGS = -O1 DEST = /honig1/poisson/bin OBJ = contour2d.o konturs.o geom.o draw.o plot.o misc.o slice.o potential.o surface.o setup.o lines.o LIB = /honig1/poisson/gcslib # .f.o: fc -c $(FLAGS) $*.f # c2d: $(OBJ) $(LIB) fc $(FLAGS) $(OBJ) $(LIB) mv a.out $(DEST)/contour2d $(OBJ): contour2d.h ********************************************* Makefile for gcs plot routine library gcshp.a ********************************************* FFLAGS = -O1 OBJ = gcshp.o ngcshp: $(OBJ) fc $(FFLAGS) -c $(OBJ) ar vur gcshp.a gcshp.o ranlib gcshp.a ar vt gcshp.a ****************** Makefile for qdiffx ****************** DelPhi V3.0 -12- KIMLIB = /honig1/poisson/delphi_v2/source/utility/kimlib.a DEST = /honig1/poisson/bin VECLIB = /usr/lib/libveclib.a OBJ4 = qdiff4o.o elb.o up.o cent4.o cfind4.o ichash4.o irhash4.o rent4.o rfind4.o phintp4.o scaler4.o chrgit4.o setbc4.o conplt.o itit4k.o expand4.o mkeps4c.o wrteps4.o relfac4b.o rfe.o cse.o chrgsub4.o mem4.o chrg1sub.o non.o # # #----------------------------- # rules- cray: #.f.o: # cft77 $*.f #----------------------- # rules- iris: #.f.o: # fc77 -c $*.f #----------------------- # rules- convex, to f77 standard add flag -cfc: .f.o: fc -c -ep 2 -O3 $*.f #----------------------- # qdiffx #----------------------- qdiffx: $(OBJ4) fc $(OBJ4) $(KIMLIB) $(VECLIB) mv a.out $(DEST)/qdiffx $(OBJ4): qdiffpar4.h #------------------------------------ # ****************************** Makefile for utility programs ****************************** # BIN = /honig1 /poisson /bin KIMLIB = /honig1 /poisson /delphi_v2 /source /utility /kimlib.a #----------------- # rules- cray .f.o: cft77 $*.f #----------------- # rules- convex, to f77 standard add flag -cfc .f.o: fc -c -O1 $*.f #----------------- #FFLAGS = -no -db -cs FFLAGS = -O1 lookeps: lookeps.o lnblnk.o fc $(FFLAGS) -o lookeps lookeps.o lnblnk.o mv lookeps $(BIN) DelPhi V3.0 -13- phitoins: phitoins.o lnblnk.o fc $(FFLAGS) -o phitoins phitoins.o lnblnk.o mv phitoins $(BIN) reform: reform.o lnblnk.o fc $(FFLAGS) -o reform reform.o lnblnk.o mv reform $(BIN) phitombk: phitombk.o lnblnk.o fc $(FFLAGS) -o phitombk phitombk.o lnblnk.o mv phitombk $(BIN) mbkbin: mbkbin.o lnblnk.o fc $(FFLAGS) -o mbkbin mbkbin.o lnblnk.o mv mbkbin $(BIN) asciiphi: asciiphi.o lnblnk.o fc $(FFLAGS) -o asciiphi asciiphi.o lnblnk.o mv asciiphi $(BIN) phitopdb: phitopdb.o lnblnk.o up.o fc $(FFLAGS) -o phitopdb phitopdb.o lnblnk.o up.o mv phitopdb $(BIN) # for cray: # kimlib: cputime.o nmptch.o lnblnk.o # for convex: kimlib: cputime.o lnblnk.o ar ruv kimlib.a cputime.o ar ruv kimlib.a lnblnk.o ranlib kimlib.a ar vt kimlib.a DelPhi V3.0 -14- _4. FILES _4._1. FILE NAMING Most file types have default extensions to simplify things: file.pdb: Atomic coordinates, input to QDIFF, CONTOUR2D, PHITOPDB. Ouput from QDIFF and PHI- TOPDB. file.siz: Atomic radii, input to QDIFF. file.crg: Atomic charges, input to QDIFF. file.prm: Parameters for QDIFF. file.log: Logfile from QDIFF. contour.kt: Contour levels in kT/e, input to CONTOUR2D. file.phi: Binary potential map, output of QDIFF, input to CONTOUR2D, PHITOPDB, PHITOINS, PHITOMBK, ASCI- IPHI. file.big: Ascii version of file.phi, input/output of ASCIIPHI. file.ins: Binary output from PHITOINS, input to Biosym's INSIGHT contour routine. file.mbk: Binary brickmap form of potential map. Output from PHITOMBK, and input to Polygen's QUANTA pack- age file.mba: Ascii form of file.mbk file. Input/Output of MBKBIN. file.frc: Potentials and fields, output from QDIFF. file.plt: Plot file for HP75 plotter, output from CONTOUR2D. file.eps: Dielectric bit map, output of QDIFF, input of CONTOUR3D, LOOKEPS. _4._2. FILE FORMATS _4._2._1. FILE.pdb: (6A1,I5,1X,A4,A1,A3,1X,A1,I4,A1,3X,3F8.3,2F6.2,1X,I3) for atom record, where fields contain 'ATOM--' or 'HETATM' atom DelPhi V3.0 -15- serial number, atom name, alternate location indicator, residue name, chain identifier, residue sequence number, residue insertion code, x,y,z coordinates, occupancy, tem- perature factor, footnote number. QDIFF reads the records somewhat differently, treating the records before the coor- dinates as all ascii- See Section 6.5.4 for more details. _4._2._2. FILE.prm: parameters for QDIFF in free format (See Section 6.5.1 for more details): igrid== (odd) number of points in cubic lattice, min=5,max=65. perfil== percentage of grid box filled by molecule. offset== x,y,z of molecule centre in grid units from lattice centre. epsin, epsout== dielectric constants for molecule and solvent. rionst== ionic strength in moles/litre exrad, radprb== ion exclusion and probe radii in Angstroms. ibctyp== integer flag for boundary condition . iper(3)== logical flags for periodic boundary condi- tions. nlit== number of iterations with linear P-B equation. nnit== number of iterations with non-linear P-B equa- tion. iconc,ibios== logical flags for mode of output. isite== logical flag for site potential output. iatout== logical flag for modified *.pdb file output. toplbl== 60 characters of ascii title information for run. isph== logical flag for charge distribution algorithm. DelPhi V3.0 -16- _4._2._3. FILE.siz: Van der Waals radii to be assigned to each atom/residue type. 1: any number of lines with ! in the first position are treated as comments. 2: a single line of ascii information that DOESN'T have ! in the first column, which marks the positions of the various fields for human readability and indicates the start of the radius records, but is otherwise not used. 3: an indefinite number of lines each containing an atom label record a residue label record and a radius in angstroms in the format (A6,A3,F8.3). For example: aaaaaannnrrrrrrrr c 1.8 ca 1.7 ca pro 1.6 Using this file, all proline ca atoms will have radius 1.6A, all other ca atoms will have radius 1.7, and all other types of carbon atoms will have radius 1.8. _4._2._4. FILE.crg: Charges to be assigned to each atom/residue/#/chain type. 1: any number of lines with ! are ignored as comments. 2: a single line of ascii information that DOESN'T have ! in the first column which marks the positions of the various fields for human readability and indicates the start of the charge records, but is otherwise not used. 3: an indefinite number of lines each containing an atom label record, a residue label record, a residue number record, a chain label record and a charge in the format: (A6,A3,A4,A1,F8.3). For example: !example of charge file aaaaaarrrnnnncqqqqqqqq ca -0.2 ca pro -0.23 ca pro 1 -0.25 ca pro 1y -0.20 In this fictitious example all alpha carbons will have a charge of -0.2, except for proline C alpha's, which will have a charge of -0.2 on the y chain PRO 1, a charge of -0.25 on all other chain PRO 1 atoms, and a charge of -0.25 DelPhi V3.0 -17- on all other PRO's. Atoms that do not find a match in the *.crg file will be neutral _4._2._5. FILE.frc: List of potentials and fields at cooordinates in *.pdb file 12 lines of ascii header information, followed by a variable number of records written as: 230 format(8G10.3) write(16,230)xo,chrgv,phiv,fx,fy,fz where xo(3) are x,y,z coordinates of charge, chrgv is the charge value, phiv is the potential in kT/e at that point, and fx,fy,fz are the field components in kT/a/A. (1 kT/e = 25.6mV = 0.593 kcal/mole/e). _4._2._6. CONTOUR.KT: Free format, one value per line, any number of lines where values are contour levels in kT/e _4._2._7. FILE.phi: Unformatted file character*20 toplabel character*10 head,character*60 title real*4 phi(65,65,65) character*16 botlabel real*4 scale, oldmid(3) where phi is potential map, scale, oldmid relate grid cordi- nates back to real space coords. other variables are text information. _4._2._8. FILE.eps: Unformatted file real*4 oldmid(3) integer*4 imap integer*2 eps(5,65,65,3) write (10) imap, scale, oldmid write (10) eps DelPhi V3.0 -18- where eps is the dielectric map, scale,oldmid relate grid coords. to real space coords, imap is unused flag DelPhi V3.0 -19- _5. PROGRAMS QDIFF/QDIFFX Subroutines: elb.f up.f cent.f cfind.f ichash.f irhash.f rent.f qdiff2.f rfind.f phintp.f scaler.f chrgit.f setbc2.f conplt.f itit2.f expand.f mkeps.f wrteps.f qdiffpar.h Purpose: solves finite difference form of non-linear Poisson- Boltzmann equation. Input: file.pdb, file.prm, file.siz, file.crg (file1.pdb, file1.phi) Output: file.log, file.phi, file.eps (file.frc, file2.pdb) CONTOUR2D Subroutines: cont2dsub.f, read_atom.f, gcshp.a (graph- ics library) Purpose: interactive production of 2D plots of any arbitrary slice through potential maps, and out- line of slice through molecule. The program is totally menu driven, so the best way to learn this program is to play with it. Input: file.phi, file.pdb, contour.kt. Menu driven Output: colour plots on AED512, HP75A, mono-chrome on Tektronix 4014 emulators such as the VT241, Visual 550. Output to HP75 is written to unit 98 for sub- sequent tranfer to plotter. GCSHP.A GCS graphics package library required by CONTOUR2D. PHITOINS Subroutines: none Purpose: convert output potential map format from QDIFF (file.phi) to INSIGHT input format (file.ins) and vice-versa. Input: file.phi (file.ins). Prompts for file names. Output: file.ins (file.phi) PHITOMBK DelPhi V3.0 -20- Subroutines: none Purpose: convert output potential map format from QDIFF (file.phi) to Peter Goodfords brickmap format which can be read by QUANTA (file.mbk), and I think also by FRODO. Input: file.phi. Prompts for file names Output: file.mbk PHITOPDB Subroutines: none Purpose: takes output potentials from QDIFF output (file1.phi) and writes the potentials at the coor- dinates of an input brookhaven protein data bank format file (file2.pdb), to the B factor field of a modified protein data bank file (file3.pdb). The protein can then be displayed and colored by potential by any display program that can color atoms by B, or temperature factor. Input: file1.phi file2.pdb Output: file3.pdb ASCIIPHI Subroutines: none Purpose: Enables transfer of binary potential files: takes output potentials from QDIFF output (file.phi) and reformats and compresses potentials to integer*4 format, writing result as an ascii file, and vice-versa. The ascii file may then be transferred between machines, and reformatted as a binary *.phi file for display. Input: file.phi/big. Prompts for file name Output: file.big/phi. Notes: i) a version of ASCIIPHI must exist on both machines. ii) On compression to an integer, the potentials are truncated so loss of precision results- the resultant files should only be used for display, not high accuracy calculations. MBKBIN Subroutines: none DelPhi V3.0 -21- Purpose: Enables transfer of binary potential files: takes output potentials from PHITOMBK output (file.mbk) writing result as an ascii file, and vice-versa. The ascii file may then be transferred between machines, and reformatted as a binary *.mbk file for display programs. Input: file.mbk/mba. Prompts for file name Output: file.mba/mbk. Notes: i) a version of MBKBIN must exist on both machines. ii) On conversion the potentials are truncated so loss of precision results- the resul- tant files should only be used for display, not high accuracy calculations. NMPTCH A collection of subroutines that rename or provide for some non-existent intrinsic functions- to improve portatibility without changing source code. LOOKEPS Subroutines: none Purpose: to display a slice trhough a dielectric map on terminal in a line plot form, in order to check the dielectric map. Input: file.eps. Prompts for files names. Output: to alpha-numerics terminal. REFORM Subroutines: none Purpose: reformats a *.eps file into *.phi form- this enables the solvent accessible surface to be displayed using the same methods as used for the potentials ('potential' values range from Input: file.eps. Prompts for files names. Output: file.phi DelPhi V3.0 -22- _6. PROGRAM QDIFF last revision: 11 nov 1988 All arrays, including the maximum lattice dimension, are dimensioned in an include file: qdiffpar.h. _6._1. BRIEF DESCRIPTION QDIFF is a program to solve the (non-linear) Poisson- Boltzmann equation by finite difference methods on a NSZxNSZxNSZ cubical lattice. It can handle both the linear and non-linear forms of the equation. The program can use: periodic or focussing boundary conditions on the lattice edge; a variable ion exclusion (or Stern) layer around the molecule; a variable probe radius to define the solvent accessible surface; a variable bulk solvent ionic strength; any required dielectric constant values for the molecule and solvent. For input the program requires four files, containing i) Run parameters, ii) atom coordinates iii) Rules for assigning atomic radii, and iv) Rules for assigning atomic charges. For focussing boundary conditions a fifth file containing potentials calculated from a previous run of the program is also needed. The output is a 3D array giving the potential (or net sol- vent ion concentration) at each lattice point and (optional) a list of potentials at charge locations contained in another atom coordinate file, _6._2. ALGORITHM QDIFF relaxes the difference equation form of the nonlinear Poisson-Boltzmann equation (Equation 1) Pn = (1-W)Po + W.sum(Ei.Pi) + 4PI.Qo/ (sum Ei + Do(1 + Po**2/6 + Po**4/120...) i = 1,6 where W is a relaxation parameter between 0 and 2 that con- trols the rate of iteration. Po is the current estimate of the potential at the o'th lat- tice point. Pn is next estimate of the potential at the o'th lattice point. Pi is potl. at 6 lattice points surrounding the o'th. DelPhi V3.0 -23- Ei is the dielectric constant assigned to the lattice lines connecting the grid point to its 6 neighbours. Q is charge at the o'th lattice point. D is debye huckel term at the o'th lattice point. Terms in powers of Po in the rhs denominator represent suc- cessive non-linear corrections to the linear equation. The current version of the program includes two such correction terms, which appears to be sufficient for the types of potentials encountered in aqueous macromolecular systems. The largest permissible grid size is NGRID (default=65) CUBED. The difference equation is iterated until the change in potential at the grid ponts is smaller than some required value. Grid points on the boundary of the lattice do not have a full set of six neighbors unless periodic boundary conditions are applied to them, hence they are not relaxed, and initial values of the potential must be provided for them through some kind of boundary condition. _6._3. INPUT FILES logical units for read/write should be assign to file names at the system level. Use of certain default file types will simplify things UNIT NAME ------------------------------------------- 10 *.prm input parameters 11 *.siz atomic van der Waals radii 12 *.crg atomic charges 13 *.pdb brookhaven format atom coordinates (15) *.pdb list of atom coordinates if site potential output is required (18) *.phi binary potential needed if focuss- ing boundary conditions are selected. ------------------------------------------- Units 15 and 18 are optional depending on values for the site potential output flag, and boundary condition option respectively, given in the parameter file. _6._4. OUTPUT FILES UNIT NAME ------------------------------------------- 6 *.log logfile when run in batch 14 *.phi output binary potential map (16) *.frc list of potentials at coordinates DelPhi V3.0 -24- contained in the unit 15 pdb file 17 *.eps binary dielectric map output (19) *.pdb output atom coordinate file with radii and charge records added ------------------------------------------- Units 16 and 19 are not always written, depending on values assigned in the parameter file for the site potential output flag, and modified pdb file output flag options respec- tively. _6._5. INPUT FORMAT A description of the input required by QDIFF is given below, with some suggested parameter values. However no hard and fast rules can be given since in many cases a tradeoff is involved. More details are given in the notes, Section 6.8. _6._5._1. UNIT 10 Default extension .prm. Contains input parameters read in free format. Values are real unless stated otherwise: line 1== IGRID ---(odd) integer number of points per side of the cubic lattice, min=5, max=65 (=NGRID). A larger grid size will in general mean a better resolution representation of the molecule on the lattice, and hence more accurate potentials, but will require more time (see note 1). line 2== PERFIL ---percentage (> 0) of the lattice that the largest of the x,y or z linear dimensions of the 'molecule' will fill. This will determine the scale of the lattice (grids/angstrom). The per- centage fill of the lattice will depend on the application (see note 2). line 3== OFFSET(3) ---x,y,z position in GRID UNITS from centre of lattice where the centre of molecule will be positioned. ie 0,0,0 will put the centre of molecule at the centre of the grid; x,x,x {x = (IGRID+1)/2.} will put it at one corner; -x,-x,-x will put it at opposite corner. See note 3. line 4== EPSIN, EPSOUT ---dielectric constants (>=1.0) for the molecule and surrounding solvent. line 5== RIONST ---ionic strength of solvent, in moles/litre (>= 0.). line 6== EXRAD, RADPRB ---thickness of the ion DelPhi V3.0 -25- exclusion layer around molecule; and the radius of the solvent probe molecule that will define sol- vent accessible surface in Lee and Richard's sense (both in angstroms,>0.0). See note 4. line 7== IBCTYP ---integer flag specifying the type of boundary condition imposed on the edge of the lattice- allowed options: 1== potential is zero. 2== Approximated by the Debye/Huckel potential of the equivalent dipole to the molecular charge dis- tribution, where phi = q+.exp(-r/d)/(diel*r) + q-.exp(-r/d)/(diel*r) where phi is the potential estimate at a given lattice boundary point, q+ (q-) is the sum of all positive (negative) charges, and r is distance from the point to the centre of positive (nega- tive) charge, d is the debye length, and diel is the solvent dielectric constant. 3== focussing, where a potential map from a previous calculation is read in on unit 18, and values for the potential at the lattice edge are interpolated from this map- clearly the first map should have been generated with a coarser grid (greater dis- tance between lattice points) and positioned such that current lattice lies completely within old lattice or program will protest. (See note 5) 4== Approximated by the sum of Debye/Huckel poten- tials of all the charges. phi = sum over i [qi.exp(-r/d)/(diel*r)] where now qi is the i'th charge, and r is distance from the lattice boundary point to the charge. line 8== IPER(3) ---three logical flags (t/f) for periodic boundary conditions for the x,y,z edges of the lattice respectively. Note that periodic boundary conditions will overide other boundary conditions on edges to which they are applied. See note 6. line 9== NLIT ---integer number (> 3) of iterations with linear equation. See note 7. line 10== NNIT ---integer number (> = 0) of non-linear iterations. If linear PB equation only is DelPhi V3.0 -26- required, set NNIT=0. line 11== ICONC,IBIOS ---two logical flags (t/f) con- trolling the output format of the potential map on unit 14. iconc=f produces standard potential output in kT/e (aproximately equal to 25.6 mV at 25oC, or to 0.593 kcal/mole of charge). iconc=t will give net solvent ion concentration output in M/l, where for every lattice point inside the molecule the concentration is 0, and the outside concentration is obtained from the ionic strength*2*sinh(potential). ibios=t INSIGHT output format *.ins file is out- put. ibios=f DELPHI output format *.phi file is output. line 12== ISITE ---logical flag (t/f). isite=t will cause potentials and fields at specified coordi- nates to be written to unit 16. Coordinates must be provided in a *.pdb file which will be read in on unit 15. This pdb file may be the same as that assigned to unit 13. line 13== IATOUT ---logical flag (t/f). iatout=t will produce a modified PDB file written on unit 19, containing the radius and charge assigned to each atom written after the coordinates, in the fields used for occupancy and B factor. It is recom- mended that this option be set initially so that you can check that all the radius and charge assignments are correct. An additional check on the charge assignment can be made by looking at the total charge written to the log file. Also this modified PDB file is required for displaying cross-sections of the molecule surface in the 2D potential contour program CONTOUR2D line 14== TOPLBL ---60 characters of ascii information that will be written as a header in potential map output for identification purposes. line 15== ISPH ---logical flag (t/f) controlling the algorithm by which the charge is assigned to the lattice points. Set this to false initially, and read note 8 for further details. DelPhi V3.0 -27- _6._5._2. UNIT 11 Default extension *.siz. List of rules describing the van der Waals radii to be assigned to each atom/residue pdb record type. The format is described in Section 4.2.3 on files. Note the atom and residue fields ignore case and leading blanks. The residue field may be left blank (wild card), causing a match with the given atom type of any resi- due. ONLY if the residue field is left blank, the LAST 5 characters of the atom record maybe left blank. In this case all atom types beginning with the letter in column 1 will be matched. Records of greater specificity overide those of less specificity. Beware of ambiguities like calcium (ca) and alpha carbon! All atoms of an input *.pdb file must be assigned a radius through the *.siz file, even if it is 0, or an error will be flagged. _6._5._3. UNIT 12 Default extension *.crg. List of rules describing the atomic charges to be assigned to each atom/residue/number/chain pdb record type. The format is described in Section 4.2.4 on files. The ascii fields for atom, residue, number and chain ignore case and leading blanks. Any field except the atom name may be left blank and will be treated as a wild card. Records of greater specificity overide those of less speci- ficity as for the *.siz file above. search order: atom_res_num_chain atom_res_num______ atom_res_____chain atom_res__________ atom_____num_chain atom_____num______ atom_________chain atom______________ Atoms that do not find a match in the *.crg file will be neutral (q=0.0) DelPhi V3.0 -28- _6._5._4. UNIT 13 Standard format Brookhaven protein data bank file containing atom labels and coordinates. Only records starting with ATOM or HETATM are used. Default extension *.pdb. Format (as assumed by the program: beware many minor and not so minor differences abound in the literature and on disk!): (A6,5X,A5,X,A3,X,A1,A4,4X,3F8.3) header, atom name, residue name, chain name, residue number, x,y,z. coordinates. Note that the program treats the residue number as an ascii string, not an integer. _6._5._5. UNIT 15 Default extension: *.pdb. list of coordinates where site potentials are output, format as for unit 13 _6._5._6. UNIT 18 default extension *.phi potential map for focussing boundary conditions. Potentials are in kT/e (25.6mV, 0.593 kcal/mole at 25oC). Format: unformatted (binary file) character*20 uplbl character*10 nxtlbl,character*60 toplbl real*4 phi(65,65,65) character*16 botlbl real*4 scale,oldmid(3) uplbl,nxtlbl,toplbl,botlbl are ascii information. Phi is the 3D array conatining values of potential for all the lattice points. Index order is x,y,z. Scale is lattice scale in grid squares/angstrom. Oldmid is the x,y,z coordinates in real space (angstroms) of the centre of the lattice: thus the real space coordinates x,y,z of the lattice point for phi(IX,IY,IZ), for the case where IGRID = 65, are: x = (IX - 33)/scale + oldmid(1) y = (IY - 33)/scale + oldmid(2) z = (IZ - 33)/scale + oldmid(3) where 33 = (65+1)/2 is the middle point of the grid. _6._6. OUTPUT FORMATS _6._6._1. UNIT 6 Output from program, including error messages and conver- gence history. When run interactively, appears on standard DelPhi V3.0 -29- output. Default extension *.log when run in batch _6._6._2. UNIT 14 If flag IBIOS is false, then output is in DELPHI format, default extension *.phi. Output potential map or concentra- tion map, with format same as for unit 18 above. When igrid<65(NGRID), then potentials are interpolated to produce 65X65X65 density lattice for compatibility with display/analysis programs If flag IBIOS is true, then output is in INSIGHT format, default extension *.ins. This is an unformatted (binary) file character*132 toplbl !ascii header integer*4 ivary !0 => x index varys most rapidly integer*4 nbyte !=4, # of bytes in data integer*4 inddat !=0, floating point data real*4 xang,yang,zang !=90,90,90 unit cell angles integer*4 intx,inty,intz !=igrid-1, # of intervals/grid side real*4 extent !maximum extent of grid real*4 xstart,xend !beginning, end of grid sides real*4 ystart,yend !in fractional real*4 zstart,zend !units of extent write(14)toplbl write(14)ivary, nbyte, intdat, extent, extent, extent, xang, yang, zang, xstart, xend, ystart, yend, zstart, zend, intx, inty, intz do k = 1,igrid do j = 1,igrid write(14)(phimap(i,j,k),i=1,igrid) end do end do Note that for grid sizes less than 65, INSIGHT format files will occupy less disk space than the corresponding DELPHI files. *.ins files are designed as input to a Biosym Corp. stand alone utility called CONTOUR, supplied with INSIGHT Version 2.4. This program will produce contour files for display with INSIGHT. DelPhi V3.0 -30- _6._6._3. UNIT 16 Default extension *.frc. A list of potentials and fields at coordinates in *.pdb file read on unit 15. Format: 12 lines of ascii header information, followed by a variable number of records written as: 230 format(8G10.3) write(16,230)xo,chrgv,phiv,fx,fy,fz where xo(3) are x,y,z coordinates of charge, chrgv is the charge value, phiv is the potential (in kT/e) at that point, and fx,fy,fz are the field components (in kT/e/Angstrom). The last line of the file is the sum of chrgv*phiv/2 over all the charges in the file. This quantity can be used for calculating solvation and interation energies- see examples 2 and 3 in Section 7. _6._6._4. UNIT 17 Dielectric bit map, default extension: *.eps. There are 3*65*65*65 lines joining neighbouring grid points, 65*65*65 each in of the x,y,z directions. The midpoint of each line is given a value of 1 if it lies within the solvent accessi- ble volume of the molecule, 0 if outside. By this means the shape of the molecule is defined, and space is separated into two regions with different dielectrics. For compact output purposes the array of REAL*4, epsmap(65,65,65,3), is compressed into an INTEGER*2 array, neps(5,65,65), by bit- mapping: the first index of epsmap, range 1-65 is compressed into the first index of neps, range 1-5, where the indices 1-16 go into bits 0-15 of the word with index 1, indices 17-32 -> bits 0-15 of word with index 2 etc. The array neps is then written to an unformatted binary file: write (17) imap, scale, oldmid write (17) neps where imap is an unused integer*4 flag and scale, oldmid(3) are real*4 scaling information as above. _6._6._5. UNIT 19 List of input coordinates read from unit 13, default exten- sion *.pdb, with the assigned atomic radius and charge of each atom placed in columns 55-60 and 61-67 in formats F6.2,F7.3 respectively. Other formats are same as for unit 13 DelPhi V3.0 -31- _6._7. OVERALL PROGRAM FLOW: In general outline, the program flow is as follows: Header with time, date written. Parameters are read from unit 10 and echoed to output. Radius data read from unit 11 and stored in hash table for efficient look up. Charge data read from unit 12 and stored in hash table for efficient look up. Atomic coordinates are read from unit 13 and scaling is computed. Arrays that describe 3D distribution of dielectric and ion accessibility in space are initialized. Atomic coordinate and label data read from unit 13 again, and radii and charge assigned to each atom. Dis- tribution of dielectric values, ionic strength parame- ters and charge values over the lattice are determined from the coordinate/charge/radius data Atom file with charge and radii records outputted on unit 19 if requested. Centres of + and - charge distributions, and net charge calculated for check on charge distribution. Arrays are set up for the difference eqn. iteration. Boundary values are set- either through analytical expressions or from interpolation into the potential map read on unit 18. Linear then non-linear iterative relaxations are done and convergence histories are displayed as simple log/lin line plots. Potentials are converted to concentrations if requested. Potentials and fields are calculated at the coordinates of the atoms read on unit 15, and outputted to unit 16 if requested. Grid of potentials is increased in fineness to 65x65x65 if necessary and outputted on unit 14. The dielectric map is outputted on unit 17. DelPhi V3.0 -32- _6._8. NOTES ON INPUT PARAMETERS _6._8._1. IGRID The number of iterations required to reach a certain conver- gence will increase approximately linearly with parameter IGRID. Since the time per iteration will go up as the cube of this parameter the amount of calculation will thus increase at about the fourth power of IGRID. For some applications a smaller value of IGRID will be adequate (eg 45 to 55), but it is recommended that the largest value pos- sible be used (currently set at 65 by the parameter NGRID in the qdiffpar.h file). _6._8._2. PERFIL A large % fill will provide a more detailed mapping of the molecular shape onto the lattice. On the other hand it will bring the dielectric boundary of the molecule closer to the lattice edge. This will cause larger errors arising from the boundary potential estimates, which are set to zero or approximated by coulombic/debye-huckel type functions using a uniform solvent dielectric. At higher salt concentra- tions, and for weakly charged molecules the potentials at the boundary, and consequently, the error, will be smaller. Smaller percentages will increase the accuracy of the boun- dary conditions, but result in a coarser representation of the molecule. For a single run, a reasonable compromise seems to be about 60% for looking at potentials outside a protein, and 90% for solvation energy calculations. The tradoff involved in scaling can at the cost of extra comput- ing be avoided by FOCUSSING. Start with a small percentage, say 10-20%, using zero or debye-huckel boundary conditions, and then focus in to say 90%, in one (or two) stages, using focussing boundary conditions for the second (and third) runs. It is not necessary for the molecule to lie completely within the grid although then the potential boundary condi- tions must be generated by focussing. However when calcu- lating solvation energies with box fills of > 100% remember that unexpected results may be obtained since parts of the surface, (and perhaps some charges) are not included in the grid (see notes on QDIFFXS regarding energy calculations). In any quantitative calculation, it is advised that the largest scale possible be used, preferably above 2 grids/angstrom (for larger molecules this may not be possi- ble without increasing IGRID, and thus NGRID in the *.h include file, above 65). Whatever the grid scale, calcula- tions should be repeated at different scales to assess the size of lattice resolution errors. DelPhi V3.0 -33- _6._8._3. OFFSET This parameter is used to specify the grid point at which the centre of the molecule [pmid(3)] is placed, or con- versely, what point of the molecule [oldmid(3)] is placed at the centre or the grid. The relationship between real space r(i) and grid g(i) coordinates for a grid size of igrid, with a scale of gpa grids/angstrom is as follows: The centre of the grid is midg = (igrid+1)/2 oldmid(i) = pmid(i) - OFFSET(i)/gpa g(i) = (r(i) - pmid(i))*gpa + midg + OFFSET(i) r(i) = (g(i) - midg)/gpa + oldmid(3) The scale, gpa, and oldmid are printed in the logfile. Note that a certain error inevitably results from the map- ping of the molecule onto the grid. By moving the molecule slightly (changing OFFSET between 0,0,0 and 1,1,1) and repeating the calculations it is possible to see whether the results are sensitive to the particular position on the grid, and if so, to improve the accuracy by averaging (This is related to rotational averaging, discussed in the J. Comp Chem paper of Gilson et al.). However using a larger scale is a more effective way of improving accuracy than averag- ing. To check on the positioning of the molecule within the grid as determined by PERFIL and OFFSET, the output dielectric bit map, FILE.eps, can be examined by taking slices using the utility program LOOKEPS. _6._8._4. EPSIN, EPSOUT A value of epsout=1 corresponds to the molecule in vacuum, epsout=80 to the molecule in water, and epsout=epsin to a reference state where there is no dielectric boundary. Depending on the application runs with epsout equal to either of these values may be used to represent different states in a thermodynamic cycle. A value of epsin=1 corresponds to a molecule with no polarizability- a state of affairs assumed in most molecular mechanics applications. Epsin=2 represents a molecule with only electronic polariza- bility (ie assuming no re-orientation of fixed dipoles- pep- tide bonds etc). A value of 2 is based on the experimentally observed high frequency dielectric behavior of essentially all organic materials. Epsin=4-6 represents a process where some small reorganization of molecular dipoles occurs which is not represented explicitly (for example in modelling the effects of site directed mutageneis experiments, when the DelPhi V3.0 -34- structure of the wild type, but not mutant protein is known). This value is based on a number of experimental and theoretical papers (eg M.K. Gilson and B. Honig, Biopoly- mers, 25:2097 (1986)) which indicate that a macroscopic material which consisted of similar dipole density, moment and flexibility as globular proteins would have a dielectric of 4-6. In modelling any process where a large reorienta- tion of dipoles, or large conformational change occurs, ie upon folding or denaturation, then using a dielectric con- stant for the molecule would be inappropriate, and the change in conformation should be modelled explicitly. EXRAD,RADPRB The first parameter, EXRAD, in combination with the atomic van der Waals radii in the *.siz file, determines the regions of space, and hence the lattice points, which are inaccessible to solvent ions (ie where Do=0 in equation 1). The second parameter, RADPRB, in combination with the atomic van der Waals radii in the *.siz file, determines the regions of space, and hence the lattice points, which are inaccessible to solvent molecules (water) (ie where Ei=EPSIN in equation 1). Suggested values are EXRAD = 2.0 for sodium chloride, and RADPRB = 1.4-1.8 for water. To understand how these parame- ters work, you should be familiar with the concepts of con- tact and solvent accessible surface, as discussed by Lee and Richards, and by Mike Connolly. For the purpose of QDIFF, a solvent ion is considered as a point charge, which can approach no closer than its ionic radius, EXRAD, to any atoms van der Waals surface. The ion excluded volume is thus bounded by the contact surface, which is the locus of the ion centre when in van der Waals contact with any accessible atom of the molecule. A zero value for EXRAD will just yield the van der Waals volume. A non zero value of EXRAD will thus introduce a Stern, or ion exclusion layer around the molecule where the solvent ion concentration will be zero, and whose dielectric constant is that of the solvent, EPSOUT. For the purpose of QDIFF, any region of space that is acces- sible to any part of a solvent (water) molecule is con- sidered as having a dielectric of EPSOUT. A value of zero for RADPRB used with a *.siz file containing the standard van der Waals radii values will assign any region of space not inside any atom's van der Waals sphere to the solvent. Note that this can include regions of the molecule not actually accessible to the solvent, such as the interstices where the spherical atoms pack together in the molecule's interior, which would erroneously be assigned the DelPhi V3.0 -35- solvent dielectric. Thus this combination of parameters should generally not be used. If a constant distance (say equal to the solvent molecule radius) is added to each radius in the *.siz file, and RADPRB is set to 0, then any point lying within the contact surface (as defined by by Lee and Richards) will be assigned the molecule dielectric. Note that on the surface a shell of space that is in fact accessible to the solvent will be included as part of the molecule dielectric- This could be used for example to model a shell of immobilized water around a molecule that had a low dielectric constant. If RADPRB is not zero, and standard van der Waals radii are used (default situation), then the solvent accessible volume will be generated. This is the region inaccessible to any part of a solvent probe sphere of radius RADPRB. Its surface as define by Lee and Richards, consists (Connolly) of convex portions composed of the atomic van der Waals surfaces when the probe sphere is in contact with one atom, and concave portions composed of the probe sphere surface, when the latter is in contact with two or more atoms. _6._8._5. IBCTYP Unless focussing is being used, it is recommended that the coulombic boundary condition (ibctyp=4) be used. For focussing boundary conditions, the program reads in a poten- tial map from a previous run, and compares the scale of the focussing map with that for the current run. If they are the same, it assume that this is a continuation of a previous run, and iteration of the potentials contained in the previ- ous potential map is continued. If the scales are not the same, it checks to ensure that the new lattice lies com- pletely within the old lattice before interpolating the boundary conditions. _6._8._6. IPER Periodic boundary conditions can be applied in one or more of the x,y or z directions. When applied, the potential at each periodic lattice boundary point is iterated by equation 1 by supplying its missing neighbor(s) from the correspond- ing point on the opposite edge of the lattice. This can be used for example to model an infinite length of DNA. Assume that the helical axis of the DNA in the *.pdb file is aligned along the Z axis. The periodic boundary flags are set to false,false,true, and the percent fill of the box, PERFIL, is adjusted so that an integral number of turns just fill the box in the Z direction. Normal boundary conditions are applied to the X,Y boundaries. DelPhi V3.0 -36- By setting two, or three of the boundary flags to true, one can simulate 2 dimensional or 3 dimensional cubic lattices of molecules. _6._8._7. NLIT The convergence behaviour of the finite difference procedure is reported in the log file as both the mean and maximum absolute change in potential at the grid points between suc- cessive iterations. The latter is probably more important since it puts an upper bound on how much the potential is changing at the grid points. It is suggested that sufficient iterations be performed to give a final maximum change of less than 0.001 kT/e. The number of iterations per se is not important, as long as its sufficient to give the required convergence. The convergence behavior can also be judged from the slope of the semi-log plot of the mean and max changes given in the log file. NLIT is best determined by experience, since the convergence rate depends on several factors. Start with say 100 iterations, and then increase the number of iterations until sufficient. Note that a run can be restarted by using focussing boundary conditions with exactly the same PERFIL and OFFSET values (see note 5). Some guidelines are: The number of iterations needed will increase with grid size. It will decrease with decreasing PERFIL, since the potentials converge more rapidly in the solvent. It will decrease with increasing ionic strength. The number is fairly insensitive to the size and number of charges on the molecule. Convergence will slow with use of the non-linear equation (NNIT not zero). It is suggested that for more rapid con- vergence and greater numerical stability do 30% linear iterations, then refine the solution with 70% non- linear iterations. _6._8._8. ISPH If an atomic charge does not lie exactly on a grid point, then it must somehow be distributed onto the grid points. If this flag is set false, the standard algorithm is used which distributes charge to nearest 8 grid points (quick and simple, see the Proteins paper of Klapper et al.). If this flag is set true, then an algorithm is used which gives a more spherically symmetric charge distribution, although the charge is now spread over a wider region of space. For cer- tain cases this gives higher accuracy for potentials less than 3 grid units from a charge (see J.Comp. Chem paper), although this point has not been exhaustively explored. DelPhi V3.0 -37- _7. PROGRAM QDIFFX last revision: 22nd may 89 All arrays, including the maximum lattice dimension, are dimensioned in an include file: QDIFFpar4.h. source files specific to QDIFFX have the postfix 4. _7._1. BRIEF DESCRIPTION The program QDIFFX supersedes QDIFF except when you bomb out when using the non-linear Poisson-Boltzmann equation option. QDIFFX uses a superior algorithm for solving the finite difference matrix equation, plus a speeded up setup routine. It should run 20-30 times faster. Input and out- put are similar to that of QDIFF. Parameter files generated to run QDIFF are compatible with QDIFFX, and apart from the increase in speed, changeover should be transparent to the user. However to take advantage of QDIFFX, users should take note of the differences between the programs, described below. The assignment file for QDIFFX is ASSQDIFFX, as ASSQDIFF is for QDIFF. It handles the assignment of file names to logi- cal unit numbers, and allows you to create an extended parameter file. The differences between QDIFF and QDIFFX that affect the user refer mainly to the unformatted I/O options now avail- able. Since charges and radii have to be assigned to a molecules atoms before the Poisson Boltzmann equation can be solved, the program has assignment routines. Since many runs are often done on the same protein/dna/whatever, it seemed sensible to be able to avoid the repetition of this by writing out a file with the coordinates, radii and charges. Secondly, since unformatted read/writes are much quicker, we store the information in an unformatted file, typically with extension Further, since assignments of charge are made if a site potential file is to be written, one might want to avoid this too, so there is an option to store the frc-type pdb file in an unformatted file, so as to be read in rapidly on successive runs. Finally, the site potential file itself can be written in unformatted form. Unformatted versions of the files required for input do not initially exist- only formatted *.pdb files exist. The unformatted versions are generated by QDIFFX itself, by set- ting certain output flags in the parameter file (lines 16 DelPhi V3.0 -38- and 17). Old parameter files that do not contain these lines will by default not write unformatted files. Once the unformatted files exist, they can be linked to the appropri- ate input units. On subsequent runs of QDIFFX, the program will automatically read the files as formatted/unformatted, depending on what type of file is linked. _7._2. UNFORMATTED VERSIONS OF INPUT FILES for QDIFFX UNIT NAME ------------------------------------------- 13 *.bb1 unformatted input atom coordi- nates: generated from a previous run writing to unit 20 15 *.bb2 unformatted list of atom coordi- nates if site potential output is required: gen- erated from a previous run writing to unit 21 ------------------------------------------- _7._3. UNFORMATTED OUTPUT FILES for QDIFFX UNIT NAME ------------------------------------------- 20 *.bb1 unformatted output atom coordi- nate file with radii and charge records added for faster, unformatted input on unit 13 in subsequent QDIFFX runs 21 *.bb2 unformatted list of coordinates for site potentials1 generated from the formatted pdb file attached to 15 22 *.bb2 unformatted list of potentials at coordinates read from the unformatted coordinate file attached to unit 15 ------------------------------------------- _7._4. FORM OF UNFORMATTED OUTPUT FILES The format of files written on units 21 and 22 are both: inum,xo,chrgv,pot,ex,ey,ez where inum= atom number xo = three entry array with atom coordinates chrgv=any charge associated with the atom DelPhi V3.0 -39- pot=the potential ex,ey,ez = the field at the atom. For the file on unit 21, pot, Ex, Ey and Ez are all set at zero. The format for the unformatted pdb file is: xo,rad,chrgv where rad is the atom radius, other quantities as before. _7._5. INPUT PARAMETER FILE FORMAT line 9== NLIT ---integer number (> 3) of iterations with linear equation. If the number is replace by 0, or the letter A, (automatic) the optimal number of iterations required for convergence as judged from the spectral radius of the finite difference matrix will be used. Generally between 80-150 iterations will be sufficient. Some experimenta- tion and an examination of the convergence behavior recorded in the logfile will soon show the correct value to use. line 10== NNIT ---integer number (> = 0) of non-linear iterations. Generally 100-200 iterations will be sufficient. It is possible that for highly charged molecules, and high salt concentrations that the non-linear part of QDIFFX will not con- verge, in which case QDIFF should be used. In any case, when using the non-linear form, the conver- gence behavior should be closely monitored. line 16== logical flag, TRUE: write out an unformatted pdb file with radii, charges, on unit 20. line 17== logical flag, TRUE: write out an unformatted pdb file for generating site potentials, on unit 21 line 18== upto three letters which switch on the calcu- lation of certain electrostatic energy quantities by the program: G grid energy (now called total energy). C coulombic energy, S for induced sur- face, or solvation energy (now called reaction field energy). See example 2, and the references for a detailed description of these quantities. The grid energy is that obtained from the poten- tial at each charge WITHIN the grid, multiplying DelPhi V3.0 -40- by that charge, and summing over all such sites. This is not a meaningful number in itself, but can be useful in extracting solvation energies, salt effects etc. The coulombic energy is calculated via coulombs law, and is the energy to bring the charges present from infinity to their final resting place, in a dielectric of that specified as being internal. This includes all charges, not just those in the grid. The third energy term is obtained by calculating the induced surface charge at each surface point within the box, then using these charges to calcu- late the potential at every charge, not just those in the box. If the molecule lies entirely within the box, and there is no salt present, this corresponds to the energy of taking the molecule from a solvent of dielectric equal to that of its interior, to that of the exterior. Results are not accurate if the grid spacings are less than one grid per angstrom (see QDIFFXS, which calcu- lates this quantity more accurately). Warning: The later two calculations require square roots. If the number of charges is large (e.g. if one is using a partial charge set), then one may be better off doing two runs for the sol- vation energy, using the grid energy approach, which needs no square roots (see example 3). Line 19== two logical flags, two integers. controls log file output characteristics. The first logical flag toggles the plot in the log file of the con- vergence. The second logical flag toggles the lengthy potential output to the log file. This output is mainly for debugging and reproducibility checks. The two integer parameters deal with how the convergence criteria are calculated. The first decides at what iteration interval conver- gence is checked, the second what fraction of grid points are used in assessing this (1=all, 2=half, 5=fifth etc). If in doubt, there are default values for these of 10 and 1. The idea behind these parameters is to allow convergence to be checked less frequently to reduce the amount of time spent. _7._5._1. EXAMPLE OF PARAMETER FILE USE (a) Producing an unformatted pdb file (but with radii and charge information): DelPhi V3.0 -41- 1) Run QDIFFX with the write flag set true for unformatted pdb (line 16) This results in a file being written to fortran unit 20. 2) Rename this file as appropriate, and link this to unit 13, the unit previously used to read the formatted atom coordinate information. 3) Rerun QDIFFX. 4)Notice the increase in speed (b) Producing an unformatted frc-type pdb file: 1) Run QDIFFX with the write flag set for the frc pdb file (line 17) This results in a file being written to unit 21. 2) Rename this file and link it to unit 15, nor- mally used to read in the formatted coordinates for the site potential file 3)Rerun QDIFFX. This results in an unformatted file being written to unit 22. _7._5._2. EXAMPLE PARAMETER FILE 1. 65 ! grid size 2. 70 ! percentage box fill 3. 2,2,5 ! x,y, and z offsets for center of box 4. 2,80 ! dielectric in and dielectric out 5. 0.145 ! salt concentration in mM 6. 2.0,1.4 ! salt exclusion radius, probe radius 7. 2 ! dipolar coulombic type of boundary con- dition 8. f,f,t ! periodic boundary conditions in z direction only 9. A ! number of linear iterations, set here to automatic 10. 0 ! number of nonlinear iteration, not used by QDIFFX DelPhi V3.0 -42- 11. f,f ! flags for phi-map output format 12. t ! site potential output 13. f ! no formatted modified output pdb file 14. Sample parameter file ! label 15. f ! spherical charge distribution algorithm- not used in QDIFFX 16. t ! unformatted pdb file output with radii, charges 17. t ! site potential pdb file to be written in unformatted form 18. GC ! grid and coulombic energies calculated 19. f,t,20,2 ! log file, and convergence date: Graph ouput off, potential output on, check for convergence every 20 iterations at every other grid point. DelPhi V3.0 -43- _8. PROGRAM QDIFFXS last revision: 2nd may 90 _8._1. BRIEF DESCRIPTION The program QDIFFXS supersedes QDIFFX. QDIFFXS has a better description of the dielectric boundary surface, and so pro- duces more accurate estimates of reaction field and solva- tion energies, particularly at scales of less than 2 grids/angstrom. Input is identical to QDIFFX. The major difference is in the output log file, where various energy terms are outputted. The calculation of these terms is con- trolled by line 18 of the parameter file, as for QDIFFX, but their names have been changed to be more accurate and reflect previous usage in published papers: On line 18 upto four letters switch on the calculation of certain electros- tatic energy quantities by the program: G total energy. C coulombic energy, S for induced surface, or reaction field energy. A analytical estimate of the grid energy. The COULOMBIC ENERGY is calculated via coulombs law, and is the energy to bring the charges present from infinity to their final resting place, in a dielectric of that specified as being internal. This includes all charges, not just those in the grid. The TOTAL ENERGY (formally called the grid energy) is that obtained from the potential at each charge WITHIN the grid, multiplying by that charge, and summing over all such sites. It contains all the real electrostatic energy terms, plus the self energy of the grid. This latter is not a meaning- ful number in itself, but can be subtracted out to yield meaningful quantities such as solvation eneries etc. The ANALYTICAL GRID ENERGY is an accurate estimate of the grid and coulombic energies. This term is thus the total energy that would be obtained from a QDIFFXS run with a uni- form dielectric equal to that of the interior. Subtracting the analytical grid energy from the total energy thus gives an estimate of the reaction field energy, ie that due to the dielectric boundary. However a better estimate of the reac- tion field energy is obtined directly, see below, so this value should not be used. The REACTION FIELD ENERGY term (formally called the solva- tion energy term) is obtained by calculating the induced surface charge at each surface point within the box, then using these charges to calculate the potential at every charge, not just those in the box. If the molecule lies entirely within the box, and there is no salt present, this corresponds to the energy of taking the molecule from a DelPhi V3.0 -44- solvent of dielectric equal to that of the interior, to that of the exterior. Depending on the physical process, this may be the required solvation energy, but in general the solvation energy is obtained by taking the difference in reaction field energies between suitable final and initial reference states that define the required process- hence the name change. In QDIFFX the reaction field energy was not accurate if the scale was less than 2 grids/angstrom. QDIFFXS however by using a better representation of the dielectric surface, and by repositioning the induced surface charge at its correct position before calculating the reac- tion potential back at the charge, gives much more accurate estimates, even at 0.5 to 1 grids/angstrom. Thus solvation energy calculations should always be done via differences in reaction field energy, rather than via differences in total energy as was done before. NOTE however that the best way to assess the inaccuracies arising from the lattice representa- tion is by repeating calculations at different scales or box fills, and observing the change in the calculated quantity. DelPhi V3.0 -45- _9. WORKED EXAMPLES _9._1. THE POTENTIAL AROUND CU,ZN SUPEROXIDE DISMUTASE OBJECTIVE produce a potential map showing the potential distribu- tion in and around CU,ZN Superoxide Dismutase REFERENCES Klapper et al, Proteins, 1:47 (1986) for parameters of the run and pictures of potentials in a slice through the two active site coppers. Getzoff, Tainer et al., Nature 306:284,287 (1983) for structural details. FILES contour.kt: list of contour values in kT, 1 per line, input to CONTOUR2D. sod.siz atomic radii, input to qdiff. sod.crg atomic charges, input to qdiff. sod.eps dielectric map=protein 'shape'. sod.log log file of qdiff run (finite difference calculation). sod.pdb atomic coordinates, input to qdiff. sod.phi output potential map, input to contour2d. sod.prm input parameters for qdiff. sodout.pdb output atomic coordinates with assigned radii and charges, input for contour2d. sod.plt output plot file from contour2d for HP75 plotter showing potentials in slice thru 2 coppers. NOTES The grid size in this example is set to 33 for speed. For greater accuracy 45-65 would be better. The protein fills 66% of box- a tradeoff between accurate boundary values (smaller % ), and detail in protein (larger %). For further accuracy, focussing can be used (see exam- ple 5). 400 iterations were perform, using the linear PB equation, since no salt is present (even with salt, SOD is not highly charged, so we could probably still use the linear equation). For large grids or better convergence, iterations should be increased. A max change in potential of <0.001kT/e is reasonable. Look- ing at the charge file sod.crg, note that only 'full' charges have been assigned-for potentials far out in solution, dipole charges (or partial charges) make DelPhi V3.0 -46- little difference. A complete set of partial charges (including polar H's) will require building of H atoms into sod.pdb using some modelling package. DelPhi V3.0 -47- _9._2. TEST CHARGES IN A SPHERE- Comparison with analytical solutions OBJECTIVE Determine the potential and field at each of 3 points in a 10A sphere of dielectric 2, surrounded by solvent with a dielectric of 80, containing 1-1 electrolyte , concentration 0.145M (physiological). Find the total electrostatic energy, and contributions from self energy (interaction of each charge with the potential it induces by polarizing solvent, a.k.a the image or solvation terms) and the interaction between charges. Results from the finite difference method (FDPB) are compared with analytical solutions from the Tanford- Kirkwood equations (TK). Potentials are in kT/e (== 25.6mV, or 0.593 kcal/mole at 25oC). Fields are in kT/e/A, and energies are in kT. REFERENCES Gilson Sharp and Honig, J. Comp. Chem 9:327 (1987), for method and error assessments in the finite difference procedure for solving the Poisson-Boltzmann eqn. (FDPB method). Gilson, M., Sharp, K., Honig, B. Calculating electros- tatic interactions in bio-molecules: Method and error assessment. J. Computational Chem. 9, pp327-335. For a definition of the various electrostatic terms calcu- lated here. Tanford and Kirkwood, JACS 79:533 (1957), T.L. Hill, J.Phys.Chem. 60:253 (1956) for analytical series solu- tions to the Poisson-Boltzmann equation for charges in a sphere. OUTPUT FILES Files with extension *.frc contain potentials, fields and sum of 1/2.(potential*charge) from runs as follows: For prefix sph#, # refers to which of the three charges was present. No number means all three charges were present. suffix _2_80 refers to dielectric 2 for the sphere, 80 for the solvent, and ionic strength of 0.145M. Suffix _2_2 refers to uniform dielectric of 2 inside and out, and no ionic strength. Analytical solu- tions from the Tanford-Kirkwood equations for the 2/80 case are in files with suffix _anal: sph1_2_2.frc sph1_2_80.frc sph1_anal.frc sph2_2_2.frc DelPhi V3.0 -48- sph2_2_80.frc sph2_anal.frc sph3_2_2.frc sph3_2_80.frc sph3_anal.frc sph_2_2.frc sph_2_80.frc sph_anal.frc sph_anal_slf.frc NOTES Grid size in this example is 45 for speed. For greater accuracy 45-65 would be better. Sphere actually fills 66% of box-(note this is large than the nominal value of 20% requested in the parameter file since this did not include the 'radius' of the 10A 'atom' See section 6.8.2). This should be born in mind when specifying the desired fraction of box to be filled. 66% is a tradeoff between accurate boundary values (smaller % ), and detail in representing the sphere boundary (larger %). For further accuracy, focussing can be used (exam- ple 5). 800 iterations were perform, using the linear PB, since Tanford-Kirkwood solutions are for linear case only. For uniform dielectric 2 cases, with no ionic strength, 1500 iterations were used, since con- vergence is slower with higher potentials and with no salt. BACKGROUND TO ANALYSIS OF RESULTS The image or reaction field: Consider a single charge in a low dielectric (Di) spherical cavity surrounded by high dielectric (Do) solvent. Its field polarizes the material in the cavity less than the surrounding sol- vent. The mathematical statement of this is that there is a discontinuity in the normal component of the field at the sphere boundary, such that Di.Eo = Do.Eo. Alter- natively, a surface charge density is induced at the sphere boundary. This surface charge produces a field (the reaction field), which when added to the field from the charge yields the total field. The potential at the charge due to the reaction field is the self, or image potential, Ps, and the change in 'solvation energy' of the charge is q.Ps/2. Calculation of the reaction potential via the Finite Difference Procedure: When a FDPB calculation is per- formed, the potential at the site of a charge is not singular, since mapping onto a lattice essentially dis- tributes the point charge over some volume element of order of one little grid box. The potential at the charge is then Pd = Pg + Ps, where Pg is some grid potential term, whose values has no-intrinsic meaning, DelPhi V3.0 -49- but which is constant for a given charge location, grid size and boundary condition. The term Ps can be evaluated by subtracting the potential, Pu, at the charge with Do=Di (when Ps==0 by definition, and P = Pg), from the potential, Pd, with Do not = Di. Thus: Ps = Pd - Pu, Also the solvation energy change is: dGs = q.(Pd - Pu)/2. Similarly, the field at the charge is Ed = Eg + Es, and since (ignoring the error due to boundary conditions, Eg ==0), a good approx. to the field at the charge can be obtained directly from grad(Pd). This is provided in the *.frc file. Many Charges in a Sphere: When there is more one charge, the potential at charge qi is Pid = Piis + Pig + sum(Pjis + Pjic), i.ne.j where Piis is the potential at i due to qi's polariza- tion of solvent, Pig is grid potential at qi, Pjis is the potential at qi due to qj's polarization of the solvent, Pjic is the direct Coulombic potential due to qj at qi [ie Pjic = qj/(Di.rji)]. For a uniform dielectric case (Di=Do), Pijs==0 for all i,j, and: Piu = Pig + sum(Pjic), i.ne.j Similarly for the fields: Eid = Eis + sum(Ejis + Ejic), i.ne.j where E's are vectorial quantities, and Eig==0. The total solvation energy is then dGs = sum[qi.(Pid - Piu)/2.] = sumi[qi.(Piis + sumj{Pjis},i.ne.j)]/2. ANALYSIS OF RESULTS In the first run, dielectrics of 2 and 80 are used and all three charges are present. The total field at each charge can be obtained directly with all 3 charges present (sph_2_80.frc), giving: DelPhi V3.0 -50- inner,outer dielectric: 2.0 80.0 ionic strength (M): 0.145 coordinates | charge| field (kT/e/Ang.) #| x | y | z | (e) | Ex | Ey | Ez -|------|------|----|-------|--------|--------|----- 1|-3.00 | 0. | 0. | 1.00 | 6.02 | -4.36| 0. 2| 3.00 | 0. | 0. | 2.00 | -7.60 | -67.0 | 0. 3| 3.00 | 3.00 | 0. | -2.00 | -4.42 | -69.2 | 0. -|------|------|----|-------|--------|--------|----- while the TK analytical results (in sph_anal.frc) are: coordinates | charge| field (kT/e/Ang.) x | y | z | (e) | Ex | Ey | Ez -----|------|----|-------|--------|--------|----- -3.00 | 0. | 0. | 1.00 | 5.84 | -4.20| 0. 3.00 | 0. | 0. | 2.00 | -7.55 | -68.0 | 0. 3.00 | 3.00 | 0. | -2.00 | -4.48 | -70.2 | 0. -----|------|----|-------|--------|--------|----- with around 5% error or less. The total solvation energy is obtained from summing q.phi/2. for the three charges for the D=2/80 (last line in sph_2_80.frc) and subtracting the same sum for the D=2/2 case (sph_2_2.frc): inner,outer dielectric: 2.0 80.0 ionic strength (M): 0.145 total energy = 2869.8 kt inner,outer dielectric: 2.0 2.0 ionic strength (M): 0.0 total energy = 2892.0 kt giving a difference of -22.2 kT. The analytic value for D=2/80 from sph_anal.frc is -386.6543, while from Coulombic potentials for D=2/2 it is -364.11, yielding a difference of -22.5 kT, for a 1% error. Thus fields at all the charges can be calculated simul- taneously with all charges present, with one run. The total solvation energy can be calculated with all charges present, with two runs, the first of these, with Do = Di, being a reference state run. To calculate the individual pairwise interactions, three sets of runs must be done, each with only one charge present in turn (file prefixes sph1, sph2, sph3), with D=2/80 (suffix _2_80). Analytical results are from *_anal.frc files. For charge 1, FDPB calculated fields (sph1_2_80.frc) and analytical fields are: DelPhi V3.0 -51- FDPB || TK analytical Ex | Ey | Ez || Ex | Ey | Ez ------|--------|-----||-------|-------|------ 1.01 | -0.002 | 0. || 0.988| 0. | 0. -7.36 | -0.0007| 0. || -7.32 | 0. | 0. -4.93 | -2.76 | 0. || -4.95 | -2.67 | 0. ------|--------|-----||-------|-------|------ for charge 2: FDPB || TK analytical Ex | Ey | Ez || Ex | Ey | Ez ------|----------|----||-------|-------|------ 14.7 | -0.0014 | 0. || 14.6 | 0. | 0. -2.01 | -0.004 | 0. || -1.98 | 0. | 0. -1.99 | -69.0 | 0. || -1.95 | -69.9 | 0. ------|----------|----||-------|-------|------ for charge 3: FDPB || TK analytical Ex | Ey | Ez || Ex | Ey | Ez ------|--------|----||-------|--------|---- -9.76 | -4.38 | 0. || -9.80 | -4.20 | 0. 1.79 | -67.0 | 0. || 1.75 | -68.0 | 0. 2.50 | 2.48 | 0. || 2.42 | 2.42 | 0. ------|--------|----||-------|--------|---- Interaction potentials between different charges are also obtained from these runs: charge | potential (kT/e) pair | FDPB | TK -------|----------|--------- 1-3 | 16.3 | 16.4 2-1 | 42.7 | 42.5 2-3 | -129. | -126. -------|----------|--------- For individual self energy or image interactions of each charge it is necessary to do 3 pairs of runs, each pair with only one charge present, with D=2/80 (suffix _2_80.frc) and D=2/2 (suffix _2_2.frc), and subtract the latter potential at the charge itself from the former (TK results are from sph_anal_slf.frc): | Dielectric | Energy difference Charge | D=2/80 | D=2/2 | FDPB | TK -------|---------|---------|---------|--------- 1 | 694.3 | 723.8 | -29.52 | -30.4 2 | 1388.5 | 1447.6 | -59.1 | -60.9 3 | -1381.8 | -1447.6 | 65.8 | 67.4 -------|---------|---------|---------|--------- DelPhi V3.0 -52- _9._3. SOLVATION ENERGY OF THE ACETATE ION OBJECTIVE To calculate the electrostatic contribution to the sol- vation energy of the acetate ion in water. The acetate molecule is assumed to have a dielectric of 2, account- ing for electronic polarizability, and the water to have dielectric 80 (actually 78.6). REFERENCES Gilson, Sharp and Honig, J. Comp. Chem. 9:327 (1988). For general method. Rashin and Namboodri, J. Phys. Chem., 91:6000 (1987), and Gilson and Honig, Proteins, 3:32 (1988) for solva- tion energy calculations- the values for acetate are taken from the last reference. Kang, Nemethy and Scheraga, J.Phys.Chem. 91:4118 (1987). Wolfenden, Anderson and Cullis, Biochem. 20:849 (1981), for experimental solvation energies. FILES ace.crg charge file with CHARMM partial charges for acetate. ace.pdb input atom coordinates for acetate built by CHARMM. ace.prm parameter file for qdiff. ace.siz radius file, with charmm radii. ace_2_1.frc reference run with vacuum dielectric outside (D=2/1). ace_2_80.frc solvation energy run, with solvent (dielectric 80 present). (D=2/80). ace_2_80.log log file. ace_2_1.log ". aceout.pdb output atom coordinate file including charges and radii. NOTES The basic idea here is to calculate the sum of (charge*potential)/2. at all the charged atoms in DelPhi V3.0 -53- acetate. In the case with solvent present (two dielec- tric, D=2/80), this energy contains four terms: the energy of interaction of each charge with all the oth- ers, the interaction of each charge with the reaction field induced by its own field, the interaction with the reaction field of other charges, and the self energy of the grid. For the vacuum dielectric case D=2/1, only the grid self energy and the direct charge charge interaction are present. Subtracting the energy in the D=2/1 run from the energy in the D=2/80 run, we are left with only the charge/solvent interaction energy, or solvent energy. RESULTS Total energy (kT) Run from *.frc or *.log file ----------------------------- D=2/80 1673.44 D=2/1 1793.24 Difference -119.8 == -71.0 kcal/mol ----------------------------- Alternatively, a more accurate solvation energy may be obtained using QDIFFXS, in which case the values for the 'corrected reaction field energy', written in the log file, are used: Reaction field energy (kT) Run from *.log file ----------------------------- D=2/80 -60.60 D=2/1 58.11 Difference -118.7 == -70.3 kcal/mol ----------------------------- Note that in this example, the first method is almost as accurate as the second, since with a small molecule the scale is > 2 grids/angstrom but in general this will not be the case. Experimental values range from -79 to -87 kcal/mole. Previously calculated value from Gilson and Honig was -70 kcal/mole. DelPhi V3.0 -54- _9._4. PK SHIFTS IN SUBTILISIN OBJECTIVE Calculate the expected change in pK of the active site histidine in subtilisin due to site directed mutagenesis of asp99->ser and glu156->ser at two dif- ferent ionic strengths. REFERENCES Gilson and Honig, Nature 330:84 (1987) Sternberg et al, Fersht, Nature 330:86 (1987) for experimental data and similar theoretical calculations (but not including ionic strength). FILES sbt.crg atom charges sbt.pdb atom coordinates from brookhaven data bank sbt.siz atom radii sbt1.prm run at 0.1M ionic strength sbt1.log " sbt1_99_156.frc " sbt2.prm run at 0.01M ionic strength sbt2.log " sbt2_99_156.frc " sbt_99_156.pdb atomic coordinates where potentials outputted NOTES The change in pK of a group due to to the electrostatic potential of another group is the change in work neces- sary to bring a proton from solution to that group, ie dpK = phi/2.303, where phi is the potential at one group due to the other, in units of kT/e. Since reciprocity holds, the charge can be put on either group, and the potential calculated at the other group. Thus for this case, it is more efficient to put the charge on the histidine 64, and get the potential at all other charged groups simultaneously. Note that where site-directed mutagenesis being performed, the change in bulk of the side chain could also have an effect on the potential at the titrating group, and so runs should be done with both side chains present, and the difference in potential taken. Also this procedure only yields CHANGES in pK, since the intrinsic pK of the group is not known (unless measured or calculated independently). The intrinsic pK being the pK of the group with all the other titratable groups neutral, ie the difference in intrinsic pK of an isolated side DelPhi V3.0 -55- chain and one in a folded protein depends on the difference in solvation energy of the charged specie when incorporated into the protein. The brookhaven data bank file for subtilisin contains oxygen coordinates for several water oxygens. In this example these waters are included as part of the molecule (ie with dielectric 2). What the dielectric behavior of crystallographically bound waters is, and whether they shold be included as part of the molecule is an open question. In this example the centre of the protein is not set at the grid centre, but is offset 0,5,0.5,0.5, angstroms. This is merely to bring attention to the fact that when doing quantitative work, it is advisable to check that the results are not sensitive to the precise position of the molecule with respect to the grid. This applies particularly when conclusions are based on calculations from a few potential values at selected coordinates, such as for the pK changes in this example, and with the solvation eenrgies in example 3. If sensitivity to grid position is found results from runs with different positions of the molecule upon the grid should be aver- aged (eg see discussion of rotational averaging in Gilson,Sharp and Honig, (1988) J.Comp Chem, 9:327). RESULTS The data is taken from sbt1_99_156.frc (0.1M) and sbt2_99_156.frc (0.01M) and average potentials at the asp 99 and glu 156 carboxyl oxygens computed. This gives the histidine pK changes when these groups are mutated to serines: | | pK Change ------|--------|--------------------------------- |Ionic |Experimental |Gilson |calculated Case |Strength|(Fersht et al)|& Honig|(example 4) ------|--------|--------------|-------|---------- Asp99 |(0.1M) |0.23-0.29 |0.18 | 0.17 |(0.01M) |0.42 |0.29 | 0.25 ------|--------|--------------|-------|---------- Glu156|(0.1M) |0.25 |0.20 | .30 |(0.01M) |0.42 |0.37 | 0.38 ------|--------|--------------|-------|---------- Differences from calculations of Gilson & Honig arise since focussing boundary conditions and rotational averaging were not used here. DelPhi V3.0 -56- _9._5. ELECTROSTATIC POTENTIALS AND ION CONCENTRATIONS AROUND B-DNA OBJECTIVE To calculate the electrostatic potential and mobile ion charge density distributions around DNA, using the non-linear PB equation, periodic boundary conditions along the axis of DNA, and focussing. REFERENCES Gilson, Sharp and Honig, J. Comp. Chem. 9:327 (1988), for general method. Jayaram, Sharp and Honig, Biopolymers, 28:975 (1988), for non-linear PB, periodic boundary conditions and application to DNA.. FILES contour.cnc contour values for concentration profile. contour.kt contour values for potential pro- file. dna.pdb input atom coordinates. dna.siz atom radii. dna_out.log run with coarse grid to get boun- dary conditions. dna_out.phi ". dna_out.prm ". dna_in.log run with fine grid to get poten- tials. dna_in.phi ". dna_in.prm ". dna_in_conc.log run with fine grid to get charge concentrations. dna_in_conc.phi ". dna_in_conc.prm ". nuca_full.crg full atom charges (-0.5 on each PO4 oxygen only). DelPhi V3.0 -57- NOTES The atom coordinate file contains 3 turns of poly A, poly T DNA in the B form. The first run ( file suffix _out) is done with a coarse grid (of 25), including all 3 turns, the helix axis aligned along the z- direction, with 100% of the box filled in this direction. This puts the x,y edges of the box about 40A from the dna, where the coulombic boundary condition should be fairly good, even for highly charged DNA. Boundary conditions on the Z ends are periodic, simulating an infinite piece of DNA. These potentials are only used to produce more accurate boundary conditions for a finer run. For the inner runs (file suffix _in), a finer grid is used (45), and one turn included in the box only (per- cent fill 300%). Boundary conditions are taken from the coarse run ('focussing', option #3). Periodic boun- dary conditions are still applied to Z edge, although this is not strictly necessary since the focussing boundary conditions also impose Z periodicity. Note that linear PB iterations are then followed by non- linear iterations-this provides good convergence for less iterations than just non-linear iterations alone. In the first fine run the potentials are outputted, on the second, the concentration flag is set to true (t), and net ion concentrations (ie. sum of positive and negative charges) are outputted (file suffix _conc). In the latter case, the concentrations are 0 within the molecule (the region occupied by the van de Waals spheres plus the ion exclusion radius, 2A). When con- tour output of this concentration map is produced, closely spaced contour lines appear at the ion- exclusion surface due to the abrupt drop in concentra- tion to 0. Note that the ion exclusion surface lies outside the the solvent acessible surface depicted on the plots. PLOTS Plot files in dna_pot.plt and dna_conc.plt for the HP75 plotter show potential and charge density contours plotted on top of the solvent accessible surface of DNA. Plots are produced by CONTOUR2D. The plot range is set at 170., the probe radius at 2A. For the poten- tials, contours are at -0.5,-1.,-2.,-3.,-4 kT/e. For the charge densities, contours are at -0.5,-1,-2,-3 moles/litre. DelPhi V3.0 -58- _9._6. INTRINSIC PK CALCULATION IN TRYPSIN OBJECTIVE Calculate the the intrinsic pK of the active site his- tidine of the catalytic triad in rat trypsin, his 40 (res # 57 in bovine numbering). Incorporates desolva- tion effects, protein dipoles and charges. REFERENCES Soman, Yang, Honig and Fletterick, Biochem. 28:9918 (1989) for calculations, Sprang et al Science 237:905 (1987) for structure. FILES tryp.prm parameters tryp.out summary of key numbers def.siz radii for whole protein his40.siz radii for his40 in reference state (alone in solution) his40.crg charges for his+40 his40_neut.crg charges for unprotonated his40 tryp_all.crg dipoles and charges for protein tryp_full.crg formal charges for protein rat_ptb.pdb protein coordinates rat_ptb_his40.pdb his40 coords for input charges rat_ptb_his40n.pdb his40 coords for output poten- tials tryp0_his40.frc his+40 in reference state tryp0_his40n.frc neutral his40 in reference state tryp_his40.frc his+40 in chargeless, dipoless protein tryp_his40n.frc neutral his40 in chargeless, dipoless protein tryp_all.frc potential at his+40 from all pro- tein dipoles, charges tryp_alln.frc potential at neut. his40 from all protein dipoles, charges tryp_full.frc potential at his+40 from all pro- tein charges tryp_fulln.frc potential at neut. his40 from all protein charges NOTES The pK of a group is given by dGdeprotonate/1.366 (dG in kcal, or dG/2303 in kT). With respect to the refer- ence state pKa, for the isolated histidine in water (pK6.8), the change in dG on putting the histidine into the folded protein is ddGsolv + dGdip + dGcharge, where ddGsolv is the difference in solvation energy of the charged v. uncharged form of the histidine in position DelPhi V3.0 -59- in the protein, wrt. the reference, isolated state. dGdip is the difference in energies of the charged and neutral histidines arising from the protein dipole potentials. Similarly, dGcharge comes from the poten- tials of the other formal charged groups, including of course the other titratable groups. Note that dGcharge depends on the protonation state of all the other groups, and thus is pH dependent. To calculate ddG, the solvation energy of the histidine charges is calcu- lated for the isolated histidine G1 (tryp0_his40), ie assigning radii and charges only to his40; for the his- tidine in the protein G2 (tryp_his40), ie. assigning charges to his 40 only, and radii to both his and the protein. The same calculation is done on the neutral (unprotonated) form of his40 to give G3 (tryp0_his40n) and G4 (tryp_his40n). Then ddGsolv = (G2-G1) - (G4- G3). Energies G2-G1 and G4-G3 are examples of self energy differences, and are calculated from the total energy, sum(q.phi/2), where the sum is over all the histidine atomic charges, and the potentials phi are those produced by the very same charges. Note that the scale and position of the charges must be exactly the same for each set of runs for these differ- ences to be meaningful- remember that the 'artifactual' grid energy must be subtracted out. One way to hold the scale constant is to place a pair of dummy atoms of zero radius and charge in all the pdb files to fix the min and max extents which control the scaling and posi- tioning of the molecule (see the first two entries in the pdb files rat_ptb.pdb or rat_ptb_his40.pdb). When QDIFFXS becomes available, instead of using sum(q.phi/2), the program can calculate the reaction field energy as sum(q.phix/2) over the histidine charges, where phix is now the reaction potential, arising from the induced surface charge at the dielec- tric boundary of the molecule. In this case the scales need not be the same and one can maximize the percent box fill to achieve the best accuracy for each molecule dGcharge is obtained (assuming usual protonation states for all other groups at pH 7 (glu, asp =-1) (arg,lys = +1) (his=0.5)), by assigning charges to all ionizable groups except his 40, radii to all protein atoms, and evaluating the energy of the his+ charges G5 (tryp_all) v. the neutral his charges G6 (tryp_alln) in the resulting potential. dGcharge = G5-G6. Since G5,G6 are charge-charge interaction energies they are given by sum(Qhis.Phi) where Phi is the potential arising from all the other protein charges. Similarly, dGdip is obtained by an identical calculation where only the DelPhi V3.0 -60- protein dipoles are charged. In practice, since charge sets for neutral forms of ionizable groups may not be available, it is easier to calulate the sum of dGcharge and dGdip by assigning all charges to the protein atoms, and to get dGdip if required, by subtracting dGcharge. Thus in these runs (tryp_all, tyrp_full), charges are assigned to the protein for the input, using rat_ptb.pdb, and potentials are read out at the histidine using rat_ptb_his40n.pdb. RESULTS For ddGsolv, 1/2sum q.phi from either the log file (as total energy) or from the bottom of the frc file gives: file energy (kT) ------------------------------- tryp0_his40 -55.326 tryp0_his40n 37.420 tryp_his40 74.870 tryp_his40n -39.315 ------------------------------- ddGsolv 17.64 kT = 10.58 kcal/mol With QDIFFXS, the reaction field energies from the log files give: file energy (kT) ------------------------------- tryp0_his40 31.39 tryp0_his40n -2.68 tryp_his40 -12.53 tryp_his40n 0.56 ------------------------------- ddGsolv 16.83 kT = 9.98 kcal/mol For dGcharge and dGcharge+dGdip, 1/2sum q.phi from the bottom of the frc file (x2) gives: file energy (kT) ------------------------------- tryp_full