UNC Computational Structural Biophysics Group

DOWSER manual

PROGRAM WAS LAST UPDATED in February 2003 to correct problems with the dowserx script (which places water in crevices).

by: Jan Hermans, Xinfu Xia, Li Zhang, Dave Cavanaugh
Department of Biochemistry and Biophysics
School of Medicine
University of North Carolina
Chapel Hill, NC 27599-7260

For additional information see "About the Dowser program"

Nota bene: This manual describes how to use the new Dowser developed in February 1998, with modifications through March 1999.

Before running Dowser:

Before running Dowser, execute "source /usr/local/initial/dowserinit". This initializes the environment variable "DOWSER" and places the Dowser executables in your "path". (The exact location of the file dowserinit depends on the installation.)

Most common USAGE:

dowser inputfile.pdb [-probe RADIUS] [-hetero] [-atomtypes FILE_t] [-atomparms FILE_p] [-separation SEPARATION] [-onlyxtalwater] [-noxtalwater]

This executes the dowser script. The input is the specified PDB-formatted file of protein coordinates; the output is a file of low-energy water molecules that could be placed in cavities inside the protein (also in PDB format; filename "dowserwat.pdb").

-probe RADIUS: sets the radius of the probe (representing a solvent molecule) for the MS program (default is 0.2 Å).

-hetero: hetero atoms (HETATM records in the PDB file) will be included, with the exception of crystallographic water molecules identified as having atom name "O" and residue name "HOH". (The default is to use only ATOM records.)

-atomtypes: the named file will be appended to the file provided with dowser (DATA/atomdict.db).

-atomparms: the named file will be appended to the file provided with dowser (DATA/atomparms.db).

-separation SEPARATION: the surface points obtained with the MS program will be pruned to a set of points at least SEPARATION Å apart.

-onlyxtalwater: the only positions that will be tested for optimal placement of a water molecule are the coordinates of the internal water molecules given in the pdb file.

-noxtalwater: the coordinates of the internal water molecules given in the pdb file will not be considered as additional test points for placement of water molecules.
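
The options above can be assembled programmatically when Dowser is run on many structures. The following is a minimal sketch, not part of the Dowser distribution: the function name build_dowser_command and its keyword names are hypothetical, and only the flags listed above are used. The resulting command must still be run in a shell in which dowserinit has been sourced.

```python
def build_dowser_command(pdb_file, probe=None, hetero=False, atomtypes=None,
                         atomparms=None, separation=None, xtalwater=None):
    """Return the argument list for one dowser run.

    xtalwater may be None (default behaviour), "only" (-onlyxtalwater)
    or "no" (-noxtalwater). This helper is hypothetical; it only strings
    together the flags documented in the manual.
    """
    cmd = ["dowser", pdb_file]
    if probe is not None:
        cmd += ["-probe", str(probe)]
    if hetero:
        cmd.append("-hetero")
    if atomtypes:
        cmd += ["-atomtypes", atomtypes]
    if atomparms:
        cmd += ["-atomparms", atomparms]
    if separation is not None:
        cmd += ["-separation", str(separation)]
    if xtalwater == "only":
        cmd.append("-onlyxtalwater")
    elif xtalwater == "no":
        cmd.append("-noxtalwater")
    elif xtalwater is not None:
        raise ValueError("xtalwater must be None, 'only' or 'no'")
    return cmd


# Example: include HETATM records and a supplementary dictionary file.
command = build_dowser_command("phenolcomplex.pdb", hetero=True,
                               atomtypes="phenol.db")
print(" ".join(command))
```

The list form can be handed to subprocess.run unchanged, which avoids quoting problems with unusual file names.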

The dowser script will perform the following steps:

  1. CLEAN UP FILES: remove files with same names as those that will be created by dowser.
  2. Process the crystallographic water molecules provided in the pdb file. First, hydrogen atoms are added. Then, each molecule is rotated to get the lowest energy, first each alone against the protein as the environment, then all together against the protein plus each other as the environment. The result is saved (file xtal_hoh.pdb) and the coordinates are compared with those of the water molecules found independently.
  3. REFORMAT THE INPUT PDB FILE (execute: reformatPDB; with input inputfile.pdb and with output reform.pdb)
    The output pdb file will contain the protein atoms and the polar hydrogens. The positions of missing atoms are computed from the coordinates of the other atoms and geometric information in a file "atomdict.db". For each atom, the output file will also contain its atomic charge and Lennard-Jones parameters. (Charges are found in the file "atomdict.db" and LJ parameters in a file called "atomparms.db".)
  4. COMPUTE THE MOLECULAR SURFACE with Connolly's MS program, or with qms, a fast alternative that produces only the places at which a solvent probe touches three protein atoms simultaneously (output file is ms.pro)
    • CONVERT PDB to MS format (execute: pdb2ms with input reform.pdb and output ms.dow)
    • RUN THE MS program (execute: xms (or qms) with input ms.dow and output xms.dow)
      The MS program also requires files of run parameters and atomic radii ("ms.param" and "ms.rad").
    • CONVERT OUTPUT FROM MS to PDB format (execute: ms2pdb with input xms.dow and output pdbms.dow)
  5. SORT SURFACE INTO BURIED AND EXPOSED (execute: drain with input reform.pdb and pdbms.dow, output surface.wat and intsurf.pdb)
  6. COMPUTE ENERGIES FOR BEST WATER PLACEMENT STARTING FROM EACH INPUT SURFACE POSITION. First, hydrogen atoms are added. Then, each molecule is rotated to get the lowest energy, first each alone against the protein as the environment, then all together against the protein plus each other as the environment. (Execute: placeWat with input reform.pdb and intsurf.pdb and output setwat.pro. Normally, the crystal water positions are first added to the set of surface points.)
  7. SORT WATER MOLECULES AND RETAIN LOW-ENERGY ONES, ELIMINATING HIGHER-ENERGY SITES WHEN TOO CLOSE (execute: chooser with input setwat.pro and output dowserwat.pdb)
  8. The crystal and dowser water positions are compared in the dowser_report (with program CompareWat). An example of dowser_report for BPTI is available.
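
The selection step (sort by energy, retain low-energy sites, drop higher-energy sites that lie too close to one already kept) can be illustrated with a simple greedy pass. This is a sketch of the idea only, not the code of the chooser program: the function name choose_waters, the energy cutoff and the minimum distance are assumptions made for illustration, since the manual does not state the criteria chooser actually applies.

```python
import math


def choose_waters(sites, max_energy, min_distance):
    """Greedy selection of water sites.

    sites: list of (energy, (x, y, z)) with energy in kcal and
    coordinates in Angstrom. max_energy and min_distance are
    hypothetical thresholds, not values taken from Dowser.
    """
    kept = []
    # Visit sites from lowest to highest energy.
    for energy, xyz in sorted(sites, key=lambda s: s[0]):
        if energy > max_energy:
            break  # all remaining sites are higher in energy still
        # Keep the site only if it is not too close to a better one.
        if all(math.dist(xyz, other) >= min_distance for _, other in kept):
            kept.append((energy, xyz))
    return kept


# Made-up test data: two sites 0.8 A apart compete for the same position.
candidates = [
    (-12.0, (0.0, 0.0, 0.0)),
    (-9.5, (0.8, 0.0, 0.0)),
    (-11.0, (5.0, 0.0, 0.0)),
    (-3.0, (9.0, 0.0, 0.0)),
]
selected = choose_waters(candidates, max_energy=-8.0, min_distance=2.3)
print(selected)
```

Because the sites are visited in order of increasing energy, the site that survives a clash is always the lower-energy one.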

Alternate USAGE (dowserx did not work correctly in versions released before February 2003):

dowserx inputfile.pdb [-probe RADIUS] [-hetero] [-atomtypes FILE_t] [-atomparms FILE_p]

When Dowser is used in this alternate manner, surface points located in reasonably deep crevices, which the normal use of Dowser would treat as exposed surface points, are retained. Hence, dowserx can find low-energy water molecules in crevices. The script performs the same steps as dowser, except that step 5 of the list above is replaced by steps 5a to 5c below; the output from step 5c becomes the input for step 6.

  • step 5a. A second molecular surface is calculated with a large probe radius (5 Å instead of the default). (execute: xms with input ms.dow and output bigms.dow, with parameters from ms_largeR.param)
  • step 5b. CONVERT OUTPUT FROM MS to PDB format (execute: ms2pdb with input bigms.dow and output pdbbigms.dow)
  • step 5c. Points in the first molecular surface lying within 5 Å of any point on the second surface are eliminated (execute: scrape with input xms.dow and pdbbigms.dow, and output intsurf.pdb)
  • step 6. COMPUTE ENERGIES FOR BEST WATER PLACEMENT STARTING FROM EACH INPUT SURFACE POSITION (execute: placeWat with input reform.pdb and intsurf.pdb and output setwat.pro)
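
The elimination performed by scrape amounts to a distance filter between two point sets. The sketch below illustrates that filter under simple assumptions: points are plain (x, y, z) tuples in Å, the function name scrape_points is hypothetical, and the search is a brute-force loop rather than whatever the scrape program actually uses.

```python
import math


def scrape_points(small_probe_points, large_probe_points, cutoff=5.0):
    """Keep the small-probe surface points that lie farther than `cutoff`
    from every point of the large-probe surface."""
    return [
        p for p in small_probe_points
        if all(math.dist(p, q) > cutoff for q in large_probe_points)
    ]


# Made-up coordinates: one point near the outer surface, one deep inside.
first_surface = [(0.0, 0.0, 0.0), (0.0, 0.0, 9.0)]
second_surface = [(0.0, 0.0, -3.0), (4.0, 0.0, -3.0)]
internal = scrape_points(first_surface, second_surface)
print(internal)
```

A large probe cannot enter narrow crevices, so points that stay far from its surface are the ones lying in crevices or cavities.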

dowser-repeat inputfile.pdb [-probe RADIUS] [-hetero] [-atomtypes FILE_t] [-atomparms FILE_p]

The water molecules found by dowser in a first pass are added to the protein atoms and constitute part of the permanent environment for placing additional water molecules.

Viewing the results

The following files created by dowser are in pdb format and can be viewed with programs such as RasMol and VMD:

  • reform.pdb = protein with polar hydrogens
  • intsurf.pdb = internal surface points
  • xtalwat.pdb = internal water sites in the set of crystallographic waters (if any)
  • dowserwat.pdb = internal water sites found by dowser

When dowser is run a second time in the same directory, the first three of these files are removed, while the fourth file is renamed dowserwat.pdb_1.

Reformatting of the PDB file: Description of data files and method

atomdict.db: describes the atom types for each residue (located in the DATA subdirectory).

Each residue type is introduced with a RESIDUE record containing the name of the residue and an optional TERM specification followed by the names of the preferred modifications to be applied at chain beginnings and ends. This is followed by the atoms that will be represented in the reformatted pdb file.

Termini: the atoms describing a residue at a chain end are constructed by combining those for the residue itself and the selected terminal residue; when an atom occurs in each list, the description given for the terminus is used.

ATOM descriptors: for each atom the file gives the atom's name, the name of the atom to which it is bonded earlier in the list (= backchain), the name of the atom farthest down the list to which it is bonded (= forward chain), bondlength (in Å), bondangle and dihedral angle (in degrees), atomic partial charge and atom type (type provides a cross-reference to the file "atomparms.db").

Bondlength, bond angle and dihedral angle are defined for (successive) backchains. (E.g., backchain of H is N, backchain of N is C of preceding residue, backchain of C is CA. The angle value of 123° given for atom H is for C-N-H, and the dihedral of 0° is for CA-C-N-H, corresponding to a planar peptide group.)

Wherever possible, the coordinates of a missing atom are computed using the atom, its backchain and double backchain, and the forward chain of the backchain. This is best explained with an example: for H the backchain is N, the double backchain is C of the preceding residue, and the forward chain of N is CA. The atom will be placed so that the angle between the planes formed by C-N-CA and C-N-H is equal to the difference of the ideal dihedrals given in the table for H and CA (= 0° - 180° = -180°).

ATOM ALA  N    C    CA    1.320  114.0  180.0  -0.280 N
ATOM ALA  H    N    NOT   1.000  123.0    0.0   0.280 H
ATOM ALA  CA   N    C     1.470  123.0  180.0   0.000 CH1
ATOM ALA  CB   CA   NOT   1.530  110.0   60.0   0.000 CH3
ATOM ALA  C    CA   N     1.530  110.0  180.0   0.380 CR
ATOM ALA  O    C    NOT   1.240  121.0    0.0  -0.380 O
ATOM NH3  H    NOT  N     0.000    0.0    0.0   0.248 H
ATOM NH3  N    H    CA    1.000    0.0    0.0   0.129 N
ATOM NH3  H2   N    NOT   1.000  109.5  -60.0   0.248 H
ATOM NH3  H3   N    NOT   1.000  109.5   60.0   0.248 H
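
The placement of a missing atom from a bond length, a bond angle and a dihedral can be sketched with the usual internal-to-Cartesian construction. The code below is an illustration under stated assumptions, not the reformatPDB source: the function names are hypothetical, the peptide coordinates are made up (planar, with the bond lengths and angles of the ALA records above), and the amide H is placed using the difference of the tabulated dihedrals for H and CA as described in the text.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    norm = math.sqrt(sum(a * a for a in u))
    return tuple(a / norm for a in u)


def bond_angle(a, b, c):
    """Angle a-b-c in degrees."""
    u, v = unit(sub(a, b)), unit(sub(c, b))
    return math.degrees(math.acos(sum(x * y for x, y in zip(u, v))))


def place_atom(ref, back2, back, bond, angle_deg, dihedral_deg):
    """Place a new atom bonded to `back`.

    bond is the distance back-new, angle_deg the angle back2-back-new,
    dihedral_deg the angle between the planes (ref, back2, back) and
    (back2, back, new).
    """
    theta = math.radians(angle_deg)
    phi = math.radians(dihedral_deg)
    bc = unit(sub(back, back2))            # axis back2 -> back
    n = unit(cross(sub(back2, ref), bc))   # normal of the reference plane
    m = cross(n, bc)                       # completes the local frame
    local = (-bond * math.cos(theta),
             bond * math.sin(theta) * math.cos(phi),
             bond * math.sin(theta) * math.sin(phi))
    return tuple(back[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
                 for i in range(3))


# Hypothetical planar peptide fragment (coordinates in Angstrom).
C_prev = (0.0, 0.0, 0.0)
N = (1.32, 0.0, 0.0)
CA = (N[0] - 1.47 * math.cos(math.radians(123.0)),
      1.47 * math.sin(math.radians(123.0)), 0.0)

# H: bond 1.000, angle C-N-H 123.0, dihedral relative to CA = 0 - 180.
H = place_atom(CA, C_prev, N, bond=1.000, angle_deg=123.0,
               dihedral_deg=0.0 - 180.0)
print(H)
```

With these inputs H ends up in the peptide plane, on the side of the C-N bond opposite to CA.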

atomparms.db: Describes the Lennard-Jones parameters for each atom type. These values are the Gromos-Cedar parameters from Biopolymers 23: 1513-1518, 1984. The LJ parameters for a pair are obtained by multiplying the values for each type in the table. E.g., the LJ parameters for type N interacting with type CH1 are obtained as: -49.36 * 111.80 * 10^-6 kJ nm^6/mol and 1300 * 8470.4 * 10^-12 kJ nm^12/mol.

Note that the unit of energy in Dowser output is the kcal, and the unit of distance in PDB files is the Å.

REMARK atomtype LJ-a LJ-b
TYPE   N      49.36   1300.0
TYPE   H       0.00      0.0
TYPE   CH1   111.80   8470.4
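
The pair rule can be checked with a few lines of arithmetic. The sketch below uses only the three table rows shown above; the function names are hypothetical, and the energy expression assumes the standard 12-6 form E(r) = C12/r^12 - C6/r^6, with the conversions 1 kcal = 4.184 kJ and 1 nm = 10 Å.

```python
KJ_PER_KCAL = 4.184

# (LJ-a, LJ-b) per atom type, copied from the table rows shown above.
LJ_TABLE = {
    "N": (49.36, 1300.0),
    "H": (0.00, 0.0),
    "CH1": (111.80, 8470.4),
}


def lj_pair(type_i, type_j, table=LJ_TABLE):
    """Return (C6, C12) for a pair, in kJ nm^6/mol and kJ nm^12/mol."""
    a_i, b_i = table[type_i]
    a_j, b_j = table[type_j]
    return a_i * a_j * 1e-6, b_i * b_j * 1e-12


def lj_energy_kcal(c6, c12, r_angstrom):
    """12-6 energy in kcal/mol at a distance given in Angstrom
    (assumes the standard form C12/r^12 - C6/r^6)."""
    r_nm = r_angstrom / 10.0
    return (c12 / r_nm ** 12 - c6 / r_nm ** 6) / KJ_PER_KCAL


c6, c12 = lj_pair("N", "CH1")
print(c6, c12)
print(lj_energy_kcal(c6, c12, 4.0))
```

Type H has both parameters equal to zero, so any pair involving a polar hydrogen contributes no Lennard-Jones energy under this rule.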

Description of data files needed with Connolly's MS program:

ms.param: parameters for the MS program (located in the DATA subdirectory). One record specifies: surface point density, probe radius, bury flag and output format; default input supplied with Dowser is: 3.0, 0.2, 0, 2. (If the probe radius is specified, the dowser script creates a new file ms.param in the working directory and removes it again afterwards.)

ms.rad: atomic radii for the MS program
default supplied with Dowser in the DATA subdirectory is:

    1   2.89000 # carbon without hydrogens
    2   3.00000 # carbon with hydrogens
    3   2.40000 # nitrogen
    4   2.20000 # oxygen
    5   2.70000 # sulfur
    6   2.66666 # phosphorus
    7   1.49000 # type "z", i.e., Zn

The indices are linked to atom type in the pdb2ms step.

Note. The command "xms help" gives information on how to use the MS program outside dowser.

How to use Dowser with non-protein components.

The current version of the file atomdict.db covers all amino acid residues. (Dowser chooses a disulfide-bridged type of cysteine residue for pairs of residues selected on the basis of a distance criterion.)

Water molecules can be included in the calculation when all three atoms are given. These must then have names "OW", "H1" and "H2", and the residue name must be "HOH". (Use this to fill large cavities by iteratively applying dowser, each time appending the newly found water molecules to the pdb file.)

A similar description of nucleic acid residues is in preparation.

Dowser includes hetero atoms (HETATM records) in the input if the -hetero argument was specified. In order to include a molecule containing hetero atoms in the dowser calculation, a supplementary dictionary file (similar to atomdict.db) must be prepared in which the molecule is described as a "residue". Each atom in this file must have a charge and an atom type. If the molecule has polar hydrogen atoms, then these must be represented, and values of the appropriate bondlengths, bondangles and dihedrals and the necessary backchains and forward chains must be provided so that the coordinates of these polar hydrogens can be computed.

Example: molecule of phenol

If all coordinates of the phenol molecule are in the PDB file with the exception of the hydrogen, then only a few of the items are needed. Atom types and partial charges used here are the same as for tyrosine in the protein dictionary. The H-atom will (arbitrarily) lie trans to C2. (The residue name "PHL" and the atom names must be the same as those used in the pdb file.)

ATOM PHL  C6   NOT  NOT   0.000  000.0    0.0   0.000 CHR
ATOM PHL  C5   NOT  NOT   0.000  0.000    0.0   0.000 CHR
ATOM PHL  C3   NOT  NOT   0.000  0.000    0.0   0.000 CHR
ATOM PHL  C4   NOT  NOT   0.000  0.000    0.0   0.000 CHR
ATOM PHL  C2   NOT  C6    0.000  0.000    0.0   0.000 CHR
ATOM PHL  C1   C2   O1    0.000  0.000    0.0   0.150 CR
ATOM PHL  O1   C1   H     0.000  0.000    0.0  -0.548 OA
ATOM PHL  H    O1   NOT   1.000  105.0  180.0   0.398 H

If this information is in a file called "phenol.db", then use the following command to run the dowser calculation:

dowser phenolcomplex.pdb -hetero -atomtypes phenol.db
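
Before running dowser with a supplementary dictionary, it can help to check the file for simple mistakes. The sketch below is not part of Dowser: the function names are hypothetical, and it assumes the ATOM records can be split on whitespace into ten fields (the real programs may read fixed columns). It verifies that every backchain and forward-chain name refers to an atom of the residue (or is NOT) and reports the net charge.

```python
# Phenol records as in the example above, with the forward chain of C1
# written as O1 so that it matches the atom name.
PHENOL_DB = """\
ATOM PHL  C6   NOT  NOT   0.000  000.0    0.0   0.000 CHR
ATOM PHL  C5   NOT  NOT   0.000  0.000    0.0   0.000 CHR
ATOM PHL  C3   NOT  NOT   0.000  0.000    0.0   0.000 CHR
ATOM PHL  C4   NOT  NOT   0.000  0.000    0.0   0.000 CHR
ATOM PHL  C2   NOT  C6    0.000  0.000    0.0   0.000 CHR
ATOM PHL  C1   C2   O1    0.000  0.000    0.0   0.150 CR
ATOM PHL  O1   C1   H     0.000  0.000    0.0  -0.548 OA
ATOM PHL  H    O1   NOT   1.000  105.0  180.0   0.398 H
"""


def parse_atom_records(text):
    """Parse ATOM descriptor lines into dictionaries (whitespace split)."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10 or fields[0] != "ATOM":
            continue  # skip anything that is not a complete ATOM record
        atoms.append({
            "residue": fields[1], "name": fields[2],
            "back": fields[3], "forward": fields[4],
            "bond": float(fields[5]), "angle": float(fields[6]),
            "dihedral": float(fields[7]), "charge": float(fields[8]),
            "type": fields[9],
        })
    return atoms


def unknown_references(atoms):
    """Chain names that are neither NOT nor an atom of the residue."""
    names = {atom["name"] for atom in atoms}
    return [(atom["name"], ref) for atom in atoms
            for ref in (atom["back"], atom["forward"])
            if ref != "NOT" and ref not in names]


phenol = parse_atom_records(PHENOL_DB)
net_charge = sum(atom["charge"] for atom in phenol)
print(len(phenol), unknown_references(phenol), round(net_charge, 6))
```

A non-empty list from unknown_references points to a mistyped atom name. The same-residue check is only suitable for a stand-alone molecule such as phenol, since protein residues legitimately refer to atoms of the neighbouring residue.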