Skip to main content

io (c42b1)

Input-Output Commands

The commands described here are used for reading and writing
data structures used in the main part of CHARMM. Some of data structures
used in the analysis facility may also be read and written.


* Read | Reading data from external sources
* Write | Writing data structures in machine readable form
* Print | Writing data structures in a human readable form on unit 6
* Titles | Specifying and manipulating titles
* IOFORM | Specify PSF file format

Top
READ - Reads Data from External Sources

This command reads data into the data structures from external
sources. The external sources can be either card image files or binary
files. The fortran unit number from which the information is read, is
specified with the unit-spec.

The precise format of all these files is described only in the
source code as that serves as the only definitive, accurate, and up to
date description of these formats. The description of the data
structures provides pointers to the subroutines which should be
consulted, » usage Data Structures.


* Read Syntax | Syntax of the READ command
* Segments | Reading segments'sequences and coordinates
* Sequence | Reading a segment's sequence
* Coordinate | Reading coordinates
* Universal | Reading coordinates from nonstandard formats
* Param Files | The formats used in parameter files
* RTF File format | The format used in topology files
* Other files | Reading all other file types

Top
Syntax of READ Command

[SYNTAX READ]

READ { RTF { CARD [APPEnd] [PRINt] } } [ UNIT integer
| NAME filename ]
{ { [ FILE ] } }

{ PARAmeter { CARD [PRINt] [APPEnd] [FLEX] } }
{ { [FILE] [NBON ] [MMFF] } }

{ IC { [ CARD ] } [ APPEnd ] [ SAVEd ] }
{ { FILE } }

{ SEQUence { [ CARD ] } }
{ { COOR [ RESId ] [SEGI <segid>] } }
{ { PDB [ RESId ] [CHAIn <char>] } }
[SEGI <segid>] } }
[NCHAin <int] } }
[repeat(SKIP <resn>)] } }
[repeat(ALIAs <resold> <resnew>)] } }
[SEQRes] } }
[FIRStresid <int>] } }
[HETAtm] } }
[NOATom] } }
{ { TIPS integer } } ! TIP3P water model
{ { ST2 integer } } ! ST2 water model
{ { WATEr integer } } ! OH2 residue model
{ { DUM integer } } ! Dummy atoms
{ { resname integer } } ! Any RESI in the RTF

{ HBONd { [ FILE ] } }
{ { CARD } }

{ PSF { [ CARD ] [ XPLOR/OLDPSF ] } [APPEend] }

{ CONStraint { [ CARD ] } }

{ NBONd [ FILE ] }

{ TABLe [ FILE ] }

{ BTABle [ FILE ] }

{ TRAJectory [ COMP ] }

{ IMAGes [ CARD ] [ INIT ] }

{ XRAY }

{ UNIVersal-coordinate-format }

{ COORdinate coor-spec [ COMP ] }

{ SEGId segment { PDB } [BUILd [SETUp]] }
{ { CARD } }
{ { FREE } }
{ NAMD FILE "filename" ** }
** no other options are available
{NM }

coor-spec ::= { FILE [IFILE int [CONTinue]] } coor-option
{ CARD [OFFS int] [ RESI ] }
{ PDB [OFFS int] [MODEL int] [OFFI] }
{ UNIVersal [OFFS int] [ RESI ] }
{ IGNOre }
{ DYNR CURR|DELT|VEL }
{ TARG }
{ TAR2 }

coor-option ::= [APPEnd] [INITial] [FREEfield] atom-selection

Syntactic ordering: The second field must be specified as shown.
The file to be read can be specified either through the UNIT number (the
same number as in a preceding OPEN statement) or through the NAME keyword.

Options exist to read/write the PSF in two alternative formats, the "standard"
writing of PSFs use the XPLOR format (which replaces the atom type code for
for each atom (IAC(I)) with the character string for the atom type from
the current RTF/PARAMETER pair (ATC(IAC(I)). This allows atom type code values
to be generated at time of reading RTF or PARAMETERs, via use of a -1 in the
"code" column of the MASS statement.

Note: Caution should be use with some of the existing pert scripts as they
reassign the atom type code using scalar commands and unless one is using
a RTF/PARAMETER file with fixed atom types this will cause uncontrolled
assignments of atom type codes, and corresponding parameters.


Top
Reading the sequence and coordinates of segments simultanously


This command provides convenient way to transform a system in PDB file
format into new CHARMM segments with given coordinates. When read in segments
from a PDB file, one can specify BUILd to generate all atom connectivities and
atom types. If there are missing atoms in the PDB file, one can specify SETUp
to generate an internal coordinate table of the segments to be used to
generate the coordiantes of those missing atoms. Each chain in the PDB file
will form a new segment named as the given SEGId followed by its segment
number. These generated segments are well quialified CHARMM segments and
can be used for atom based simulation. This is a very convenient way to
generate simulation systems from PDB files. However, It requires that all
residue and atom names in the input file are consistent with that in the
For example:
open read unit 10 card name 1b5s.pdb
read segid b5s PDB build setup unit 10
This command can be used to create a new segment from either a
PDB file (PDB), a CHARMM coordinate file (CARD), or a free format coordinate
file (FREE). If BUILd option is not specified, the generated
segment contains only atoms listed in the input PDB file but no atomic
connectivities are generated. Such a segment can be used to generate a map
object needed in the EMAP module (» emap ). With this command, a map
object can be quickly converted from a PDB structure.


Top
Specifying a sequence of residues for a segment


The specification of SEQUence causes the program to accept a
sequence of residue names to be used to generate the next segment in the
molecule. Unless the WATEr, TIPS, or ST2 option is used, the sequence is
specified as follows:

title
number of residues
repeat(residue names)

The form of the title is defined in the syntactic glossary,
» usage Syntactic Glossary. The number of
residues is specified on the line following the title in free field
format. If the number of residues you specify is less than zero,
file. If the number is greater than zero, it will also stop once it
has read at least as many residues as you've specified. If the number
you specify is zero, you will get a warning message as one common
error is to forget the number entirely. In this case, the first
residue name will be consumed as the number and converted to zero.
The residue names are specified as separate words, each no
longer than 4 characters, on as many lines as are required for all the
residues. This sequence may be placed immediately following the READ
command if the unit number is the stream or may be placed in a separate file.
When reading is complete, CHARMM will list all the residues it
has read, and tell you which residues it thinks can be titrated.

The WATEr option allows a sequence of water molecules to be
specified. This will give the old 3-center water model (not recommended).
The integer which follows the keyword gives the number of waters. The TIP3P
water model may be specified with the TIPS option. Likewise, the ST2
option allows ST2 waters to be specified. Obviously, no sequence on
separate lines need be given. The topology file must contain the residue
named (OH2,TIP3,ST2); otherwise, the GENErate command invoked subsequently
will fail.

The COOR option will read the sequence from a CHARMM format card
coordinate file. The residue numbers are ignored except that when a change
occurs, a new residue is added. If the RESId keyword is also present,
then the resid's are obtained from the resid field of the coordinate file.
The SEGI <segid> option allows the sequence to be read only for the residues
belonging to the corresponding segid in the coordinate file.
For the PDB option resids are always read from the resSeq (resid) field.
This is useful when one wants to specify residue names (rather than use
the number representation). No other information is read from the coordinate
file during this process. To read the sequence for a specific chain in a PDB
file the CHAIn <char> option can be used; <char> is the one letter PDB chain
id in position 22 of ATOM/HETATM records. If the SEGI <segid> option is used
the sequence is read for the atoms that have the corresponding segid in
columns 73-76 of the PDB file. NCHAin <int> starts reading from chain number
<int> as defined by TER separator records. Variables SQNRES and SQRESID are
set to the number of residues read and the SEGId used.
By default only ATOM records are read. The HETAtm keyword allows
HETATM records as well, and the NOATom keyword turns off reading of ATOM
records. SKIP specifies residue names that will be ignored, and the ALIAs
keyword provides a simple residue name translation facility:
Each instance of <resold> will be replaced by <resnew>.
With the keyword SEQRes the sequence will be
read from SEQRES records, which is useful when there are missing residues
in the ATOM records. Here FIRStresid (default 1) is used for the RESId
numbering. See examples in test/c39test/readpdb.inp.

Top
Reading coordinates


The reading of coordinates is done with the READ COOR command,
and there are several options (which may change over in future versions).
Coordinates may be read into the main set or the comparison coordinate set
using the COMP keyword.

There are three possible file formats that can be used to read
in coordinates. They are coordinate binary files, dynamics coordinate
trajectories, and coordinate card images. In addition, NAMD program
binary restart coordinates(and velocities) files can be read (only
into main set). Protein Data Bank (PDB) formatted files can also be
read. PDB files do however require some editing first. All the HEADER
and other junk before the actual coordinate section has to be removed
and optionally replaced by a standard CHARMM title. There should be no
line with NATOM (= number of atoms) preceding the actual coordinates.
would either have to rename some entries in the PSF or in the
coordinate file in case there are differences. The MODEL option reads
the specified MODEL number from an NMR style multiple coordinate set
PDB file.

For all formats, a subset of the atoms in the PSF may be selected
using the standard atom selection syntax. For binary files, This is a
risky maneuver, and warning messages are given when this is attempted.
Only coordinates of selected atoms may be modified. When reading binary
files, or using the IGNOre keyword, coordinate values are mapped into
the selected atoms sequentially (NO checking is done!).
Selection of atoms does not work with NAMD binary files (example:
read namd file "myfile.coor.rst"


The reading of the first two file formats is specified with the
FILE option. The program reads the file header to tell which format it
is dealing with. The coordinate binary files have a file header of
'COOR' and contain only one set of coordinates. These are created with a
WRIT COOR FILE command. The dynamics coordinate trajectories have a file
header of 'CORD' and have multiple coordinate sets. These files are
created by the dynamics function of the program. To specify which
coordinate set in the trajectory to be read, the IFILE option is
provided. One specifies the coordinates position within the file. The
default value for this option will cause the first coordinate set to be
read. If the IFILE value is negative, then the next file (other than
the first one) will be read. This will only work if a set has already been
read from the file with a positive IFILE value. The CONTinue keyword specifies
that frame counting will continue from the current position in the file, not from
the start. The sequence
READ COOR FILE IFILE 10 UNIT 51
READ COOR FILE CONT IFILE 10 UNIT 51
will first read frame 10 and then frame 20 from unit 51. In this way possibly
expensive re-reading of the file from the beginning every time is avoided.

For binary files, the APPEnd command will 'deselect' all atoms
up to the highest one with a known position. This is done in addition
to the normal atom selection. This is useful for structures with several
distinct segments where it is desireable to keep separate coordinate
modules.

The CARD file format is the standard means in CHARMM for
providing a human readable and writable coordinate file. The format is
as follows:

* Normal format for less than 100000 atoms and PSF IDs with less than
five characters
title
NATOM (I5)
ATOMNO RESNO RES TYPE X Y Z SEGID RESID Weighting
I5 I5 1X A4 1X A4 F10.5 F10.5 F10.5 1X A4 1X A4 F10.5

* Expanded format for more than 100000 atoms (upto 10**10) and with
upto 8 character PSF IDs. (versions c31a1 and later)
title
NATOM (I10)
ATOMNO RESNO RES TYPE X Y Z SEGID RESID Weighting
I10 I10 2X A8 2X A8 3F20.10 2X A8 2X A8 F20.10

The title is a title for the coordinates, syn:
» usage Syntactic Glossary, for details. Next comes the
number of coordinates. If this number is zero or too large, the entire
file will be read. Finally, there is one line for each coordinate.

ATOMNO gives the number of the atom in the file. It is ignored
on reading. RESNO gives the residue number of the atom. It must be
specified relative to the first residue in the PSF. The OFFSet option
should be specified if one wishes to read coordinates into other positions.
The APPEnd option adds an additional offset which points to the
the residue just beyond the highest one with known positions. This option
also 'deselects' all atoms below this residue (inclusive).
For example, if one is reading in coordinates for the second segment of a
two chain protein using two card files, and the APPEnd option is used,
RESNO must start at 1 in both files for the file reading to work
correctly.

It should also be remembered that for card images, residues are
identified by RESIDUE NUMBER. This number can be modified by using the
OFFSet feature, which allows coordinates to be read from a different PSF.
Both positive and negative values are allowed. The RESId option will
cause the residue number field to be ignored and map atoms from SEGID
and RESID labels instead.

RES gives the residue type of the atom. RES is checked against
the residue type in the PSF for consistency. TYPE gives the IUPAC name
of the atom. The coordinates of an atom within a residue need not be
specified in any particular order. A search is made within each residue
in the PSF for an atom whose IUPAC name is given in the coordinate file.

The RESId option overrides the residue number and fills coordinates
based on the SEGID and RESID identifiers in the coordinate file.
This is the recommended method where different PSF's are used.

The IGNORE option allows one to read in a card coordinate file
while bypassing the normal tests of the residue name, number, and atom
name. When IGNORE is specified in place of card, the identifying
information is ignored completely. Starting from the first selected
atom, the coordinates are copied sequentially from the file.

The PDB option works very much like the CARD option, but expects the
actual file format to be according to Protein Data Bank standards:

text IATOM TYPE RES IRES X Y Z W
A6 I5 2X A4 A4 I5 4X 3F8.3 6X F6.2

The OFFI option enforces the official pdb format. The segid (chain id)
has to be one character in length on read and it is truncated
to one character on write.
Normally, the coordinates are not reinitialized before new values
are read, but if this is desired, the INITialize keyword, will cause the
coordinate values for all selected atoms to be initialized. Note that only
atoms that have been selected, will be initialized (9999.0). The COOR INIT
command provides a more general way to initialize coordinates.

The READ COOR DYNR variant reads a full coordinate set from a dynamics
RESTart file. It REQUIRES a matching PSF and allows no selections or
other manipulations. A restart file (usually) contains three sets of
atom data, and you chose which one to read in with keywords:
CURR the current coordinates
DELT the displacement to be taken from the current coordinates
VEL the current velocities (in AKMA units)

NOTE: The restart file written after a crash may be sligthly different,
at present (c28a2) it contains the previous coordinates instead of velocities.

The READ COOR TARG and READ COOR TAR2 commands read in the coordinates of the
target for Targeted Molecular Dynamics (TMD; » tmd )

Top
Reading coordinates from nonstandard formats


The reading of coordinates is done with the READ COOR command,
and there are several options. One such option is the READ COOR
UNIVersal command which will read using a previously specified format.
The Universal format is specified by the READ UNIVersal command. This
reads the specification from the input stream or from a specified
file.

READ UNIVersal

The following commands clear the translation table and sets up
default specifications for the file format.

CHARMM - setup standard CHARMM format (default)
PDB - setup brookhaven format
AMBER - setup standard AMBER format
UNKNown - setup null format (everything must be specified)

The following commands specify the field locations of various items
When reading free-of-field, the starting values are sorted to determine
the ordering of parsing.

SEGID start length
RESID start length
TYPE start length
RESN start length
IRES start length
ISEQ start length
X start length
Y start length
Z start length
W start length

The following commands specify how input lines should be considered.

PICK start length string - choose only line that match one or
more of these
EXCL start length string - exclude any line that contains one
of these
TITL start length string - add any line containing one of these
to the title

The following commands specify character translation upon reading the file.

TRANslate { SEGID external-segid internal-segid }
{ RESID external-resid internal-resid match-segid }
{ RESN external-resn internal-resn match-segid }
{ TYPE external-type internal-type match-resn match-segid }

END - terminate reading universal file format

Top
The Format of Parameter Files

[SYNTAX Parameter read command format]

READ { PARAmeter { CARD [PRINt] [APPEnd] [FLEX] } }
{ { [FILE] [NBON ] [MMFF] } }

The CARD/FILE keywords specify a card (readable) or binary file format.

The PRINT and NBON options determine the extend of printing while
reading parameters. The NBON will list the NATVDW*(NATVDW+1)/2 vdw table.

The APPEnd keyword will add the new paramters to the existing parameter
set. APPEnd does not work with binary files, MMFF, CFF, SPAS. Also,
only paramters of the same type (e.g. both FLEXible) may be appended.

The MMFF keyword invokes the Merck Force Field paramter reader.
» mmff

The FLEX keyword specifies the new flexible parameter format. This is
the same as the standard CHARMM parameter format, but;
(1) allows general wildcarding for all terms
(2) allows parameter substitution for missing paramters
(3) does not require a previously read RTF (no global MASSES list required)
(4) allows the definition of paramter equivalence groups.

[SYNTAX Parameter file format]

Parameters can be read from cards or binary modules by the routine
PARRDR. After the title, card file data is divided into sections beginning
with a keyword line and followed by data lines read free field:

ATOM (Flexible paramters only)
MASS code type mass (Flexible paramters only)

EQUIvalence (Flexible paramters only)
group atom [ repeat(atom) ] (Flexible paramters only)

BOND
atom atom force_constant distance

ANGLe or THETA
atom atom atom force_constant theta_min UB_force_constant UB_rmin

DIHE or PHI
atom atom atom atom force_constant periodicity phase

IMPRoper or IMPHI
atom atom atom atom force_constant periodicity phase

CMAP
atom atom atom atom atom atom atom atom resolution
<...cmap data...>

NBONd or NONB [nonbond-defaults]
atom* polarizability e vdW_radius -
[1-4 polarizability e vdW_radius]

NBFIX
atom_i* atom_j* emin rmin [ emin14 [ rmin14 ]]


HBOND [AEXP ia] [REXP ir] [AHEX ih] [AAEX iaa] [hbond-defaults]
donor-heavy-atom* acceptor-heavy-atom* well_depth distance


( SPAS only paramter types )

FLUC
atom chi_value zeta_value prin_integer chma_value

KAPPa
atom atom atom atom atom atom force_constant

LCH2
atom atom atom atom atom force_constant

14TG
atom atom atom atom trans_const gauche_const


PRINt [ON ]
[OFF]


where '*' allows wildcard specifications:
* matches any string of characters (including none),
% matches any single character,
# matches any string of digits (including none),
+ matches any single digit.

---------------------------------------------------------------------------
nonbond-defaults::= [NBXMod int] [CUTNB real] [CTOFNB real] [CTONNB real]
[WMIN real] [E14Fac real] [EPS real]

[ATOM ] [CDIElectric] [SHIFt ] [VATOm ] [VSWItch ] [BYGRoup] [GEOMetric ]
[GROUp] [RDIElectric] [SWITch ] [VGROup] [VSHIft ] [BYCUbe ] [ARIThmetic]
[FSWITch] [VFSWitch]
[FSHIft ]

hbond-defaults::= [ ACCEptor ] [ HBEXclude ] [ BEST ]
[ NOACceptor ] [ HBNOexclude ] [ ALL ]

[CUTHB real] [CTOFHB real] [CTONHB real]
[CUTHA real] [CTOFHA real] [CTONHA real]
[REXP int(def12)] [AEXP int(def10)]
[HAEX int(def4)] [AAEX int(def2)]
---------------------------------------------------------------------------

Sections end with the occurence of the next keyword line, or a line with
the word END, the latter terminating parameter reading.

Errors in the input file will result in warning messages but not
termination of the run.

For angles, if theta_min is given as a negative number, then a
cosine style angle potential function being used for those angles, rather
than CHARMM's usual angle potential energy function.

No wildcard usage is allowed for bonds and angles. For dihedrals,
two types are allowed; A - B - C - D (all four atoms specified) and
X - A - B - X (only middle two atoms specified). Double dihedral
specifications may be specified for the four atom type by listing a
given set twice. When specifying this type in the topology file, specify
a dihedral twice (with nothing intervening) and both forms will be used.

There are five choices for wildcard usage for improper dihedrals;
1) A - B - C - D (all four atoms, double specification allowed)
2) A - X - X - B
3) X - A - B - C
4) X - A - B - X
5) X - X - A - B
When classifying an improper dihedral, the first acceptable match (from
the above order) is chosen. The match may be made in either direction
( A - B - C - D = D - C - B - A).

The periodicity value for dihedrals and improper dihedral terms
must be an integer. If it is positive, then a cosine functional form is used.
Only positive values of 1,2,3,4,5 and 6 are allowed for the vector, parallel
vector and cray routines. Slow and scalar routines can use any positive
integer and thus dihedral constrains can be of any periodicity.
Reference angle 0.0 and 180.0 degree correspond to minimum in staggered
and eclipsed respectively. Any reference angle is allowed. The value
180 should be prefered over -180 since it is parsed faster and more
accuratly. When the periodicity is given as zero, for OTHER THAN THE
FIRST dihdral in a multiple dihedral set, then a the amplitude is a
constant added to the energy. This is needed to effect the
Ryckaert-Bellemans potential for hydrocarbons (see below).

The normal dihedral energy equation is:
E = K * ( 1.0 + cos( periodicity * phi - phase ) )

When the periodicity is given as zero, then a harmonic restoring potential
in (phi - phi_min) is used. The phase value gives phi_min for this option.
This functional form is identical to that reported in the CHARMM paper,
except that either functional form (referred to as proper and improper)
may be used for dihedrals and improper dihedrals. The distinction between
these terms is that seperate lookup tables are kept and the default atom
choices are still different. For dihedrals, the selection is usually based
on the middle two atoms, and for improper dihedrals, the selection is based on
the outer two atoms. For either terms, all 4 atoms may be required.

The HBOND line can be used to specify exponents for the hbond function,
with ia and ir being the attractive and repulsive radial terms and
ih and iaa the cosine exponents on the angular terms at the h and a
respectively. Defaults 4, 6, 4, and 2 respectively.

For atom types with no NBOND parameters given, no van der Waals
interactions will be calculated. You will be warned, but be careful.

The nbond parameters for 1-4 interactions can be specified by placing the
extra set of parameters after the first. By default the same parameters
will be used for 1-4 and all other interactions.

NON-BOND parameter combination rules depend on how the parameters are listed.
If the second number is negative, it is used as Emin, and
Emin(ij)=-sqrt(Emin(i)*Emin(j)).
If the second number is positive, it is used as Neff, and the Slater Kirkwood
formula is used to compute Emin(ij).

The PARRDR card field ,NBFIX, allows individual atom type
van der Waal pair interactions to be specified. Subsequent lines must have;

atom_i atom_j emin rmin [ emin14 [ rmin14 ]]

If emin is positive, a severe warning is issued. The wildcard "X" may
be given. In the case where both atoms are wildcards, the entire
nbond parameter set will be modified.
If emin14 and rmin14 are not specified, then the value of emin
and rmin will be used. NOTE: The previous value will not be used.
NBFIXes are processed in order. For that reason, wildcard
usage should come first. In the case of duplicate specifications,
there is no check, and the last specification will be used.
The maximun number of NBFIX entries is currently set at 150.
The space for this is allocated in PARMIO.

PARAMETER I/O ADDENDUM:
In order to calculate the Ryckaert-Bellemans torsional potential for butane
and other extended atom hydrocarbons, the following terms should be included
in the parameter file:

V = gamma[1.116 - 1.462cos(phi) - 1.578 cos**2(phi) + 0.368 cos**3(phi)
+ 3.156 cos**4(phi) - 3.788 cos**5(phi)]
and gamma = 1.987 kcal/mol

J. P. Ryckaert and A. Bellemans, Chem. Phys. Lett. 30, 123 (1975).
J. P. Ryckaert and A. Bellemans, Disc. Farad. Soc. 66, 95 (1978).

PHI
! Ryckaert Bellemans has trans = 0.0
! since cos is an even function cos(-phi)=cos(phi), invert the
! sign of the coefficients with odd power of cos(phi)
CH3E CH2E CH2E CH3E 0.470467 5 0.0
CH3E CH2E CH2E CH3E 0.783947 4 0.0
CH3E CH2E CH2E CH3E 2.53516 3 0.0
CH3E CH2E CH2E CH3E 1.56789 2 0.0
CH3E CH2E CH2E CH3E 2.34787 1 0.0
CH3E CH2E CH2E CH3E -4.70368 0 0.0

The potential should be used with SHAKE bonds and angles or bonds only
as required. The zero periodicity (constant) term should NOT be the
first in the set, otherwise it will be treated as an improper torsion.


Top
[SYNTAX RTF file format]

The Format of a Residue Topology File

Here is a description of what is currently (24-May-1982) in
residue topology files (as they are stored in ascii files). You may use
this format if you specify the CARD option in the READ command. The
format of binary files depends on the current implementation of the RTF
data structure (see RTF.FCM).

The purpose of residue topology files is to store the
information for generating a representation of macromolecule from its
sequence. These files are read by RTFRDR a subroutine in RTFIO which
should be be consulted for formats and the final word on what is
actually done with these files.

The residue topology files are named TOP... . There are two
forms, binary module (.MOD) and card format (usually .INP or .RTF)
although the binary is typically no longer used. The card format files
are structured as input files for CHARMM, beginning with a run title
and the command READ RTF CARD, followed by the actual topology file.

The first section of the topology files is a title section in
the usual format of up to ten lines delimited by a line containing only
a * in column 1.

The remaining information is read in free field format as
commands to define the RTF. The ordering of the commands is important
in that some information is needed to define others (i.e. the atoms
of a residue must be defined before the bonds between them).
The recommended structure of this file is:

Initial setup:
MASS specification for each atom type
DECLarations of out of segment definitions
DEFAults for patching on the fist and last residues
AUTOgenerate angles dihedrals patch
For each residue:
RESIdue name and total charge specification
(or PRESidue if this is a patch)
ATOM definitions within this residue
GROUping dividers between atom definitions
BOND specification
ANGLe specifications
DIHEdral angle specifications
IMPRoper dihedral angle specifications
CMAP dihedral angle specifications, resolution
DONOr specifications
ACCEptor specifications
IC information
PATChing residues to use if defaults are not desired
Closing:
END statement
Display control:
PRINT option

The format above is not rigid. In particular, The 'out of
residue declarations' may be augmented and redefined at any point.
These declarations are checked against all 'out of segment' atom
references. This is done to avoid potential problems where atom names
are misspelled. The number following the declaration is ignored, and
is for the users own reference (or debugging).

The syntax of all subcommands are as follows:


MASS atom-type-code atom-type-name mass
As of c40a1 the atom-type-code can be input as a -1 and this
generates atom-type-codes internally in a sequential manner
defining the atom-type-code for the current mass declaration
as that which increments NATC by one. One can mix both explicitly
specified atom-type-codes and -1 values, the atom-type-codes are
generated internally based on the atom-type-name.

DECLare out-of-residue-name
This adds names to be considered for possible connections
to the previous or next residues. This is done as a spelling
check. Any atoms names not contained with in the residue nor
on this list of declarations will be flagged as an error.
Use the symbol "-" as an atom name prefix to denote the previous
residue, use "+" for the subsequent residue. Use "#" as a prefix
for the (n+2) residue.

DEFAults [ FIRSt { name } ] [ LAST { name } ]
{ NONE } { NONE }

AUTOgenerate [ ANGLes ] [ DIHEdrals ] [PATCh] [DRUDe]
[NOANgles] [NODIhedrals]
[OFF]

{ RESIdue } name [total-charge]
{ PRESidue }
Residues labled PRES may only be used for patching. Residues
defined with RESI may not be used as a patch.

ATOM iupac atom-type-name charge repeat(exclusion-names)

GROUp
This keyword divides the structure into specific electrostatic
groups. These are used with explicit group-group electrostatic
options and are used to make the atom-atom list generation
more efficient. If a RESIdue does not start with a GROUp command,
then any ATOMs defined will belong to the last group of the
previous residue. Also, the maximum number of atoms allowed in
any group is currently set at 1000 (MAXING in dimens.fcm).
As a general guide, and electrostatic group should be roughly neutral
or have unit charge. A group should generally be a rigid group of
atoms, and should not have heavy (non-hydrogen) atoms in a 1-5
arrangement. Hydrogens should always be in the same group as its
bonded partner. A group should NEVER include two or more groups
of atoms that are not covalently linked.

BOND repeat(iupac iupac)

{ ANGLe } repeat(iupac iupac iupac)
{ THETa }

{ DIHEdral } repeat(iupac iupac iupac iupac)
{ PHI }

{ IMPRoper } repeat(iupac iupac iupac iupac)
{ IMPHi }

{ CMAP } repeat(iupac iupac iupac iupac iupac iupac iupac iupac)

DONOr [ hydrogen ] [ heavy-atom ] [ antecedent-1 antecedent-2 ]
[ BLNK ] [ hydrogen ]

The antecedents are not required unless hydrogen position
generation is desired.

ACCEptor iupac [iupac [iupac] ]
The first antecedents is required if and angle dependence about
the acceptor atom is desired. The second antecedent is unused.

{ IC }
{ BILD } name name name name bond angle phi angle bond
{ BUILd }
BLNK may be used to indicate a missing atom name.

DELEte { ATOM } iupac [COMBine iupac]
{ BOND } (iupac iupac)
{ THETa | ANGLe } (iupac iupac iupac)
{ DIHEdral | PHI } (iupac iupac iupac iupac)
{ IMPHi | IMPRoper } (iupac iupac iupac iupac)
Deletions are allowed only in patch residues (PRES); the optional
COMBine keyword for ATOM deletions allows passing part of the IC
data for the deleted atom to the "combine" atom, i.e. stereochemistry
of atoms bonded to the deleted atom. In order to use the COMBine
option, both atoms must be present in the PSF and it must be invoked
from the PATCh command (not the GENErate command).

PATChing [ FIRSt { name } ] [ LAST { name } ]
{ NONE } { NONE }

PRINt { ON }
{ OFF }

The PRINt command may be used to control the display of lines as
they are read by the RTF reader. The initial setting for printing is
controlled by the READ command itself. If PRINT is specified, then
printing will initially be enabled; otherwise, the commands will not
be echoed. PRINT ON turns on echoing of RTF specifications; PRINT OFF
turns them off. This command is useful for debugging an addition to a
previously tested topology file.

A small sample RTF card file follows:

* title for documentation example

18 1
MASS 1 H 1.00800
MASS 11 C 12.01100
MASS 12 CH1E 13.01900
MASS 13 CH2E 14.02700
MASS 14 CH3E 15.03500
MASS 31 N 14.00670
MASS 38 NH1 14.00670
MASS 51 O 15.99940
MASS 56 OH2 15.99940

DECL -C
DECL -O
DECL +N
DECL +H
DECL +CA

DEFA FIRS NTER LAST CTER

RESI ALA 0.00000
GROU
ATOM N NH1 -0.35
ATOM H H 0.25
ATOM CA CH1E 0.10
GROU
ATOM CB CH3E 0.00
GROU
ATOM C C 0.45
ATOM O O -0.45
BOND N CA CA C C +N C O N H
BOND CA CB
THET -C N CA N CA C CA C +N
THET CA C O O C +N -C N H
THET H N CA N CA CB C CA CB
DIHE -C N CA C N CA C +N CA C +N +CA
IMPH N -C CA H C CA +N O CA N C CB
CMAP -C N CA C N CA C +N
DONO H N -C CA
ACCE O C
BILD -C CA *N H 0.0000 0.00 180.00 0.00 0.0000
BILD -C N CA C 0.0000 0.00 180.00 0.00 0.0000
BILD N CA C +N 0.0000 0.00 180.00 0.00 0.0000
BILD +N CA *C O 0.0000 0.00 180.00 0.00 0.0000
BILD CA C +N +CA 0.0000 0.00 180.00 0.00 0.0000
BILD N C *CA CB 0.0000 0.00 120.00 0.00 0.0000


RESI OH2 0.00000
GROUP
ATOM OH2 OH2 -0.40000 H1 H2
ATOM H1 H 0.20000 H2
ATOM H2 H 0.20000
BOND OH2 H1 OH2 H2
THET H1 OH2 H2
DONO H1 OH2 -O -O
DONO H2 OH2 -O -O
ACCE OH2
PATC FIRS NONE LAST NONE

END

NOTES::

The use of improper dihedrals for the PSF is unrelated
to the use of improper dihedrals for the internal coordinate tables.


L
PSF usage: |
|
I
/ \
/ \
-----J---- K------


IC table usage:

I L
\ /
\ /
*K
|
|
J


Note that for PSF usage the first atom is the central atom,
and the last atom is the atom to be restained relative to
the axis defined by the middle pair of atoms. For the IC table
usage, the central atom is in the third position, but the
axis is again defined by the middle pair of atoms.

Also note that as of c40a the atom-type-code that follows the MASS
statement in the RTF (as described above for the PARAMETER file)
can be given as -1, which will cause a sequential atom-type-code,
array ATCT in the RTF, to be placed at NTCT+1, where NTCT is the
value of the largest atom-type-code specified to date.


Top
Reading data other than the sequence or coordinates


The parameter files (PARA) and internal coordinate files (IC)
and hydrogen bond (HBONd) data files can be read as card images or binary
files. Specifying CARD signifies card image input; specifying FILE
signfies binary file input. Please note that topology file must be read
in before the parameters can be read.

Protein structure files (PSF) files can be read and written in
card/ascii format either with the "old" CHARMM format of using the "XPLOR"
format, which replaces the atom-type code with the atom type name. This
has been implemented both for ascii (card) and binary (file) type reads/writes,
but reading a binary (file) formated psf file is no longer supported in

The non bonded list (NBONd) can only be read as a binary file.
The constraints (CONStraint) which includes dihedral restraints may only
be read as formatted file (card).

There are two types of IC card files (residue number vs. resid's).
The residue number option is the default, and atom assignments are based
on residue number. This is the low precision form. The resid option
is the high precision form and atom assignments are based on SIGID's and
RESID's. This is also useful where different homologies are used.

The Image file (IMAGes) containing transformation information can
only be read in card image format (» images ).
The INIT keyword will remove all existing image data. Without the
INIT keyword, any existing image items (such as bonds) would be kept.
This allows one to modify the crystal geometry without the necessity
of regenerating all image items.

The TABLe file contains the nonbond energy lookup information.
Once read in, The effects cannot be reversed. The nonbond energy
evaluation is now under control of the table routines.

The BTABle file contains the bonded energy lookup information.
Once read in, the effects cannot be reversed. The bonded energy
evaluation is now under control of the table routines. The table must
be read after the structure (psf) of the system is set up. The format
of the table is the following:

BOND
A B
104
1.86999999999998e+00 -3.74058562118872e+02 1.90794423875772e+02
1.88999999999998e+00 -3.66707803850426e+02 1.83386760216079e+02
...

Here 'BOND' is the type of the bonded interaction. The other choices
are 'ANGLE' and 'DIHE', for angular and dihedral interactions.
Second line 'A B' defines the two atom types between which the bonded
interaction is set. For 'ANGLE' three atom types are required, and for
'DIHE' four atom types are required. Third line defines the number of
entries in the lookup table. The lookup table itself starts on the
fourth line with three real numbers per line. The first column is the
distance between the atoms for 'BOND', or the angle in radians for
'ANGLE' and 'DIHE'. The second column is the gradient of the
potential energy (-1 x force) and the third column is the potential
energy.

The NM file contains normal modes from previous *note
DIMS:» dims runs. With the help of this file those modes can be
avoided on subsequent runs, which increases the diversity of
trajectories. This is part of the 'self avoidance' algorithm in the
DIMS module and used with the DIMS keywords MTRA, NWIND, and NMUN.


Top
WRITe - Writes Data Structures to External Files

[SYNTAX WRITe]

Syntax

WRITe { { PSF } [FILE] } UNIT unit-number |
NAME filename
{ [CARD] [XPLOr] }
{ { RTF } }
{ { PARAmeter } }
{ { NBONd }* }
{ { TABLe } }
{ }
{ { COORdinate coor-spec } [CARD] }
{ [PDB [MODEL int [FIRSt|LAST]] [OFFI]}
{ [DUMB] }
{ [XYZ] }
{ { IC [RESId] [SAVEd] } [FILE] [RTF]}
{ { HBONd [ANAL] } }
{ }
{ { IMAGes imag-spec} [CARD] }
{ { ENERgy } }
{ { CONStraint [PSF 0] } }
{ { TITLe } }

title

{ NAMD FILE "filename" ** }
** no other options are available

coor-spec:== [COMP] [OFFS int] [IMAGes] atom-selection

imag-spec::= [ TRANsformations ] [ FORCes ] [ PSF ]

*: The NBOND list can only be WRITten in binary (FILE) form. Use PRINt to get
formatted output.

Function

The primary purpose of this command to save some of CHARMM's
data structures. The coordinate and internal coordinate data structures
can be written in formatted form so that they be edited independent
of CHARMM using a text editor. The option, FILE, specifies that a file
is to be written in unformatted form (binary). The option, CARD,
specifies that a file is to written in formatted form. For the
coordinate and internal coordinate file, CARD is the default. The
coordinate option PDB gives a file in Protein Data Bank format, with
just the ATOM records; the MODEL N option writes a PDB file in the NMR-style
multiple coordinate set format (note that for this to work the file has to
be specified as UNIT <int>, not as NAME <string>):

MODEL 0 (or no MODEL keyword) just write standard PDB file
MODEL 1 writes beginning of multicoordinate file (title, MODEL 1,
coor, TER, ENDMDL)
MODEL N (N>1) appends just coordinates for MODEL N (MODEL N, coor,
TER, ENDMDL)
MODEL N (N<0) appends last coordinate set, and END (MODEL |N|, coor,
TER, ENDMDL, END)

Keyword FIRSt forces writing of title even if N.NE.1, LAST forces
writing of END line.

The XPLOr option of WRITe PSF produces an XPLOR style PSF file (atom
names are used instead of atom numbers)

The selection of "PSF 0" in the WRITe CONS only works with PERT and
writes data for the lambda=0 PSF.

A set of title lines must follow the WRIT command. This title
will be written at the start of the file and serves to document the
file. For your protection, one should always make good use of this
title, as it may be the only documentation for the file.

The UNIT keyword specifes what Fortran unit the output should be
written to. It cannot be omitted unless the filename is provided with the NAME
keyword.

The XYZ keyword writes a simple .xyz format file with (A8,3F11.5),
as an export format for other programs; the first title line is used for
the comment record.


Top
PRINt - writes information to output file (unit 6)

[SYNTAX PRINt]

Syntax

PRINt { PSF [XPLOr] }
{ RTF }
{ CONStraint [PSF 0] }
{ PARAmeter [USED] }
{ RESIdue }
{ COORdinate coor-spec }
{ IC [ SAVEd ] [RTF] }
{ HBONd [ ANAL ] }
{ NBOND }
{ IMAGes imag-spec }
{ TITLe }
{ ENERgy }

coor-spec::= [COMP] [OFFS int] [IMAGes] atom-selection

imag-spec::= [ TRANsformations ] [ FORCes ] [ PSF ]


Syntactic ordering: All commands must be typed in the order shown.


Function

This command is used to list information contained in data
structures used by the program. The information must already have been
created through use of a READ, GENE, HBON, etc., command. The printable
output is sent to unit 6.

The XPLOr option of PRINt PSF produces an XPLOR type PSF
listing. Atom names are printed instead of atom numbers.

The selection of "PSF 0" in the PRINt CONS only works with PERT
and prints data for the lambda=0 PSF.

For printing paramters, the USED option causes the print of only
the paramters that were used in the most recent energy evaluation. This
option is PSF dependent.

For hydrogen bonds, ANAL gives a geometrical and energy analysis
of the hydrogen bonds. Representing the hydrogen bond as
A2-A1-X-H....Y-, the distances X-Y, H-Y, the angle (180 - <X-H-Y ), the
dihedral angle A2-A1-X-H and the hydrogen bond energy contribution are
listed. A more versatile hbond analysis facility is provided by
COOR HBOND (» corman ).


Top
[SYNTAX TITLe]

Titles - Specifying and manipulating

Titles are optional. All title lines MUST begin with a "*".
If no title is specified, the title will be untouched. This is useful when
a series of titles are needed. Titles are terminated with a line containing
only a "*" in the first colunm. There may be up to 32 lines contained
in any title.
The titles are read using RDCMND, thus parameter substitutions are
allowed.
A command TITLe has been added to CHARMM which can be used to specify
a title to be used by subsequent write commands.
For interactive use, A title is always required (no backspace can
be done) when RDTITL is called.

The date,time, and user is added at the end of the title when
a title is written to a file. If a date and time is already present,
it will be superceeded. For the print option, the date and time
information is left as it was.
A second title array TITLEB has been added to CTITLA.FCM
TITLEA is to be used for writing, and TITLEB must be used for reading
from data files. In this way, the main title is never destroyed by reading
a data file. For any write command, TITLEA can be modified by specifying
a title. Any further writes will use that title, unless a new title is
specified.
As it is now, title lines should not end in "-" and any characters
beyond a "!" will not be included in the title.
Titles may begin with a "#" as well as "*". The pound sign
is converted to a "*" upon reading. When the first title line begins
with "#", the old title is not destroyed. All entered title lines
superceed any previous title lines. Obviously, if more title lines are entered
than were previously present, then there will be no difference in the two
methods. This option was added for cases where a series of identical
titles, except for a different first line, was needed.
The COPY keyword of the TITLe command will copy the current TITLB
(the reading title) to TITLA (the writing title) before reading the
subsequent title. If there is no subsequent title, then just a copy is done.

Normally, when titles are written to card files, the first column
"*"s are retained. With the WRITe TITLe command, several changes are made.
First, the first colunm of "*"s is suppressed. Second, no date and time
and username is added. Third, the file is not closed. This command is
primarily used for creating files for plotting. It is often used in
conjunction with looping and energy terms. Here is an example of possible
applications;

OPEN WRITE CARD UNIT 23 NAME ENERGY.DAT ! Open the file for plot data
WRITE TITLE UNIT 23
* this file contains .....
* more message data .....

SET 1 -180.0 ! Set the initial dihedral angle value
LABEL LOOP ! Here is the loop return point
CONS DIHE ....... MIN @1 ! Introduce the desired dihedral constraint
MINIMIZE ..... ! Minimize
CONS CLDH ! Remove the dihedral constraint
SET 2 @1 ! copy parameter one to parameter two
TRIM 2 FROM 1 TO 10 ! Pad parameter two with blanks for formatting
! It will now be 10 characters long
WRITE TITLE UNIT 23
* DIHEDRAL = @2 ENERGY = ?ENER ! write this only this line to unit 23

INCREMENT 1 BY 15.0 ! Add 15 to parameter one
IF 1 LT 180.1 GOTO LOOP
CLOSE UNIT 23
STOP


Top
I/O Format Control

IOFOrmat [ EXTEnded | NOEXtended ]

In c30a2, the PSF entries are extended for I10 atom numbers and
character*8 PSF IDs (SEGID, RESID,RES and TYPE). Atom numbers take
I5 in coordinate files and I8 in psf files and CHARACTER*4 PSF IDs are
used for Normal (noextended) I/O operation. These are expanded to I10 and A8
respectively. Noextended format is the default and the extended
format is used only when the number of atoms is greater than 100000 or
any PSF ID is longer than 4 characters. The IOFOrmat command overrides the
default: IOFOrmat EXTEnded enforces the extended format and
IOFOrmat NOEXtended does the normal (old) format.