- Home
- CHARMM Documentation
- Version c41b2
- fitparam

# fitparam (c41b2)

Parameter Fitting Procedure

By Victor Anisimov (victor@outerbanks.umaryland.edu)

and Alex MacKerell Jr. (alex@outerbanks.umaryland.edu); December 2007

FITPARAM is a parameter fitting procedure that is primarily designed to fit

partial atomic charges and atomic polarziabilities based on the Drude

oscillator model to interaction energy data and dipole moments, though it may

be applied to fitting of other parameters (see below). It supports optimization

of multiple parameters for a series of model compounds sharing common parameter

sets. Different weights can be assigned to different target data. Optimized

parameters can be restrained to their corresponding initial values by using

a parabolic penalty function. FITPARAM performs non-linear least square fitting

using the Levenberg-Marquardt algorithm.

* Introduction | Overview of functionality

* Syntax | Syntax of commands

* Keywords | Description of keywords

* Format | File format

* Example | Input examples

* Limitations | Known limitations

By Victor Anisimov (victor@outerbanks.umaryland.edu)

and Alex MacKerell Jr. (alex@outerbanks.umaryland.edu); December 2007

FITPARAM is a parameter fitting procedure that is primarily designed to fit

partial atomic charges and atomic polarziabilities based on the Drude

oscillator model to interaction energy data and dipole moments, though it may

be applied to fitting of other parameters (see below). It supports optimization

of multiple parameters for a series of model compounds sharing common parameter

sets. Different weights can be assigned to different target data. Optimized

parameters can be restrained to their corresponding initial values by using

a parabolic penalty function. FITPARAM performs non-linear least square fitting

using the Levenberg-Marquardt algorithm.

* Introduction | Overview of functionality

* Syntax | Syntax of commands

* Keywords | Description of keywords

* Format | File format

* Example | Input examples

* Limitations | Known limitations

Top

Overview of functionality

The primary purpose of FITPARAM is charge derivation using various types of

interaction energy as a target data. In this goal it complements the

functionality of the FITCHARGE module which derives charges solely from

fitting to electrostatic potentials (» fitcharge ). However, FITPARAM

is not limited to charge derivation and it can be used to optimize any sort of

parameters using any sort of relevant target data.

Although implemented in CHARMM the FITPARAM module is in fact a stand-alone

procedure. The user is not required to load topology and generate a molecule

and if loaded this information will be ignored. The command FITPARAM operates

with generic parameters without the awareness whether these are point charges,

atomic polarizabilities or dihedral force constants. Thus, any parameter may

be optimized using FITPARAM based on basic least square fitting.

Correspondingly, calculation of the target property and its gradient has to be

performed separately by external routines. Preparing the necessary scripts

to accomplish this task becomes the user assignment. Despite this obvious

inconvenience, such architectural design is necessary to allow for flexibility

in the target data used, which otherwise would not be accessible for the

parameter optimization. It also makes possible parameter fitting for a variety

of molecules to be performed simultaneously.

In order to start parameter optimization FITPARAM needs an initial guess for

parameters. It reads the parameter values from the input file, performs a

cycle of least square optimization and saves the updated parameters in the

output file. Then FITPARAM executes the external job, which performs target

property and gradient calculations using the current parameter values and

returns the control back to FITPARAM. These iterations are continued until

convergence criterion is satisfied.

Using FITPARAM to optimize charges requires the user to preserve the total

charge of the optimized molecule. This can be accomplished by excluding one

charge from the fitting and computing the value of the excluded charge in the

external procedure as the difference in total charge of the molecule and the

sum of optimized charges. In general it is not important which atomic charge

to exclude from the fitting. The general suggestion is to exclude that charge

which appears least important in reproducing the value of the target data.

Dipole moment information can be provided to FITPARAM in two different ways. A

dipole may be presented in scalar form as the total dipole moment or in vector

form representing three dipole moment components. These two ways of dipole

moment specification are handled independently by FITPARAM. When dealing with

multiple molecules each molecule may be represented by its own total dipole

moment and dipole components. The total dipole moment value may be defined as

the gas-phase experimental value, or QM computed value, or manually defined

value the user is targeting as the result of the fitting. Providing manually

increased (or HF/6-31G* computed) dipole moments is a standard practice when

deriving charges for a non-polarizable model. Providing the dipole moment in

vector form has the purpose of minimizing the angle between the model and

target dipole moment vectors.

Overview of functionality

The primary purpose of FITPARAM is charge derivation using various types of

interaction energy as a target data. In this goal it complements the

functionality of the FITCHARGE module which derives charges solely from

fitting to electrostatic potentials (» fitcharge ). However, FITPARAM

is not limited to charge derivation and it can be used to optimize any sort of

parameters using any sort of relevant target data.

Although implemented in CHARMM the FITPARAM module is in fact a stand-alone

procedure. The user is not required to load topology and generate a molecule

and if loaded this information will be ignored. The command FITPARAM operates

with generic parameters without the awareness whether these are point charges,

atomic polarizabilities or dihedral force constants. Thus, any parameter may

be optimized using FITPARAM based on basic least square fitting.

Correspondingly, calculation of the target property and its gradient has to be

performed separately by external routines. Preparing the necessary scripts

to accomplish this task becomes the user assignment. Despite this obvious

inconvenience, such architectural design is necessary to allow for flexibility

in the target data used, which otherwise would not be accessible for the

parameter optimization. It also makes possible parameter fitting for a variety

of molecules to be performed simultaneously.

In order to start parameter optimization FITPARAM needs an initial guess for

parameters. It reads the parameter values from the input file, performs a

cycle of least square optimization and saves the updated parameters in the

output file. Then FITPARAM executes the external job, which performs target

property and gradient calculations using the current parameter values and

returns the control back to FITPARAM. These iterations are continued until

convergence criterion is satisfied.

Using FITPARAM to optimize charges requires the user to preserve the total

charge of the optimized molecule. This can be accomplished by excluding one

charge from the fitting and computing the value of the excluded charge in the

external procedure as the difference in total charge of the molecule and the

sum of optimized charges. In general it is not important which atomic charge

to exclude from the fitting. The general suggestion is to exclude that charge

which appears least important in reproducing the value of the target data.

Dipole moment information can be provided to FITPARAM in two different ways. A

dipole may be presented in scalar form as the total dipole moment or in vector

form representing three dipole moment components. These two ways of dipole

moment specification are handled independently by FITPARAM. When dealing with

multiple molecules each molecule may be represented by its own total dipole

moment and dipole components. The total dipole moment value may be defined as

the gas-phase experimental value, or QM computed value, or manually defined

value the user is targeting as the result of the fitting. Providing manually

increased (or HF/6-31G* computed) dipole moments is a standard practice when

deriving charges for a non-polarizable model. Providing the dipole moment in

vector form has the purpose of minimizing the angle between the model and

target dipole moment vectors.

Top

Syntax of commands

[SYNTAX FITPARAM - parameter optimization]

FITPARAM {

[NITEr int] [TOLErance real] [COUNter int]

NEXPeriments int NPARameters int

GUESs int EXPData int PARM int

[EXPWeight int] [RESTraints int]

{

[ANTOINE]

[ [NDIPoles int] [FVALues int REXternal str [MULTiplicity int] ]

}

}

Syntax of commands

[SYNTAX FITPARAM - parameter optimization]

FITPARAM {

[NITEr int] [TOLErance real] [COUNter int]

NEXPeriments int NPARameters int

GUESs int EXPData int PARM int

[EXPWeight int] [RESTraints int]

{

[ANTOINE]

[ [NDIPoles int] [FVALues int REXternal str [MULTiplicity int] ]

}

}

Top

Description of keywords

NITEr - (optional) maximum number of Levenberg-Marquardt (least square)

iterations. Default value NITE=100.

TOLErance - (optional) relative tolerance on the convergence of the

minimized function. Default value TOLE=1.0E-4.

COUNter - (optional) is number of iterations under the tolerance it needs to

converge. Default value COUN=2. Greater values can be used to assure the

convergence.

NEXPeriments - (mandatory) number of target data. Care must be taken when

handling dipole moments which can be provided in scalar (single value for

total dipole) or vector form (three dipole component values in a row). Three

components of a vector are considered as one target value for NEXP keyword.

Total dipole moment value counts independently as one experimental value. This

is also discussed in the examples section.

NPARameters - (mandatory) number of parameters to optimize.

GUESs - (mandatory) file unit number for initial parameter guess.

EXPData - (mandatory) file unit number for target ("experimental") data.

PARM - (mandatory) file unit number to save optimized parameter values.

EXPWeight - (optional) file unit number for individual weights assigned to

each target value. Default value is 1.0 for each target value.

RESTraints int - (optional) file unit number for individual restraint

factors assigned to each optimized parameter. Default value is 0.0, which

means free fitting without restraints.

Following keywords are exclusive.

ANTOINE - (mandatory if Antoine Constant calculation is expected) switches

on Antoine parameter fitting to find analytical dependence in experimental

vapor pressure - temperature dependence. This information is useful for

calculation of experimental heat of vaporization.

or

NDIPoles - (mandatory, if dipoles are provided in vector form) number of

dipole moments specified in vector form (X Y Z form). Default value is 0. Note,

NDIP keyword counts only the number of dipoles provided in vector form (X Y Z

components). Correspondingly, NDIP does not count the number of dipoles

provided in scalar form (total dipole moment).

FVALues - file unit number. This file provides function values and gradients

to FITPARAM. The data in this file have to be computed externally using updated

parameter values (which are saved in PARM unit by FITPARAM). FITPARAM needs

current target function values and gradients to perform next parameter

optimization iteration.

REXternal - external procedure provided in the form of a string enclosed in

double quotes. This string will be executed by CHARMM. This external procedure

is in charge of calculating current function values and gradients.

MULTiplicity - (optional) file unit number. The data in this file specify th

multiplicity of the optimized parameters. Default value of multiplicity for

each parameter value is 1. FITPARAM computes the total charge using the

information about charge multiplicity. This data makes no influence on the

progress of parameter optimization and is implemented for debugging purpose

only. For example, three hydrogen atoms in methyl group carrying the same

charge value imply a multiplicity of 3. Correct accounting for charge

multiplicity will help FITPARAM to provide meaningful information about total

charge of the optimized molecule. Non-charge parameters should be assigned

multiplicity 0. This is particularly helpful to when FITPARAM optimizes charges

simultaneously with non-charge parameters.

Description of keywords

NITEr - (optional) maximum number of Levenberg-Marquardt (least square)

iterations. Default value NITE=100.

TOLErance - (optional) relative tolerance on the convergence of the

minimized function. Default value TOLE=1.0E-4.

COUNter - (optional) is number of iterations under the tolerance it needs to

converge. Default value COUN=2. Greater values can be used to assure the

convergence.

NEXPeriments - (mandatory) number of target data. Care must be taken when

handling dipole moments which can be provided in scalar (single value for

total dipole) or vector form (three dipole component values in a row). Three

components of a vector are considered as one target value for NEXP keyword.

Total dipole moment value counts independently as one experimental value. This

is also discussed in the examples section.

NPARameters - (mandatory) number of parameters to optimize.

GUESs - (mandatory) file unit number for initial parameter guess.

EXPData - (mandatory) file unit number for target ("experimental") data.

PARM - (mandatory) file unit number to save optimized parameter values.

EXPWeight - (optional) file unit number for individual weights assigned to

each target value. Default value is 1.0 for each target value.

RESTraints int - (optional) file unit number for individual restraint

factors assigned to each optimized parameter. Default value is 0.0, which

means free fitting without restraints.

Following keywords are exclusive.

ANTOINE - (mandatory if Antoine Constant calculation is expected) switches

on Antoine parameter fitting to find analytical dependence in experimental

vapor pressure - temperature dependence. This information is useful for

calculation of experimental heat of vaporization.

or

NDIPoles - (mandatory, if dipoles are provided in vector form) number of

dipole moments specified in vector form (X Y Z form). Default value is 0. Note,

NDIP keyword counts only the number of dipoles provided in vector form (X Y Z

components). Correspondingly, NDIP does not count the number of dipoles

provided in scalar form (total dipole moment).

FVALues - file unit number. This file provides function values and gradients

to FITPARAM. The data in this file have to be computed externally using updated

parameter values (which are saved in PARM unit by FITPARAM). FITPARAM needs

current target function values and gradients to perform next parameter

optimization iteration.

REXternal - external procedure provided in the form of a string enclosed in

double quotes. This string will be executed by CHARMM. This external procedure

is in charge of calculating current function values and gradients.

MULTiplicity - (optional) file unit number. The data in this file specify th

multiplicity of the optimized parameters. Default value of multiplicity for

each parameter value is 1. FITPARAM computes the total charge using the

information about charge multiplicity. This data makes no influence on the

progress of parameter optimization and is implemented for debugging purpose

only. For example, three hydrogen atoms in methyl group carrying the same

charge value imply a multiplicity of 3. Correct accounting for charge

multiplicity will help FITPARAM to provide meaningful information about total

charge of the optimized molecule. Non-charge parameters should be assigned

multiplicity 0. This is particularly helpful to when FITPARAM optimizes charges

simultaneously with non-charge parameters.

Top

File format

GUESs unit

Includes NPAR number of lines.

Format: string (up to 20 characters before real number is encountered), real

(recognized by decimal point). Example: "A 17.81671". The string portion of

the data will be printed by FITPARAM as the parameter name. One may use the

string portion to do some useful work for external job, e.g. "set A 17.81671"

would turn the parameter file into a functional CHARMM script file.

EXPD unit

Includes NEXP number of lines.

Format: one real value per string. (Exception is Antoine input file which

contains two real numbers in free format. First value is temperature; the

second one is vapor pressure.) Total dipole moment is defined by one value.

Dipole moment components are specified by three numbers in free format.

Example:

1.63 ! dipoleTotal MP2 value

-1.5155 0.1868 -0.5583 ! dipoleXYZ

Comments can be placed after exclamation mark. This information will be

ignored by FITPARAM.

PARM unit

Includes NPAR number of lines.

Format: A20,F16.8

This is an output file which will be created by FITPARAM.

FVAL unit

Includes NEXP + NEXP * NPAR number of lines.

Format: The format is free. This file is created by external job and read by

FITPARAM. The external job creating this file has to take care of the following

requirements. One real number per string is expected for scalar values. Three

numbers are provided for a vector value. First NEXP data are the function

values which are computed for NEXP target data using the current parameter

values. The order of computed values must be the same as in the EXPD file.

Next NPAR * NEXP data are partial derivatives computed in the order of

experimental data (first running index) and parameter data (second running

index). Example:

E1

E2

E3

dE1/dp1

dE2/dp1

dE3/dp1

dE1/dp2

dE2/dp2

dE3/dp2

Where E1, E2, E3 are target (experimental) data; p1 and p2 are optimized

parameters. Derivatives of vector properties are provided by three real

numbers in a row (see example below).

EXPW unit

Includes NEXP number of lines.

Format: one real number in a row.

MULT unit

Includes NPAR number of lines.

Format: one integer number in a row.

REST unit

Includes NPAR number of lines.

Format: one real number in a row.

File format

GUESs unit

Includes NPAR number of lines.

Format: string (up to 20 characters before real number is encountered), real

(recognized by decimal point). Example: "A 17.81671". The string portion of

the data will be printed by FITPARAM as the parameter name. One may use the

string portion to do some useful work for external job, e.g. "set A 17.81671"

would turn the parameter file into a functional CHARMM script file.

EXPD unit

Includes NEXP number of lines.

Format: one real value per string. (Exception is Antoine input file which

contains two real numbers in free format. First value is temperature; the

second one is vapor pressure.) Total dipole moment is defined by one value.

Dipole moment components are specified by three numbers in free format.

Example:

1.63 ! dipoleTotal MP2 value

-1.5155 0.1868 -0.5583 ! dipoleXYZ

Comments can be placed after exclamation mark. This information will be

ignored by FITPARAM.

PARM unit

Includes NPAR number of lines.

Format: A20,F16.8

This is an output file which will be created by FITPARAM.

FVAL unit

Includes NEXP + NEXP * NPAR number of lines.

Format: The format is free. This file is created by external job and read by

FITPARAM. The external job creating this file has to take care of the following

requirements. One real number per string is expected for scalar values. Three

numbers are provided for a vector value. First NEXP data are the function

values which are computed for NEXP target data using the current parameter

values. The order of computed values must be the same as in the EXPD file.

Next NPAR * NEXP data are partial derivatives computed in the order of

experimental data (first running index) and parameter data (second running

index). Example:

E1

E2

E3

dE1/dp1

dE2/dp1

dE3/dp1

dE1/dp2

dE2/dp2

dE3/dp2

Where E1, E2, E3 are target (experimental) data; p1 and p2 are optimized

parameters. Derivatives of vector properties are provided by three real

numbers in a row (see example below).

EXPW unit

Includes NEXP number of lines.

Format: one real number in a row.

MULT unit

Includes NPAR number of lines.

Format: one integer number in a row.

REST unit

Includes NPAR number of lines.

Format: one real number in a row.

Top

Input Examples

Two examples given in this section illustrate basic functionality of FITPARAM.

First example covers optimization of Antoine function parameters. This is an

example where the function value and gradient computations are implemented

inside the FITPARAM module so there is no need to call an external procedure.

Therefore this is an exception to the standard use of FITPARAM. This example

is discussed here because of its simplicity and because it is included in

The Antoine function is a simple parametric analytical function which is used

to describe experimental vapor pressure - temperature dependence. It has the

following form: lnP = A + [B / (T + C)], where A,B, and C are fitted

parameters, P - pressure, T- temperature.

Having this function gives the opportunity to compute derivative of pressure

over temperature which is necessary to compute the heat of vaporization of the

pure liquid. Following is the content of the antoine.inp script:

open unit 11 read form name antoine.ini ! initial guess for parameters

open unit 12 read form name antoine.exp ! experimental (target data)

open unit 13 write form name antoine.prm ! storage for optimized parameters

FITPARAM -

NITE 50 - ! maximum number of iterations

TOLE 0.001 - ! chi^2 convergence threshold

COUN 2 - ! number of consecutive successful steps before convergence

NEXP 8 - ! number of experimental data

NPAR 3 - ! number of parameters to fit (2 or 3 Antoine coefficients)

ANTOINE - ! Antoine coefficient fitting

GUES 11 - ! input: initial guess for parameters

EXPD 12 - ! input: data to fit to

PARM 13 ! output: file to store optimized parameters

stop

The computation starts from opening two input files antoine.ini and

antoine.exp, which contain initial guess for parameters and target experimental

data, respectively. Following is the content of antoine.ini file:

A 17.81671

B 4705.03330

C -60.75000

Three parameters, NEXP=3, will be optimized starting form the above values.

The antoine.exp file contains the following data:

393.15 3.649359

398.15 3.877432

403.15 4.076690

408.15 4.264087

413.15 4.461877

418.15 4.651099

423.15 4.825109

428.15 5.018603

Here we have 8 experimental data, NEXP=8, representing temperature - vapor

pressure data.

Following parameter values are obtained after the execution of the above

script:

A 17.84653417

B 4706.72855119

C -61.45937115

The second example illustrates how to set up FITPARAM calculation for a charge

parameter optimization calculation. The example is based on hydroxyl charge

optimization in ethanol targeting interactions with water, which is a standard

charge derivation procedure in the additive CHARMM force field. In the example

the alkane charges in methyl group are constrained to their standard values.

The methyl group is also kept electro-neutral. Two lone pairs (OLP) are

assigned to oxygen atom in this example; therefore the oxygen atom is

represented by two point charges (qOLP). Note, the central O atom carries zero

charge. The charges to be determined are qC, two qHC, two qOLP, and qHO.

From these variables one should be excluded. This charge should be assigned

manually to maintain a total charge of zero. In the present example we exclude

the methylene group hydrogen atom charge: qH = -1/2 * (2*qOLP + qC + qHO).

Remaining variables define NPARM=3. The lone pairs are equivalent therefore

they contribute as one parameter qOLP. Correspondingly, the initial guess

parameter file "parameters.ini" contains the following data:

set qOLP -0.23

set qHO 0.36

set qC -0.06

Because the qOLP parameter has the multiplicity of 2 we declare this in the

"multiplicity" file:

group

Here the parameter multiplicity is preceded with the "group" keyword which

indicates the program that an electro-neutral group is being treated. If the

methyl group charges were also included in the fitting, then the

"multiplicity" file would contain the following data:

group

group

Here we optimize the charge on methyl hydrogens (which has multiplicity 3),

whereas the charge on methyl carbon has to be computed externally to keep the

CH3 group electro-neutral.

The charges specified in the "parameters.ini" file are taken from the CHARMM22

force field. To restrain the charges to their initial values we specify

"restraints" file using the following restraint factors:

0.1

0.1

0.1

The smaller the number the weaker the restraint. To choose a particular value

one needs to do some experimenting. Next we define the target data in

"expdata"

file:

-4.89 ! C-O-H angle bisector

-2.51 ! along C-O line

-4.80 ! lone-pair position

-4.36 ! along O-H line

1.8 ! dipoleTotal MP2 = 1.6259

-1.5155 0.1868 -0.5583 ! dipoleXYZ

Presented is ethanol - water QM interaction energies and the ethanol dipole.

First four values in the expdata file are interaction energies for the

corresponding four water orientations. The fifth value is the ethanol dipole

moment set to 1.8 Debye. Note, that the MP2 dipole is 1.63. The dipole moment

is purposefully increased to provide an illustrative example that we are free

to define any target data we want our model to reproduce. While we are

experimenting with the magnitude of the total dipole moment we certainly want

to minimize the angle between the target and empirical dipole moment vectors.

Therefore the sixth line of data is the dipole moment specified in vector form.

Overall, we have six target data, NEXP=6, and one dipole vector, NDIP=1.

Weighting of data is performed via the "expweight" file that instructs

FITPARAM of the level of priority assigned to the target data:

1.

1.

1.

1.

100.

100000.

Each line in this file applies to corresponding target data in the "expdata"

file. The value 1.0 means a standard weight. The larger the value we specify

the stronger FITPARAM will try to match the target data during the parameter

optimization. One needs to make some empirical adjustments in order to define

the optimal value for the each weighting factor.

The FITPARAM script to run the optimization has the following form:

open unit 11 read form name parameters.ini

open unit 12 read form name expdata

open unit 13 write form name parameters

open unit 14 write form name fvalues

open unit 15 read form name expweight

open unit 16 read form name multiplicity

open unit 17 read form name restraints

FITPARAM -

NITE 100 - ! max number of iterations

TOLE 0.0001 - ! convergence tolerance

COUN 4 - ! steps to test the convergence

NEXP 6 - ! number of target data

NDIP 1 - ! number of dipole moments among the target data

NPAR 3 - ! number of parameters to optimize

GUES 11 - ! initial guess for parameters

EXPD 12 - ! target data

PARM 13 - ! place to store optimized parameters between loops

FVAL 14 - ! function values and derivatives

REXT "./run.pl ./parameters ./fvalues" - ! script to compute derivatives

EXPW 15 - ! weight of individual target data, optional

MULT 16 - ! parameter multiplicity, optional

REST 17 ! weight of parameter restraint, optional

stop

After reading initial guesses for parameters specified in the "parameters.ini"

file FITPARAM will write updated parameter values in the "parameters" file

(see unit PARM 13). The "parameters" file can be streamed into a CHARMM script

to be run externally to compute energies and gradients using the current

parameter values. This is done with the help of the REXT keyword (which stands

for Run EXTernal process), which tells FITPARAM how to invoke the external

process to compute the function values and derivatives. In this particular

example we invoke the string "./run.pl ./parameters ./fvalues". This says that

computation will be managed by perl script "run.pl" which we assume will invoke

individual computations. The perl script takes two arguments. The "parameter"

file contains current parameter values computed by FITPARAM. The "fvalues"

file instructs "run.pl" script to save the function values and gradients in

the "fvalues" file, because FITPARAM will be reading it (see FVAL 14 unit)

after "run.pl" script finishes its computation. The perl script or other user

selected external routine has to be written individually for each optimization

job. Due to the nature of individual computations and because of the multitude

of possibilities it is difficult to provide one set of rules on how to do this

in the best way. Preparing such script the user should take care about writing

the "fvalues" file in the right format (see above). The data in this file are

rewritten with each optimization cycle. For the example shown above the

"fvalues" file had the following intermediate values:

-5.34805

-2.72626

-5.07762

-4.62209

1.86600

-1.72300 0.03500 -0.71400

30.85000

27.06500

30.33000

-15.22500

-14.00000

0.89123 -6.56414 -2.47438

5.55000

9.83500

5.61500

-25.45000

-4.00000

0.55868 -4.51683 -1.57192

2.59500

1.12500

2.67500

-1.37500

-3.00000

-0.09208 -0.23845 0.21026

Relating these data to the data in "expdata" file helps to reinforce the

understanding of the format of the "fvalues" file.

Input Examples

Two examples given in this section illustrate basic functionality of FITPARAM.

First example covers optimization of Antoine function parameters. This is an

example where the function value and gradient computations are implemented

inside the FITPARAM module so there is no need to call an external procedure.

Therefore this is an exception to the standard use of FITPARAM. This example

is discussed here because of its simplicity and because it is included in

The Antoine function is a simple parametric analytical function which is used

to describe experimental vapor pressure - temperature dependence. It has the

following form: lnP = A + [B / (T + C)], where A,B, and C are fitted

parameters, P - pressure, T- temperature.

Having this function gives the opportunity to compute derivative of pressure

over temperature which is necessary to compute the heat of vaporization of the

pure liquid. Following is the content of the antoine.inp script:

open unit 11 read form name antoine.ini ! initial guess for parameters

open unit 12 read form name antoine.exp ! experimental (target data)

open unit 13 write form name antoine.prm ! storage for optimized parameters

FITPARAM -

NITE 50 - ! maximum number of iterations

TOLE 0.001 - ! chi^2 convergence threshold

COUN 2 - ! number of consecutive successful steps before convergence

NEXP 8 - ! number of experimental data

NPAR 3 - ! number of parameters to fit (2 or 3 Antoine coefficients)

ANTOINE - ! Antoine coefficient fitting

GUES 11 - ! input: initial guess for parameters

EXPD 12 - ! input: data to fit to

PARM 13 ! output: file to store optimized parameters

stop

The computation starts from opening two input files antoine.ini and

antoine.exp, which contain initial guess for parameters and target experimental

data, respectively. Following is the content of antoine.ini file:

A 17.81671

B 4705.03330

C -60.75000

Three parameters, NEXP=3, will be optimized starting form the above values.

The antoine.exp file contains the following data:

393.15 3.649359

398.15 3.877432

403.15 4.076690

408.15 4.264087

413.15 4.461877

418.15 4.651099

423.15 4.825109

428.15 5.018603

Here we have 8 experimental data, NEXP=8, representing temperature - vapor

pressure data.

Following parameter values are obtained after the execution of the above

script:

A 17.84653417

B 4706.72855119

C -61.45937115

The second example illustrates how to set up FITPARAM calculation for a charge

parameter optimization calculation. The example is based on hydroxyl charge

optimization in ethanol targeting interactions with water, which is a standard

charge derivation procedure in the additive CHARMM force field. In the example

the alkane charges in methyl group are constrained to their standard values.

The methyl group is also kept electro-neutral. Two lone pairs (OLP) are

assigned to oxygen atom in this example; therefore the oxygen atom is

represented by two point charges (qOLP). Note, the central O atom carries zero

charge. The charges to be determined are qC, two qHC, two qOLP, and qHO.

From these variables one should be excluded. This charge should be assigned

manually to maintain a total charge of zero. In the present example we exclude

the methylene group hydrogen atom charge: qH = -1/2 * (2*qOLP + qC + qHO).

Remaining variables define NPARM=3. The lone pairs are equivalent therefore

they contribute as one parameter qOLP. Correspondingly, the initial guess

parameter file "parameters.ini" contains the following data:

set qOLP -0.23

set qHO 0.36

set qC -0.06

Because the qOLP parameter has the multiplicity of 2 we declare this in the

"multiplicity" file:

group

Here the parameter multiplicity is preceded with the "group" keyword which

indicates the program that an electro-neutral group is being treated. If the

methyl group charges were also included in the fitting, then the

"multiplicity" file would contain the following data:

group

group

Here we optimize the charge on methyl hydrogens (which has multiplicity 3),

whereas the charge on methyl carbon has to be computed externally to keep the

CH3 group electro-neutral.

The charges specified in the "parameters.ini" file are taken from the CHARMM22

force field. To restrain the charges to their initial values we specify

"restraints" file using the following restraint factors:

0.1

0.1

0.1

The smaller the number the weaker the restraint. To choose a particular value

one needs to do some experimenting. Next we define the target data in

"expdata"

file:

-4.89 ! C-O-H angle bisector

-2.51 ! along C-O line

-4.80 ! lone-pair position

-4.36 ! along O-H line

1.8 ! dipoleTotal MP2 = 1.6259

-1.5155 0.1868 -0.5583 ! dipoleXYZ

Presented is ethanol - water QM interaction energies and the ethanol dipole.

First four values in the expdata file are interaction energies for the

corresponding four water orientations. The fifth value is the ethanol dipole

moment set to 1.8 Debye. Note, that the MP2 dipole is 1.63. The dipole moment

is purposefully increased to provide an illustrative example that we are free

to define any target data we want our model to reproduce. While we are

experimenting with the magnitude of the total dipole moment we certainly want

to minimize the angle between the target and empirical dipole moment vectors.

Therefore the sixth line of data is the dipole moment specified in vector form.

Overall, we have six target data, NEXP=6, and one dipole vector, NDIP=1.

Weighting of data is performed via the "expweight" file that instructs

FITPARAM of the level of priority assigned to the target data:

1.

1.

1.

1.

100.

100000.

Each line in this file applies to corresponding target data in the "expdata"

file. The value 1.0 means a standard weight. The larger the value we specify

the stronger FITPARAM will try to match the target data during the parameter

optimization. One needs to make some empirical adjustments in order to define

the optimal value for the each weighting factor.

The FITPARAM script to run the optimization has the following form:

open unit 11 read form name parameters.ini

open unit 12 read form name expdata

open unit 13 write form name parameters

open unit 14 write form name fvalues

open unit 15 read form name expweight

open unit 16 read form name multiplicity

open unit 17 read form name restraints

FITPARAM -

NITE 100 - ! max number of iterations

TOLE 0.0001 - ! convergence tolerance

COUN 4 - ! steps to test the convergence

NEXP 6 - ! number of target data

NDIP 1 - ! number of dipole moments among the target data

NPAR 3 - ! number of parameters to optimize

GUES 11 - ! initial guess for parameters

EXPD 12 - ! target data

PARM 13 - ! place to store optimized parameters between loops

FVAL 14 - ! function values and derivatives

REXT "./run.pl ./parameters ./fvalues" - ! script to compute derivatives

EXPW 15 - ! weight of individual target data, optional

MULT 16 - ! parameter multiplicity, optional

REST 17 ! weight of parameter restraint, optional

stop

After reading initial guesses for parameters specified in the "parameters.ini"

file FITPARAM will write updated parameter values in the "parameters" file

(see unit PARM 13). The "parameters" file can be streamed into a CHARMM script

to be run externally to compute energies and gradients using the current

parameter values. This is done with the help of the REXT keyword (which stands

for Run EXTernal process), which tells FITPARAM how to invoke the external

process to compute the function values and derivatives. In this particular

example we invoke the string "./run.pl ./parameters ./fvalues". This says that

computation will be managed by perl script "run.pl" which we assume will invoke

individual computations. The perl script takes two arguments. The "parameter"

file contains current parameter values computed by FITPARAM. The "fvalues"

file instructs "run.pl" script to save the function values and gradients in

the "fvalues" file, because FITPARAM will be reading it (see FVAL 14 unit)

after "run.pl" script finishes its computation. The perl script or other user

selected external routine has to be written individually for each optimization

job. Due to the nature of individual computations and because of the multitude

of possibilities it is difficult to provide one set of rules on how to do this

in the best way. Preparing such script the user should take care about writing

the "fvalues" file in the right format (see above). The data in this file are

rewritten with each optimization cycle. For the example shown above the

"fvalues" file had the following intermediate values:

-5.34805

-2.72626

-5.07762

-4.62209

1.86600

-1.72300 0.03500 -0.71400

30.85000

27.06500

30.33000

-15.22500

-14.00000

0.89123 -6.56414 -2.47438

5.55000

9.83500

5.61500

-25.45000

-4.00000

0.55868 -4.51683 -1.57192

2.59500

1.12500

2.67500

-1.37500

-3.00000

-0.09208 -0.23845 0.21026

Relating these data to the data in "expdata" file helps to reinforce the

understanding of the format of the "fvalues" file.

Top

Known limitations

The present implementation does not preserve the total charge. Therefore, the

user is responsible to manage this problem in designing the external procedure.

Implementing the charge constraint mechanism is in future development plans.

Known limitations

The present implementation does not preserve the total charge. Therefore, the

user is responsible to manage this problem in designing the external procedure.

Implementing the charge constraint mechanism is in future development plans.