BatchKi Reference Manual

Run Summary Files

BatchKi generates two different output files that are newly created or overwritten at each program run:

  • file ./output/index.html
  • file ./output/index.xls

Both of these files are simple ASCII text files. The file extension *.XLS was chosen merely for convenience of users who have installed Microsoft Excel with the default file extension enabled.


HTML Output File

The main HTML output file, which summarizes each program run, is located in the directory ./output and is named index.html. An example of a run summary after processing four plate-reader data sets is shown in Figure 6.1.

Figure 6.1: Run summary page.

Each processed plate-reader data file (corresponding to a row in the run summary table) is represented as a hyperlink showing the plate I.D. The run summary page contains a color-coded table of negative logarithms of the apparent inhibition constant ( ${\rm p}K_i^{\rm app}$). The most potent inhibitors ( ${\rm p}K_i^{\rm app} \approx 9$) are shown in red, the least potent inhibitors ( ${\rm p}K_i^{\rm app} \approx 3$) are shown in green, and the intermediate inhibitors ( ${\rm p}K_i^{\rm app} \approx 6$) are shown in blue. The color scheme for the display of inhibition constants is summarized in Figure 6.2.

Figure 6.2: Color scheme for apparent inhibition constants on the run summary page.
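
The quantity displayed in the run summary table can be reproduced from an apparent inhibition constant by taking the negative decadic logarithm. The following Python sketch is illustrative only; the function name pki_app is hypothetical and is not part of BatchKi.

```python
import math

def pki_app(ki_app_molar):
    """Negative decadic logarithm of an apparent inhibition constant given in mol/L."""
    if ki_app_molar <= 0:
        raise ValueError("Ki(app) must be positive")
    return -math.log10(ki_app_molar)

# A 1 nM inhibitor lies near the potent end of the scale described above
print(round(pki_app(1e-9), 2))
```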

The abbreviation rsdf in the summary table heading stands for relative standard deviation of fit, defined by equation 6.8 below. The rsdf column contains the lowest and the highest value of the relative standard deviation of fit found on the given plate. If the highest value, listed in the max column, is larger than approximately 10, at least one inhibitor on the given plate is associated with a suspect value of the apparent inhibition constant. The cause of suspicion is usually an outlying data point, or a systematic deviation from the fitting model (equation 4.3 or 4.4).


ASCII Delimited (Excel) File

The main tab-delimited spreadsheet file generated in each program run is named index.xls and is also located in the main output directory ./output. If the Microsoft Excel software is installed on the user's computer, clicking on the Excel Window link (red rectangle in Figure 6.1) will open the file ./output/index.xls as an Excel spreadsheet (see Figure 6.3).

Figure 6.3: Run summary spreadsheet.

The spreadsheet summary file contains as many rows as there were unique dose-response curves on all the analyzed plates. Apparent inhibition constants $K_i^{\rm app}$ are listed in the column labeled kiapp. Depending on the regression model for reaction rates, the $K_i^{\rm app}$ values are obtained either directly, as an optimized parameter in equation 4.3, or indirectly, as the difference $K_i^{\rm app} = IC_{50} - [E]_0/2$. The latter scenario is applicable when the regression model for dose-response curves was the four-parameter logistic equation 4.4.

NOTE Most typically, the identity and the concentration of both the substrate and the enzyme are the same throughout the plate. In this case the number of rows is the same as the number of different inhibitors on the given plate. However, BatchKi can accommodate any number of different enzymes (substrates) and any number of different enzyme (substrate) concentrations. In that case the number of rows in the generated spreadsheet file is larger than the number of inhibitors.


Table 6.1: Field labels in the tab-delimited output file (Part 1).
Label     Description
platid    Plate ID
enz       Enzyme ID
econc     Enzyme concentration (M)
sub       Substrate ID
sconc     Substrate concentration (M)
inh       Inhibitor ID



Table 6.2: Field labels in the tab-delimited output file (Part 2).
Label     Description
kiapp     Apparent inhibition constant (M)
sterr     Formal standard error of apparent inhibition constant (M)
cvar      Coefficient of variation (%)
cnflos    Linear approximation of the lower limit for the confidence interval
cnfhis    Linear approximation of the upper limit for the confidence interval
cnflo     Lower limit of the confidence interval
cnfhi     Upper limit of the confidence interval
prblo     Probability level reached at the lower limit
prbhi     Probability level reached at the upper limit
prlev     Desired probability level
nfree     Degrees of freedom
sdfit     Standard deviation of fit
persd     Percentage of sdfit relative to control velocity
eo        Fitted enzyme concentration
eoer      Standard error of enzyme concentration
vo        Fitted control velocity
voer      Standard error of control velocity
vb        Baseline velocity
vber      Standard error of baseline velocity
r2adj     Adjusted regression coefficient


The first row in the spreadsheet contains the field labels listed in Tables 6.1 and 6.2. Detailed explanation of all quantities contained in the tab-delimited output file is given below.
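
Because the file is plain tab-delimited text with the field labels in the first row, it can be read without Excel. The following Python sketch shows one way to do so; the function name read_summary and the abbreviated sample rows are hypothetical, and a real file would carry all the columns listed in Tables 6.1 and 6.2.

```python
import csv
import io

def read_summary(text):
    """Parse tab-delimited text whose first row holds the field labels."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)

# Hypothetical two-row sample using only a few of the documented field labels
sample = (
    "platid\tinh\tkiapp\tpersd\n"
    "P001\tCMPD-1\t1.5e-9\t3.2\n"
    "P001\tCMPD-2\t4.0e-7\t12.8\n"
)
rows = read_summary(sample)
for row in rows:
    # All values arrive as strings; convert numeric fields explicitly
    print(row["platid"], row["inh"], float(row["kiapp"]))
```

To process an actual run, the text passed to read_summary would be the contents of ./output/index.xls, for example as returned by open("./output/index.xls", newline="").read().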


platid - Plate ID

Unique plate I.D. as entered in the input file in the <Plate id"..." ...> parameter.


enz - Enzyme ID

Unique textual identifier for the enzyme (e.g., HIV-PROT, THROMBIN, etc.).


econc - Enzyme concentration (M)

Enzyme concentration in moles per liter, characteristic for the given dose-response curve.


sub - Substrate ID

Unique identifier for the substrate, characteristic for the given dose-response curve (e.g., BACHEM-B1234-LOT#5678).


sconc - Substrate concentration (M)

Substrate concentration in moles per liter, characteristic for the given dose-response curve.


inh - Inhibitor ID

Unique identifier for the inhibitor, for example, a corporate I.D.


kiapp - Apparent inhibition constant (M)

Apparent inhibition constant $K_i^{\rm app}$ obtained by the least-squares fit of the dose-response data to equation 4.3 explained in section 4.5.11. In the case of a competitive enzyme inhibitor, the apparent inhibition constant $K_i^{\rm app}$ relates to the true inhibition constant $K_i$ via equation 6.1, where $[S]$ is the substrate concentration and $K_m$ is the Michaelis constant. For other types of enzyme inhibition (e.g., uncompetitive or mixed-type) see ref. [9].


\begin{displaymath}
K_i = K_i^{\rm app} / (1 + [S] / K_m)
\end{displaymath} (6.1)
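
As a worked illustration of equation 6.1, the following Python sketch converts an apparent inhibition constant into a true inhibition constant under the assumption of competitive inhibition. The function name and the numerical values are hypothetical; BatchKi itself reports only $K_i^{\rm app}$.

```python
def true_ki_competitive(ki_app, s_conc, km):
    """Equation 6.1: Ki = Ki(app) / (1 + [S]/Km); all inputs in the same concentration units."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return ki_app / (1.0 + s_conc / km)

# Hypothetical values: Ki(app) = 10 nM, [S] = 20 uM, Km = 5 uM
# [S]/Km = 4, so Ki(app) is divided by 5
ki = true_ki_competitive(10e-9, 20e-6, 5e-6)
print(ki)
```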

sterr - Formal standard error of apparent inhibition constant (M)

The formal standard error of the apparent inhibition constant is computed by using equation 6.2,


\begin{displaymath}
s_K = \sqrt {A_{ii}^{-1} S_{\rm min} / (n - m)}
\end{displaymath} (6.2)

where $A_{ii}^{-1}$ is the $i$th diagonal element of the inverse Hessian matrix ([12], p. 22; [13], p. 685, eq. 15.5.15); $S_{\rm min}$ is the sum of squared deviations between the fitting model (equation 4.3) and the experimental data; $n$ is the number of data points in the dose-response curve; and $m$ is the number of optimized parameters (typically two parameters, namely, the apparent inhibition constant $K_i^{\rm app}$ and the control velocity $v_0$). The difference $(n - m)$ represents the number of degrees of freedom (see below).
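
The arithmetic of equation 6.2 can be sketched in Python as follows. This is a minimal illustration, not BatchKi's internal code: the function name is hypothetical, the inverse-Hessian diagonal element is assumed to be supplied by the fitting routine, and the numbers are invented.

```python
import math

def formal_std_error(inv_hessian_diag, s_min, n, m):
    """Equation 6.2: s_K = sqrt(A_ii^-1 * S_min / (n - m))."""
    dof = n - m  # degrees of freedom
    if dof <= 0:
        raise ValueError("need more data points than optimized parameters")
    return math.sqrt(inv_hessian_diag * s_min / dof)

# Hypothetical inputs: nine data points, two optimized parameters
s_k = formal_std_error(inv_hessian_diag=4.0e-18, s_min=7.0, n=9, m=2)
print(s_k)
```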


cvar - Coefficient of variation (%)

The coefficient of variation (CV) for the apparent inhibition constant is defined in terms of the formal standard error as is shown in equation 6.3.


\begin{displaymath}
CV = 100 \times s_K / K_i^{\rm app}
\end{displaymath} (6.3)


cnflos - Linear approximation of the lower limit for the confidence interval

The formal standard error of the apparent inhibition constant, defined by equation 6.2, provides a simple linear approximation of the confidence interval at the given probability level (see parameter prlev below). In particular, the $1-\alpha$ confidence interval is defined in terms of the standard error $s_K$ as shown in equation 6.4,


\begin{displaymath}
K^{\rm app}_{\pm} = K_i^{\rm app} \pm s_K \times t (n - m, \alpha / 2)
\end{displaymath} (6.4)

where $t(n-m, \alpha/2)$ is the Student $t$-statistic at $\alpha/2$ probability level with $n-m$ degrees of freedom ([3], p. 6, eq. 1.12). For example, for nine data points $(n = 9)$, two optimized parameters $(m = 2)$, and the desired 95% probability level, the relevant Student $t$-statistic is $t(7, 0.025) = 2.36$.

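When discussing linear confidence intervals, many elementary textbooks on data analysis confusingly (and incorrectly) refer to a "two-tailed" $t$-statistic, where the probability level is $\alpha = 0.05$ for the 95% confidence interval. Tables organized in that way list under $\alpha = 0.05$ the same critical value that is denoted here as $t(n-m, \alpha/2)$ with $\alpha/2 = 0.025$.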

Thus the approximate lower limit for the apparent inhibition constant is defined by equation 6.5.


\begin{displaymath}
K^{\rm app}_{-} = K_i^{\rm app} - s_K \times t (n - m, \alpha / 2)
\end{displaymath} (6.5)

cnfhis - Linear approximation of the upper limit for the confidence interval

The upper limit of the approximate confidence interval at the given probability level is defined by equation 6.6. The symbols have the same meaning as in the preceding section on cnflos.


\begin{displaymath}
K^{\rm app}_{+} = K_i^{\rm app} + s_K \times t (n - m, \alpha / 2)
\end{displaymath} (6.6)
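
Equations 6.3, 6.5, and 6.6 amount to a few lines of arithmetic once $s_K$ and the Student $t$-statistic are known. The Python sketch below is illustrative only; the function names and input values are hypothetical, and the critical value is passed in by the caller (for seven degrees of freedom at the 95% level, standard tables give approximately 2.365).

```python
def coefficient_of_variation(ki_app, s_k):
    """Equation 6.3: CV = 100 * s_K / Ki(app), in percent."""
    return 100.0 * s_k / ki_app

def linear_confidence_limits(ki_app, s_k, t_crit):
    """Equations 6.5 and 6.6: Ki(app) -/+ s_K * t(n - m, alpha/2)."""
    half_width = s_k * t_crit
    return ki_app - half_width, ki_app + half_width

# Hypothetical values: Ki(app) = 10 nM, s_K = 1 nM, nine points, two parameters
cv = coefficient_of_variation(10e-9, 1e-9)
lower, upper = linear_confidence_limits(10e-9, 1e-9, t_crit=2.365)
print(cv, lower, upper)
```

Because the interval is symmetric, the lower limit computed in this way becomes negative whenever $s_K \times t$ exceeds $K_i^{\rm app}$.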


cnflo - Lower limit of the confidence interval

A significantly better estimate of the confidence interval for the apparent inhibition constant can be obtained by using an exhaustive search of the least-squares surface in the parameter space. BatchKi uses a modification of the profile-t search method described by Bates and Watts ([3], pp. 302-303). Parameter cnflo is the lower limit of the confidence interval for $K_i^{\rm app}$ at the given probability level, obtained by the profile-t method [3].


cnfhi - Upper limit of the confidence interval

Upper limit of the confidence interval for $K_i^{\rm app}$ at the given probability level, obtained by the profile-t method [3].


prblo - Probability level reached at the lower limit

Under normal circumstances, i.e., when the confidence interval for the apparent inhibition constant can be easily determined from the data, the probability level reached at the end of the profile-t search algorithm ([3], p. 302) will be identical to the desired probability level (e.g., 95%). In this case, the prblo parameter will be numerically equal to prlev (see below).

When the lower limit of the confidence interval for $K_i^{\rm app}$ cannot be determined from the available data, the prblo parameter will not be numerically equal to prlev (e.g., 95.0%), but instead it will be somewhat lower. If this condition is diagnosed in the output file, the lower limit of the confidence interval (cnflo) should be ignored as not reliable.

prbhi - Probability level reached at the upper limit

See the explanation of prblo in the preceding paragraph.
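
The diagnostic described for prblo and prbhi can be automated when post-processing the tab-delimited file. The following Python sketch is one possible check; the function name, the dictionary-based row format, and the tolerance of 0.05 percentage points are all assumptions made for illustration.

```python
def unreliable_limits(row, tol=0.05):
    """Return the names of confidence limits whose reached probability level is below prlev.

    `row` maps field labels to values (strings or numbers); `tol` is an assumed
    tolerance, in percentage points, for rounding in the output file.
    """
    prlev = float(row["prlev"])
    flagged = []
    if float(row["prblo"]) < prlev - tol:
        flagged.append("cnflo")
    if float(row["prbhi"]) < prlev - tol:
        flagged.append("cnfhi")
    return flagged

# Hypothetical row: the lower limit could not be determined at the desired level
row = {"prlev": "95.0", "prblo": "81.3", "prbhi": "95.0"}
print(unreliable_limits(row))
```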


prlev - Desired probability level

This is the desired probability level (in percentage points) for the computation of confidence intervals in BatchKi. A typical value is 95%. The actual value of the desired probability level is defined in the BatchKi initialization file (see section 4.1.3).


nfree - Degrees of freedom

Defined as the difference $n-m$, where $n$ is the number of data points in the dose-response curve and $m$ is the number of optimized parameters (typically two parameters).


sdfit - Standard deviation of fit

The standard deviation of fit is defined by equation 6.7,


\begin{displaymath}
s = \sqrt {S_{\rm min} / (n - m)}
\end{displaymath} (6.7)

where $S_{\rm min}$ is the sum of squared deviations between the fitting model and the experimental reaction velocities, $n$ is the number of data points on each dose-response curve and $m$ is the number of optimized parameters.


persd - Percentage of sdfit relative to control velocity

This parameter is defined as a percentage of standard deviation of fit relative to the best-fit value of the control velocity $v_0$, as defined by equation 6.8. It measures the overall goodness of fit of initial velocities.

Values of the relative standard deviation of fit below five percent ($s_{rel} < 5$) indicate good agreement between the fitting model and the experimental data. In contrast, values higher than ten percent ($s_{rel} > 10$) indicate that the experimental data either contain a severely outlying data point, or deviate systematically from the assumed mathematical model. In either case, values of the apparent inhibition constant associated with $s_{rel} > 10$ should be regarded as suspect.


\begin{displaymath}
s_{rel} = 100 \times \sqrt {S_{\rm min} / (n - m)} / v_0
\end{displaymath} (6.8)
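
Equations 6.7 and 6.8, together with the thresholds quoted above, can be sketched in Python as follows. The function names and input values are hypothetical. The label "borderline" for values between 5 and 10 is an assumption, since the text characterizes only values below five and above ten.

```python
import math

def sd_fit(s_min, n, m):
    """Equation 6.7: s = sqrt(S_min / (n - m))."""
    return math.sqrt(s_min / (n - m))

def rel_sd_fit(s_min, n, m, v0):
    """Equation 6.8: s_rel = 100 * sqrt(S_min / (n - m)) / v0, in percent."""
    return 100.0 * sd_fit(s_min, n, m) / v0

def classify_fit(s_rel):
    """Apply the thresholds from the text; the middle label is an assumption."""
    if s_rel < 5:
        return "good"
    if s_rel > 10:
        return "suspect"
    return "borderline"

# Hypothetical curve: S_min = 28, nine points, two parameters, v0 = 50
s_rel = rel_sd_fit(28.0, 9, 2, 50.0)
print(s_rel, classify_fit(s_rel))
```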


eo - Fitted enzyme concentration

Either the best-fit value of the enzyme concentration (if this value was treated as an optimized parameter), or the nominal value (if $[E]$ was treated as a constant). The units are moles per liter.


eoer - Standard error of enzyme concentration

If the enzyme concentration $[E]$ was treated as an optimized parameter, this field contains its formal standard error. The field is set to zero if $[E]$ was treated as a constant.


vo - Fitted control velocity

The best-fit value of the reaction velocity observed in the absence of inhibitors. This value generally will be different for different dose-response curves, even though the experimental data (control wells) are the same. This is because the value is obtained by fitting each dose-response curve separately to equation 4.3.


voer - Standard error of control velocity

Formal standard error associated with the best-fit value of the reaction velocity observed in the absence of inhibitors (see preceding paragraph).


vb - Baseline velocity

Either the best-fit value of the baseline velocity $v_b$ in equation 4.3 (if this value was treated as an optimized parameter), or zero (if $v_b$ was treated as a constant).


vber - Standard error of baseline velocity

If the baseline velocity $v_b$ was treated as an optimized parameter, this field contains its formal standard error. The field is set to zero if $v_b$ was treated as a constant.


r2adj - Adjusted regression coefficient

The adjusted regression coefficient $R^2_{adj}$ (also known as the adjusted coefficient of determination) is computed from equation 6.10, where $n$ is the number of data points, $m$ is the number of optimized model parameters, and $R^2$ is the coefficient of determination defined by equation 6.9. In equation 6.9, $y_i$ is the experimental value of reaction velocity for the $i$th data point, $\hat{y}_i$ is the best-fit value of reaction velocity computed according to the appropriate regression model (equation 4.3 or 4.4), and $\bar{y}$ is the average of all $y_i$ values.


\begin{displaymath}
R^2 = 1 - \frac{\sum_{i = 1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i = 1}^{n} \left( y_i - \bar{y} \right)^2}
\end{displaymath} (6.9)

\begin{displaymath}
R^2_{adj} = 1 - (1 - R^2) \frac{n - 1}{n - m}
\end{displaymath} (6.10)
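
Equations 6.9 and 6.10 can be checked by hand with a short Python sketch such as the one below. The function names and the five-point data set are hypothetical and serve only to illustrate the arithmetic.

```python
def r_squared(y, y_hat):
    """Equation 6.9: 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, m):
    """Equation 6.10: 1 - (1 - R^2) * (n - 1) / (n - m)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m)

# Hypothetical velocities: five data points, two optimized parameters
y = [10.0, 8.0, 6.0, 4.0, 2.0]       # experimental values
y_hat = [9.0, 8.0, 6.0, 4.0, 3.0]    # best-fit values
r2 = r_squared(y, y_hat)
r2_adj = adjusted_r_squared(r2, n=len(y), m=2)
print(r2, r2_adj)
```

In this example the residual sum of squares is 2 and the total sum of squares is 40, so $R^2 = 0.95$, and the adjustment lowers the value to approximately 0.933.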


biokin.com/batchki/manual/reference/html/node29.html
Petr Kuzmic | Jul 12 2008