
Catalog of the Effects of Nonlinearity on Multivariate Calibrations

HOWARD MARK
Technicon Industrial Systems, 511 Benedict Ave., Tarrytown, New York 10591

Received 28 December 1987.

A catalog of the visible effects on the data is presented, for the case when curvature (nonlinearity) of the independent variable, the dependent variable, or both exists, and when univariate, multivariate, or principal component methods are used for calibration.

Index Headings: Computer applications; Reflectance spectroscopy; Calibration methods; Chemometrics; Near-infrared.

INTRODUCTION

The current state of chemometrics is that of a burgeoning field of study. Comparison of the periodic reviews of this discipline shows an increase in all aspects,1,2 but of particular interest to us is the increase in the area of calibration techniques. While a majority of multivariate calibrations are performed with the use of the linear multiple regression method [or the method of least squares, sometimes called the K-matrix or P-matrix method in spectroscopic applications (depending upon whether the optical data or constituent values are treated as the dependent variable)], other, more exotic and sophisticated methods, such as Principal Component Analysis,3,4 are becoming more available and widespread.

These calibration methods are based on results obtained from the science of statistics and are subject to having to satisfy several assumptions which are well known to statisticians; but the need to satisfy these assumptions, and the effect of failing to satisfy them, seems to be largely ignored by chemists/chemometricians.

Of the four key assumptions that underlie all calibration methods that use the least-squares approach either explicitly or implicitly (and that covers just about everything that uses computerized calibration), the one that is always violated is the requirement that there is to be no error in the independent (X) variables. The violation of that fundamental assumption is inevitable, because of the fact that measurements are used to obtain the data used for both the dependent and independent variables. The closest that anyone comes to meeting this requirement is when synthetic samples are prepared gravimetrically, and the Beer's law formulation is used. However, in many important cases, that procedure cannot be used. For example, analysis of natural products precludes use of synthetic samples, and lack of complete knowledge of all the possible components in a complex mixture dictates the use of the inverse Beer's law approach.

Thus the study of the effect of errors in the independent variables is important in understanding the behavior of calibration algorithms. In the general case, errors in the independent variables are compensated for by the least-squares algorithm, at least insofar as it is possible for the algorithm to do such compensation without losing all predictive capability [e.g., setting all the calibration coefficients equal to zero would certainly make the result immune to all possible errors of the data(!)]. In real calibrations, however, such compensation cannot be perfect, and errors in the data translate into errors in the constituent value generated by applying a given calibration equation to the data containing these errors. When these errors are random, it is possible to measure them by using a suitable experimental design to collect the data,4 and some studies of the effects of random errors have been performed.4-6

However, the prohibition against errors of the independent variables includes systematic errors as well as random errors. One of the systematic errors of most concern in the use of least-squares algorithms as an instrument calibration technique is what would be classed as "nonlinearities" of the measurement. In this study we use synthetic data, with various types of artificially induced nonlinearities, to study the effect of such nonlinearities on calibration techniques.

THEORY

The fundamentals of calibration techniques, both "simple" ones such as multiple regression7,8 and more sophisticated ones such as principal component regression,3,7 have been previously discussed in the literature, and there would be little point to repeating that material here. What is important to bring up are the parts that are rarely treated in the chemical/spectroscopic literature.

Another fundamental assumption that least-squares calibration algorithms are based on is the requirement that there must be no nonlinearities in the model. Thus the linear model:

$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n$   (1)

often written in matrix notation:

$\hat{Y} = [b][X]^T$   (2)

(where $[X]^T$ is the transpose of the row vector $[X]$) is normally assumed to be correct, and the physical theory of spectroscopic analysis, based on Beer's law, confirms that this model should be the correct one to use.
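As an illustration only, fitting the model of Eqs. 1 and 2 by least squares can be sketched as follows; this is not the author's original code, and numpy and the function names are assumptions.

```python
import numpy as np

# Minimal sketch of fitting the linear model of Eqs. 1 and 2 by least
# squares: X holds the independent (optical) variables, one column per
# variable, and y holds the dependent (constituent) values.
# The names are illustrative, not taken from the paper.
def fit_linear_model(X, y):
    A = np.column_stack([np.ones(len(y)), X])    # prepend a column for b0
    b, *_ = np.linalg.lstsq(A, y, rcond=None)    # b = [b0, b1, ..., bn]
    return b

def predict(b, X):
    return b[0] + X @ b[1:]                      # Y-hat = b0 + [b][X]^T
```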

Even when more sophisticated techniques, such as principal component calibration, are used, an equivalent assumption is made.

Linearity of the model, however, does not preclude the possibility of nonlinearities in the data, but these nonlinearities are acceptable only as long as they do not give rise to errors in the independent variables. It has been


TABLE I. Partial list of synthetic data used for all calculations.

X1  X2  X3          X4              X5             X6               Y1         Y2
        (0.1·X1²)   (X1 + 0.1·X1²)  (X1 + 0.1·X2)  (X1 + 0.05·X1²)  (X1 + X2)  (X1 + X2 + 0.1·X1²)

 1   1   0.1   1.1   1.1   1.05    2    2.1
 1   2   0.1   1.1   1.2   1.05    3    3.1
 1   3   0.1   1.1   1.3   1.05    4    4.1
 1   4   0.1   1.1   1.4   1.05    5    5.1
 1   5   0.1   1.1   1.5   1.05    6    6.1
 1   6   0.1   1.1   1.6   1.05    7    7.1
 1   7   0.1   1.1   1.7   1.05    8    8.1
 1   8   0.1   1.1   1.8   1.05    9    9.1
 1   9   0.1   1.1   1.9   1.05   10   10.1
 1  10   0.1   1.1   2.0   1.05   11   11.1
 2   1   0.4   2.4   2.1   2.2     3    3.4
 2   2   0.4   2.4   2.2   2.2     4    4.4
 2   3   0.4   2.4   2.3   2.2     5    5.4
 2   4   0.4   2.4   2.4   2.2     6    6.4
 2   5   0.4   2.4   2.5   2.2     7    7.4
 ...
 9   9   8.1  17.1   9.9  13.05   18   26.1
 9  10   8.1  17.1  10.0  13.05   19   27.1
10   1  10    20    10.1  15      11   21
10   2  10    20    10.2  15      12   22
10   3  10    20    10.3  15      13   23
10   4  10    20    10.4  15      14   24
10   5  10    20    10.5  15      15   25
10   6  10    20    10.6  15      16   26
10   7  10    20    10.7  15      17   27
10   8  10    20    10.8  15      18   28
10   9  10    20    10.9  15      19   29
10  10  10    20    11    15      20   30

pointed out9 that when multivariate mathematics is used, nonlinearities in different independent variables can sometimes compensate for, and thus cancel out the effect of, each other, so that the resulting calibration gives a net linear response.

The problem then becomes that of evaluating ways to detect nonlinearities, so that estimates can be made of the degree to which the calibration is indeed compensating and whether the residual nonlinearity is large enough to be troublesome. This will allow the user of the multivariate techniques to determine when to stop trying to improve the fit of the data and look toward reducing other sources of error of the analysis.

In order for one to be able to detect nonlinearities in the face of noise or other random variations of the data, the statisticians' recommendations are to make plots of the data.7,8 In practice, of course, only two variables can be plotted at a time, no matter how many variables are involved in the analytical method. The most common recommendation is to plot the residuals of the calibration (i.e., the differences between the values arrived at by

applying the calibration equation to the multivariate data and the reference values to which the instrument is being calibrated). While chemists usually like to plot the values of the analytical method under test against the reference (or "standard" or "known") values, this plotting technique is less sensitive to detection and evaluation of small errors; plotting residuals is equivalent to using a magnifying glass to examine the data, allowing small deviations from straight-line behavior to be seen.

The usual recommendation is to plot the residuals against the predicted analyte value, but other recommendations include plotting the residuals against the actual value, against various independent variables, and, indeed, against any other variable that may prove useful and informative (e.g., plotting the residuals against the order of reading the samples will provide information about possible drift of the readings).
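A minimal sketch of these diagnostic plots is given below; matplotlib is assumed, and the function name and choice of panels are illustrative rather than prescribed by the paper.

```python
import matplotlib.pyplot as plt

# Sketch of the recommended diagnostic plots: residuals against the
# predicted value, the actual (reference) value, the order of reading,
# and each independent variable.
def residual_plots(X, y_actual, y_pred):
    resid = y_actual - y_pred
    panels = [("PREDICTED Y", y_pred), ("ACTUAL Y", y_actual),
              ("run order", list(range(len(resid))))]
    panels += [("X%d" % (i + 1), X[:, i]) for i in range(X.shape[1])]
    fig, axes = plt.subplots(1, len(panels), figsize=(3 * len(panels), 3))
    for ax, (label, x) in zip(axes, panels):
        ax.scatter(x, resid, s=8)
        ax.axhline(0.0, linewidth=0.5)
        ax.set_xlabel(label)
        ax.set_ylabel("RESIDUAL")
    fig.tight_layout()
    plt.show()
```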

However, while plotting residuals does indeed give more sensitive testing of possible patterns in the data, it does have drawbacks of its own, particularly the fact that the residuals contain only part of the information in the data set. Also, in some cases the sensitivity obtained with the use of the residual may be too great, showing phenomena that are at such a low level that they are of no practical importance. Thus, plotting of only residuals does not give the complete picture, and a thorough examination of the calibration data and results will include a variety of plots, such as actual vs. predicted (by the instrument calibration) values, or either of these values (and/or the residuals) against one or more of the independent variables, etc.

Thus, in this paper, we examine the effects to be seen in several of these plots when nonlinearities of known types are introduced into the data. This will allow identification of the cause when any given effect shows up in real data in the future.

We also will examine the effect of having nonlinearities superimposed on data that is then subjected to principal component analysis. The effect is to create new components that represent the nonlinearity; but what do these components look like, and what is their effect on the principal component calibration equation produced? These are the questions we wish to examine.

EXPERIMENTAL

To examine the questions posed in the previous section, we created synthetic data sets representing various types of spectral data, then applied a predetermined calibration to the synthetic data to create an "analyte" whose values represented an exact fit to the equation. This process resulted in a table of numbers representing

TABLE II. Correlation coefficients between each pair of variables used in the regressions and residual plots.

        X1      X2      X3      X4      X5      X6      Y1      Y2
X1   1.0000   .0000   .9746   .9928   .9950   .9967   .7071   .8976
X2    .0000  1.0000   .0000   .0000   .0995   .0000   .7071   .4274
X3    .9746   .0000  1.0000   .9944   .9697   .9896   .6891   .8990
X4    .9928   .0000   .9944  1.0000   .9879   .9993   .7020   .9041
X5    .9950   .0995   .9697   .9879  1.0000   .9917   .7740   .9356
X6    .9967   .0000   .9896   .9993   .9917  1.0000   .7048   .9034
Y1    .7071   .7071   .6891   .7020   .7740   .7048  1.0000   .9369
Y2    .8976   .4274   .8990   .9041   .9356   .9034   .9369  1.0000


FIG. 1. Comparison of amount of nonlinearity induced into data with a straight line.

spectra, which could be modified in known ways to simulate the effect of interest. We introduced nonlinearity into the data by adding values proportional to the desired power of the table entries; thus the type and magnitude of the nonlinearity could be controlled.

All computations were performed on an IBM PC-AT personal computer. The data were generated with the use of the APL language (IBM version 1.00 APL), and the multiple regression and principal component calculations were done with the use of the Technicon IDAS program package.

Two data sets were generated. The first set, intended to represent data of the type used for multiple regression calibrations, consisted of 100 synthetically created "readings" of values representing several independent variables and two dependent variables. A partial listing of this synthetic data is presented in Table I, which was set up as follows:

FIG. 2. Plots resulting from use of calibration equation Y = X1 + 5.5, where actual Y = X1 + X2.


FIG. 3. Plots resulting from use of calibration equation Y = 2.1·X1 + 3.3, where actual Y = X1 + X2 + 0.1·X1².

The first independent variable, X1, had the value of unity for the first ten readings, the value of two for the next ten, and so forth, up to the value of ten for the last ten readings.

The second independent variable, X2, consisted of the values 1, 2, ..., 10 repeated ten times. Thus the two variables each evenly covered the range 1-10, and were perfectly uncorrelated.

For the third independent variable, X3 = 0.1·X1² (this represents "pure nonlinearity").

For the fourth independent variable, X4 = X1 + 0.1·X1².

For the fifth independent variable, X5 = X1 + 0.1·X2.

For the sixth independent variable, X6 = X1 + 0.05·X1².

FIG. 4. Plots resulting from use of calibration equation Y = 0.469·X4 + 6.61, where actual Y = X1 + X2.


FIG. 5. Plots resulting from use of calibration equation Y = X4 + 5.5, where actual Y = X1 + X2 + 0.1·X1².

For the first dependent variable, Y1 = X1 + X2. For the second dependent variable, Y2 = X1 + X2 + 0.1·X1².

This set of variables constituted a data set that contained both linear and nonlinear relationships between the independent and dependent variables, suitable for examining the effects seen under these varying conditions. While the original variables, X1 and X2, are uncorrelated, not all the variables are uncorrelated. The correlation coefficients between each pair of variables are shown in Table II. The nonlinearity factor, 0.1 × X1², was chosen to give a moderate amount of nonlinearity, an amount that gives rise to easily discernible effects on the various plots. A comparison of this amount of nonlinearity with a straight line is shown in Fig. 1.
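The original data were generated in APL; a rough numpy equivalent of the construction just described is sketched below. X5 is taken as X1 + 0.1·X2, the form consistent with the entries of Tables I and II.

```python
import numpy as np

# Reconstruct the 100-point synthetic set of Table I (an assumed numpy
# equivalent of the original APL generation).
X1 = np.repeat(np.arange(1, 11), 10)   # 1 ten times, 2 ten times, ..., 10
X2 = np.tile(np.arange(1, 11), 10)     # 1, 2, ..., 10 repeated ten times
X3 = 0.1 * X1**2                       # "pure nonlinearity"
X4 = X1 + 0.1 * X1**2
X5 = X1 + 0.1 * X2                     # form consistent with Tables I and II
X6 = X1 + 0.05 * X1**2
Y1 = X1 + X2                           # purely linear dependent variable
Y2 = X1 + X2 + 0.1 * X1**2             # nonlinear dependent variable

# Correlation matrix of Table II
V = np.column_stack([X1, X2, X3, X4, X5, X6, Y1, Y2])
print(np.round(np.corrcoef(V, rowvar=False), 4))
```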

The second set of data that was generated, intended to represent data suitable for principal component regression, consisted of a set of Gaussian functions. These represented a spectral band. The intensity of the "band"

TABLE III. Calibration models corresponding to the figures. The calibration coefficients were generated by performing multiple least-squares regression of the indicated independent variables against the specified dependent variable.

Figure   Calibration equation                   F for regression
   2     Ŷ1 = 5.5 + 1·X1                                  98
   3     Ŷ2 = 3.3 + 2.1·X1                                406
   4     Ŷ1 = 6.61 + 0.469·X4                              95
   5     Ŷ2 = 5.5 + 1·X4                                  438
   6     Ŷ2 = -2.2 + 2.1·X1 + 1·X2                       4099
   7     Ŷ1 = 5.5 + 1·X1 + 0·X3                            48
   8     Ŷ2 = 5.5 + 1·X1 + 1·X3                           216
   9     Ŷ1 = 2.17 + 1·X2 + 0.863·X3                     1882
  10     Ŷ2 = 2.17 + 1·X2 + 1.86·X3                      5236
  11     Ŷ2 = -2.2 + 0.79·X2 + 2.1·X5                    4099
  12     Ŷ1 = 0.741 + 1·X2 + 0.641·X6                  14,613
  13     Ŷ2 = -0.741 + 1·X2 + 1.359·X6                 40,081

was allowed to vary linearly with the concentration of a "constituent" whose value covered the range 1-10% in ten uniform steps. The corresponding variation in the "band intensity" went from 0.55 to 1.0. (The curves corresponding to these samples are shown in Fig. 14A.) Superimposed on the bands for these ten different "samples" were curves corresponding to the squares of the bands. We generated these by adding, to each point of the "band," a value that was a multiple of 0.1 times the square of the value for that point. The multiplier went from -0.5 to +0.5; thus for each band, 11 "spectra" were generated, which contained different amounts of the square. (One set of these is shown in Fig. 14B.)

These spectral "bands" were used to investigate the effect of the nonlinear phenomena on calibrations based on the use of principal components.

RESULTS

Multiple Regression Results. The first subset of the data that we inspected comprised the simplest case: only one independent variable was used. These results are shown in Figs. 2 through 5. In Figs. 2 and 4 the dependent variable was Y1, containing only linear terms; in Figs. 3 and 5 the dependent variable was Y2. In each case we plot the values of the predicted (by the calibration equation) and the actual (the synthetic value used in the data set) dependent variable and the residuals of a calibration, as the ordinate, against various independent or dependent variables, as the abscissa. Note that the use of the term "independent variable" is not synonymous with the Xi being plotted. Some plots are presented in which residuals are plotted against an X from Table I that was not used in the calibration. The purpose of this is to enable future examination of residual plots to allow recognition of those cases in which inclusion of the corresponding variable can improve the model.


FIG. 6. Plots resulting from use of calibration equation Y = 2.1·X1 + X2 - 2.2, where actual Y = X1 + X2 + 0.1·X1².

The models generated by the regressions performed are summarized in Table III. Since the data are synthetic, and there is no random error, the normal indicators of calibration performance do not have their usual meanings. Furthermore, since the dependent variable was computed from an arbitrary set of numbers, the S.E.E., being dependent on the units of the dependent variable, is meaningless. Consequently only the F for regression is reported, to allow comparison of the relative regression performances. Comparison of these results with results from actual data can be subjectively performed by using the plot of ACTUAL Y versus PREDICTED Y in the lower-right-hand corner of each figure.
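As a check against Table III, the F for regression can be recomputed from the regression and residual sums of squares; the sketch below assumes the conventional definition F = (SSR/k)/(SSE/(n - k - 1)), and the function name is illustrative.

```python
import numpy as np

# Sketch: F for regression computed from the regression (explained) and
# residual sums of squares, for one or more independent variables.
def f_for_regression(X, y):
    y = np.asarray(y, dtype=float)
    X = np.atleast_2d(np.asarray(X, dtype=float))
    if X.shape[0] != len(y):
        X = X.T                                  # accept a single 1-D variable
    A = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ b
    ssr = np.sum((y_hat - y.mean()) ** 2)        # regression sum of squares
    sse = np.sum((y - y_hat) ** 2)               # residual sum of squares
    k = X.shape[1]
    return (ssr / k) / (sse / (len(y) - k - 1))

# e.g., the model of Fig. 2 (regress Y1 on X1 alone from the earlier sketch)
# gives an F close to the 98 reported in Table III.
```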

Clearly, for Figs. 2-5, the fact that only one of the Xi is used for an independent variable in the model gives rise to large errors. This is the cause of the large vertical spread of the plots in the cases where actual Y is plotted on the ordinate.

In Fig. 2, which contains only linear terms, note that, despite the large error, the fact that all the error is

FIG. 7. Plots resulting from use of calibration equation Y = X1 + 0.0·X3 + 5.5, where actual Y = X1 + X2.


FIG. 8. Plots resulting from use of calibration equation Y = X1 + X3 + 5.5, where actual Y = X1 + X2 + 0.1·X1².

effectively due to the variation of the dependent variable satisfies the regression condition that there be no error in the independent variables, and the coefficient of X1, which is unity, is the correct value compared to the full, correct, model. Also of note is the difference between Fig. 2G and 2H. This is typical of the differences that occur when residuals are plotted against the actual or the predicted values of the dependent variable. Although these plots appear somewhat exaggerated, this is due only to the fact that the errors are comparable to the values of the variables. If the errors are small, so that the actual and predicted Y values are (almost) the same, the plots will appear to be the same, particularly if the ordinate data are expanded so that they fill the height of the plot. Nevertheless, this difference always exists. Note also that in Fig. 2D the ordinate represents residuals, not any variable. Thus the appearance of a straight line is the indication of the systematic error of the calibration. Each point of Fig. 2D is the superposition of ten individual residuals, each having the same value.

In Fig. 3 nonlinearity was introduced into the dependent variable; this now shows up differences in what

FIG. 9. Plots resulting from use of calibration equation Y = X2 + 0.863·X3 + 2.17, where actual Y = X1 + X2.


FIG. 10. Plots resulting from use of calibration equation Y = X2 + 1.86·X3 + 2.17, where actual Y = X1 + X2 + 0.1·X1².

appeared to be similar plots in Fig. 2. The curvature evident on the residual plots is directly due to the nonlinear contribution to the dependent variable. Particularly striking are the differences between Fig. 3C and 3F. Whereas the corresponding linear case of Fig. 2 had apparently identical plots for these cases, the introduction of nonlinearity clearly distinguishes the situations where the nonlinearity is related to the variable used as the independent variable (X1) or the one not so used (X2).

In Fig. 3H we see that different phenomena combine essentially independently of each other, so that both show up in the residual plot. A caveat, however: with real data, these effects will be less clearly demarcated, because the effect of intercorrelation among the variables will change the regression behavior and tend to introduce characteristics of one phenomenon into the display of the others. In Fig. 3D the introduction of nonlinearity has split the tenfold overlap into five sets of twofold overlapping points.

In Fig. 4, the nonlinearity is present in the independent variable rather than in the dependent variable. Strangely, this gives rise to more complicated effects than are

FIG. 11. Plots resulting from use of calibration equation Y = 0.79·X2 + 2.1·X5 - 2.2, where actual Y = X1 + X2 + 0.1·X1².


FIG. 12. Plots resulting from use of calibration equation Y = X2 + 0.641·X6 + 0.741, where actual Y = X1 + X2.

apparent when the nonlinearity is present in both variables (see Fig. 5 for comparison). Now, we not only have curvature of the residual, as in Fig. 3 (although in Fig. 4 this curvature is due to nonlinearity in the predicted rather than the actual values), but in addition the nonlinearity of the predicted values generates nonuniform spacing along the corresponding axis.

In Fig. 5 we see that the nonlinearity of the independent variable compensates for the nonlinearity of the dependent variable. This allows the regression model to

FIG. 13. Plots resulting from use of calibration equation Y = X2 + 1.359·X6 - 0.741, where actual Y = X1 + X2 + 0.1·X1².

FIG. 14. Subsets of the data used for the calculation of principal components. (A) The Gaussians corresponding to the ten different values of the "constituent," with no contribution from the square. (B) The curves corresponding to one sample, with the different amounts of the square of the Gaussian added.



FIG. 15. (A) The two principal components that describe the data. (B) Comparison of the second principal component with the second derivative of the Gaussian function.


FIG. 16. Plots resulting from using only principal component #1 to model the constituent values.

correctly account for the nonlinearity; therefore it does not contribute to the error. This is exhibited by the absence, in Fig. 5A, B, G, and H, of the curvature of the residuals in the corresponding portion of Fig. 4.

Figures 6 through 13 represent the results of using two independent variables, and it is in this case that it may take more than one plot to understand the relationships between the variables, as each plot represents one projection of the multidimensional surface.

For each of the figures (Figs. 6-13), there are five possible variables to plot: the two independent variables, the actual values of the dependent variables, the predicted values of the dependent variables, and the residuals obtained by applying the calibration equation to the data. Figures 6 through 13 each contain nine individual plots, where various available variables are plotted against each other.

It will be noted that, when some combinations of the available variables were used as the independent variables, a model was created for only one of the dependent variables, while for other combinations of available variables, models were created for both dependent variables. The reason some models are missing is that the missing models were capable of fitting the dependent variable exactly, leaving no residuals: an uninteresting case.

In these cases, except for Fig. 7, the plot labeled I in the lower right-hand corner of each figure shows that the model is a relatively good fit to the data. As one would expect, the use of two independent variables makes the errors much smaller than those that appear when only one independent variable is used. For many of these combinations of variables, the plots can be seen as projections of trough-shaped surfaces viewed from different directions, which is exactly what they are. The better-fitting case makes the values of the errors smaller, but qualitatively, the nature of the situation can be determined from the different plots. Otherwise, effects seen in these plots are similar to those seen in the one-independent-variable case, so no further discussion of them is needed.

Principal Component Results. Principal components, and their use in spectroscopy, have been well described in the literature (see Refs. 3 and 4, and the citations therein); those discussions will not be repeated here. Figure 14 shows the spectra from which the data were created. Figure 14A shows ten Gaussians of different amplitude; Fig. 14B shows eleven variations of one Gaussian, generated by adding different amounts of the nonlinear term to the "pure" Gaussian. Principal components of the whole set of 110 synthetic "spectra" were computed. Since each of the "spectra" was generated from a combination of only two mathematical functions, only two principal components are needed to account for all the variance of the data. These two principal components are shown in Fig. 15A.
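A sketch of this computation is given below: an SVD of the mean-centered 110-spectrum matrix from the earlier sketch. This is not the Technicon IDAS implementation, and the variable names are illustrative.

```python
import numpy as np

# Sketch: principal components of the 110 synthetic "spectra" (from the
# earlier sketch) via an SVD of the mean-centered data matrix.
centered = spectra - spectra.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

explained = s ** 2 / np.sum(s ** 2)
print(np.round(explained[:4], 4))   # essentially all variance in two components

pc1, pc2 = Vt[0], Vt[1]             # loadings: a Gaussian-like shape and a
                                    # shape resembling G^2 - G
scores = centered @ Vt[:2].T        # per-spectrum scores on the two components
```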

The first principal component is itself a Gaussian. The second principal component appears remarkably similar to the second derivative of the Gaussian. It is not the second derivative of the Gaussian, however; this is demonstrated by Fig. 15B, where the second principal component of the data set is compared to the second derivative of the Gaussian. The second principal component of this data set is actually the difference between the square of the Gaussian and the Gaussian itself; this is hinted at by the lowest curve of Fig. 14B. The principal component algorithm clearly used the identity:

$G^2 = G + (G^2 - G)$   (3)

(where G represents the Gaussian function) to model the spectra, even though they were created from the individual functions G and G². This is due to two facts: the Gaussian function was needed in any case to model the Gaussian component of the spectra, and the two functions themselves were not orthogonal, thus requiring the algorithm to generate a function that was orthogonal to the Gaussian. The Gaussian portion of Eq. 3 is measured


FIG. 17. Plots resulting from using only principal component #2 to model the constituent values.

together with the Gaussian part of the spectra, leaving only the parenthesized expression of Eq. 3 to be retained as the second principal component.

The residuals plots obtained with the use of the two principal components separately to model the "constituent" are shown in Figs. 16 and 17, respectively. A surprising fact that emerges from these two figures can be seen by comparing the plots of actual vs. predicted values for the two cases. Even though the "constituent" was created so that it would be directly proportional to the Gaussian function, using the first principal component (which is a Gaussian) to model the constituent values gives poorer results than using the second principal component (i.e., G² - G) alone. This result is also due to the inclusion of the Gaussian portion of Eq. 3 in with the part of the function that is properly the Gaussian that models the constituent; this contribution of the Gaussian squared causes a large error.
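The comparison can be set up as sketched below, reusing the scores from the previous sketch; the constituent value repeats once per squared-band multiplier. Whether the numbers reproduce Figs. 16 and 17 exactly depends on the band parameters assumed earlier.

```python
import numpy as np

# Sketch: model the "constituent" from each principal component score
# separately, as in Figs. 16 and 17 (scores come from the sketch above).
y = np.repeat(np.arange(1.0, 11.0), 11)     # one constituent value per spectrum

for i in range(2):
    A = np.column_stack([np.ones(len(y)), scores[:, i]])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ b
    print("PC #%d: RMS residual = %.3f" % (i + 1, np.sqrt(np.mean(resid ** 2))))
```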

CONCLUSIONS

When one is performing quantitative analysis using multivariate mathematical methods, it is usually difficult or impossible to visualize the arrangement of the data in multidimensional space. Plotting variates of the data, one against another, is equivalent to projecting the multidimensional space onto the two-dimensional space of those variates. Even this limited view of the data, however, can give useful information regarding the arrangement of the data. If several such projections are created and compared with corresponding plots of data with known characteristics, such as those presented in this paper, it is possible to work backward and deduce what the multivariate arrangement is.

1. L. S. Ramos, K. R. Beebe, W. P. Carey, E. Sanchez, B. C. Erikson, B. E. Wilson, L. E. Wangen, and B. R. Kowalski, Anal. Chem. 58, 294R (1986).
2. M. F. Delaney, Anal. Chem. 56, 261R (1984).
3. I. A. Cowe and J. W. McNichol, Appl. Spectrosc. 39, 257 (1985).
4. H. Mark, Anal. Chem. 58, 2814 (1986).
5. H. Mark and J. Workman, Anal. Chem. 58, 1454 (1986).
6. J. R. Hallowell and M. F. Delaney, Anal. Chem. 59, 1544 (1987).
7. N. Draper and H. Smith, Applied Regression Analysis (Wiley, New York, 1981), 2nd ed.
8. C. Daniel, Fitting Equations to Data (Wiley-Interscience, New York, 1971).
9. E. Stark and T. B. Hirschfeld, 11th Annual Meeting of the Federation of Analytical Chemistry and Spectroscopy Societies, Philadelphia (1984), Paper No. 515.
