Metrological Experiments in Biomarker Development (Mass Spectrometry: Statistical Issues)
Walter Liggett, Statistical Engineering Division
Peter Barker, Biotechnology Division
National Institute of Standards and Technology
Biomarker (Clinical Pharmacology & Therapeutics, 2001)
A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.
Two parts of a biomarker
– Execution of measurement protocol
– Interpretation of measured response
Metrology
• Development and evaluation of a measurement protocol, the first part of a biomarker
• Diverse lessons learned from varied applications
• Focus on general-purpose protocols, which may be adequate for a particular purpose
• The use of metrology in biomarker development is the subject of this talk
Metrological Experiments
• Experimental units (specimens)
– Knowledge of their characteristics
– Relation to unknowns of future interest
• Response
– Univariate: interval-scale variable
– Multivariate/Functional
• Protocol parameters: parameter design
• Cost of experimental runs: high throughput?
Outline
• Alternative statistical formulations
– Classification based on cases and controls
– Measurement of an interval-scale variable
• Aspects of protocol development
– Property of interest
– Realization of protocol
• Multivariate and functional measurements
Statistics for Classification
• Assume gold standard for disease status
• Evaluate marker on training data
– Sensitivity: true positive rate
– Specificity: 1 − false positive rate
• Continuous test result—ROC curves
• Multivariate test result—classification, discriminant analysis
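The quantities on this slide can be sketched in a few lines. The example below is a minimal illustration, assuming a continuous marker where larger values indicate disease and a gold-standard label for every specimen; the function names and the marker values are made up for illustration and do not come from the talk.

```python
import numpy as np

def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity of the rule 'positive if score >= threshold'."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)   # True = diseased per the gold standard
    called = scores >= threshold
    sensitivity = (called & labels).sum() / labels.sum()        # true positive rate
    specificity = (~called & ~labels).sum() / (~labels).sum()   # 1 - false positive rate
    return float(sensitivity), float(specificity)

def roc_points(scores, labels):
    """(fpr, tpr) arrays obtained by sweeping the threshold over the observed scores."""
    thresholds = np.sort(np.unique(scores))[::-1]
    fpr, tpr = [0.0], [0.0]
    for t in thresholds:
        sens, spec = sensitivity_specificity(scores, labels, t)
        tpr.append(sens)
        fpr.append(1.0 - spec)
    return np.array(fpr), np.array(tpr)

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoid rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical training data: marker values and gold-standard disease status.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.6)
fpr, tpr = roc_points(scores, labels)
print(sens, spec, auc(fpr, tpr))
```

Each threshold gives one (false positive rate, true positive rate) point, so the ROC curve summarizes the marker over all thresholds rather than at a single cutoff.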
Specimen Selection (Pepe et al., J. National Cancer Institute, 2001)
1. Wide spectrum of tumor and non-tumor tissue
2. Serum from cases and controls in a target screening population
3. Apparently healthy subjects monitored for development of cancer
4. Cohort from a population that might be targeted
5. Subjects randomly selected from populations in which the screening program is likely to be applied
Thinking Outside the Box
• Bottom line is prediction of disease status
• Definitive gold standard may not be available
• Including laboratory sources of error in training data is a problem
• There are metrological experiments that do not require a gold standard
The Role of Science
• Given valid training data, statisticians can proceed without scientific knowledge
• In the classification approach, scientific thought must go into specimen selection
• In the metrological approach, focus is on a property to be measured
• Scientific thought must go into the relation of the metrological property to biomarker goals
Statistics for Metrology
• Focus (as best one can) on the property to be measured, an interval- or ratio-scale variable
• Specify a baseline measurement protocol
• Experiment with realizations of alternative protocols
• Optimize repeatability (at least) and then ask if the measurement protocol is adequate for the purpose
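As a rough illustration of comparing protocol realizations by repeatability, the sketch below pools the within-specimen variance of replicate measurements for each protocol and picks the protocol with the smaller repeatability standard deviation. The protocol names and replicate values are hypothetical, and equal numbers of repeats per specimen are assumed.

```python
import numpy as np

def repeatability_sd(replicates):
    """Pooled within-specimen standard deviation.

    replicates: array of shape (n_specimens, n_repeats), one row per specimen,
    measured repeatedly under the same protocol and conditions.
    """
    x = np.asarray(replicates, dtype=float)
    within_var = x.var(axis=1, ddof=1)          # variance of the repeats for each specimen
    return float(np.sqrt(within_var.mean()))    # pooled (equal repeats per specimen)

# Hypothetical replicate measurements of the same two specimens under two protocols.
protocols = {
    "baseline":    [[10.1, 9.9, 10.3], [12.0, 12.4, 11.9]],
    "alternative": [[10.0, 10.1, 10.0], [12.1, 12.0, 12.2]],
}

results = {name: repeatability_sd(data) for name, data in protocols.items()}
best = min(results, key=results.get)
print(results, best)
```

A smaller repeatability standard deviation only says the protocol is more repeatable; whether it is adequate for the purpose is a separate question, as the slide notes.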
Framework of Metrology
• Relation between property and protocol obtained scientifically or through realization
• Metrology explores faithfulness of realization before adequacy for the purpose
[Diagram: Property, Realization, Protocol]
Some Metrological Experiments
• Protocol development through classes of units known to differ in the property of interest
• Protocols linked to a scientific definition of the property of interest in such a way that all sources of error can be assessed (definitive methods)
• Sets of protocols that measure the same property but are based on different scientific principles (independent methods)
Aspects of Performance
• Repeatability
• All manner of reproducibility
– Operator, equipment
– Inter-laboratory
• Noise factors, effect of sample matrix
• Calibration
• Measurement assurance
• Uncertainty components, type A and type B uncertainties
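One conventional way to separate repeatability from inter-laboratory reproducibility is a one-way random-effects analysis of variance. The sketch below assumes a balanced design (the same number of repeats in every laboratory) and uses made-up numbers; it is an illustration of the idea, not a procedure taken from the talk.

```python
import numpy as np

def variance_components(data):
    """One-way random-effects ANOVA for a balanced laboratory-by-repeat design.

    data: array of shape (n_labs, n_repeats) of measurements on the same material.
    """
    x = np.asarray(data, dtype=float)
    n_labs, n_repeats = x.shape
    ms_within = x.var(axis=1, ddof=1).mean()                # repeatability variance
    ms_between = n_repeats * x.mean(axis=1).var(ddof=1)     # mean square between labs
    var_lab = max(0.0, (ms_between - ms_within) / n_repeats)  # truncate negative estimates at 0
    return {
        "repeatability_sd": float(np.sqrt(ms_within)),
        "between_lab_sd": float(np.sqrt(var_lab)),
        "reproducibility_sd": float(np.sqrt(ms_within + var_lab)),
    }

# Hypothetical results: three laboratories, two repeats each, same reference material.
labs = [[10.0, 10.2],
        [10.6, 10.4],
        [9.8, 10.0]]

vc = variance_components(labs)
print(vc)
```

Both components here are estimated from repeated observations, so they are type A evaluations; type B components come from other information and would be combined separately.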
Experimental Units (Reference Materials)
• Homogeneity (solution versus particles)
• Quantity (cost)
• Adaptable to high-throughput experiments
• Known value of the property of interest
• Classes with different values of the property of interest
From Univariate to Functional
• Carryover from univariate to multivariate and functional responses has been done for classification
• Extending measurement performance concepts to multivariate and functional responses is still a challenge
• Chemometrics is the key word for much of the literature in this area
Functional Principal Components Analysis (Ramsay and Silverman)
• Metrologists like to look at the spread of a batch of measurements (outliers, more than one mode)
• For functional measurements, functional PCA provides a way to look at the spread
• Consider results of functional PCA on Petricoin’s Lancet…/Normal Healthy (SPLUS, Ramsay’s software)
• Main purpose is to illustrate metrological thinking
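To make the idea of looking at the spread of a batch of curves concrete, here is a minimal sketch of a discretized version of functional PCA. It is a simplification: the talk used S-PLUS with Ramsay's software, which works with smooth basis expansions, whereas this sketch applies ordinary PCA (via the SVD) to curves sampled on a common M/Z grid. The spectra are synthetic and the function name is hypothetical.

```python
import numpy as np

def discretized_fpca(curves, n_components=4):
    """PCA of curves sampled on a common grid (a stand-in for functional PCA).

    curves: array of shape (n_curves, n_grid_points).
    Returns the mean curve, the component curves, the scores, and the
    percentage of variability explained by each retained component.
    """
    x = np.asarray(curves, dtype=float)
    mean_curve = x.mean(axis=0)
    centered = x - mean_curve
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    pct = 100.0 * s**2 / np.sum(s**2)
    k = n_components
    return mean_curve, vt[:k], u[:, :k] * s[:k], pct[:k]

# Synthetic batch of spectra: one peak whose height varies, plus small noise.
rng = np.random.default_rng(0)
mz = np.linspace(2200.0, 2300.0, 101)
peak = np.exp(-0.5 * ((mz - 2250.0) / 8.0) ** 2)
heights = 1.0 + 0.3 * rng.standard_normal(30)
spectra = heights[:, None] * peak + 0.01 * rng.standard_normal((30, mz.size))

mean_curve, pcs, scores, pct = discretized_fpca(spectra)

# One common display: the mean curve plus and minus a multiple of a component.
c = 2.0 * scores[:, 0].std(ddof=1)
plus_curve = mean_curve + c * pcs[0]
minus_curve = mean_curve - c * pcs[0]
print(np.round(pct, 1))
```

The scores give one number per spectrum and per component, so outliers or more than one mode in the batch can be sought in a scatter plot of the first few scores.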
[Figure: Lancet ... Normal Healthy SELDI-TOF Mass Spectra. Intensity versus M/Z, 1800 to 2300.]
Functional Principal Components Analysis
[Figure: four panels. Value of PC curve versus M/Z, 2200 to 2300.]
Functional Principal Components Analysis
[Figure: one panel. Intensity versus M/Z, 2200 to 2300; curves drawn with + and - symbols.]
Rotated Functional Principal Components Analysis
[Figure: three panels. Intensity versus M/Z, 2200 to 2300; curves drawn with + and - symbols.]
Functional Principal Components Analysis (Not Rotated)
[Figure: four panels. Intensity versus M/Z, 2090 to 2130; curves drawn with + and - symbols. PCA function 1: 92.2% of variability. PCA function 2: 2.7%. PCA function 3: 2.5%. PCA function 4: 0.6%.]
Conclusion
• Producing large data sets has become easier except perhaps for selecting individuals with a particular disease status
• With scientific and statistical reasoning, the advances in experimentation technology can be used to speed biomarker development
• Statisticians have a role in formulating overall experimental strategy, allocating effort among different approaches