This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright

Author's personal copy

Representative sampling of large kernel lots III. General considerations on sampling heterogeneous foods

Kim H. Esbensen, Claudia Paoletti, Pentti Minkkinen

Part I reviewed the Theory of Sampling (TOS) as applied to quantitation of genetically-modified organisms (GMOs). Part II re-analyzed KeLDA data from a variographic analysis perspective and estimated Total Sampling Error (TSE) versus Total Analytical Error (TAE).

Results from this analysis are here used as a basis for developing a general approach to optimization of kernel-sampling protocols that are fit for purpose (i.e. scaled with respect to the effective heterogeneity while simultaneously sufficiently accurate to detect critically low concentration levels). While TAE is significantly large for GMO quantitation, TSE can still be up to two orders of magnitude larger, signifying that efforts to reduce GMO-analysis uncertainties should focus on improving or optimizing sampling plans and not on further refinements of analytical precision.

For GMO testing based on the current labeling threshold (0.9%) of European Union regulations, KeLDA re-analysis results show that the number of increments needed (Q) for reliable characterization of lots with significant heterogeneities ranges between 42 (highly heterogeneous lots) and 17 (close to uniform materials). We outline how it is always possible to estimate TSE from a simple variographic experiment based on TOS' process-sampling requirements. This approach is universal and can be carried over from the GMO case to all other (static or dynamic) sampling scenarios and materials dealing with impurities, contaminants, or trace concentrations, without any loss of generality. A proper basis for TOS-based process sampling is essential for any meaningful definition of "appropriate sampling plans" (i.e. sampling plans minimizing TSE as a function of the specific heterogeneity of any given lot). If unit-operation costs for sampling and analysis are known, sampling plans can also be optimized with respect to overall costs.

We discuss the degree to which the present results can be generalized regarding official monitoring and inspection of food and feed materials. What is presented here in effect constitutes a contribution towards a comprehensive, horizontal process-sampling standard for heterogeneous materials in general.

© 2011 Elsevier Ltd. All rights reserved.

Keywords: Bulk commodity; Contaminant; Fit-for-purpose sampling plan; Genetically-modified organism (GMO); Kernel sampling; Process sampling; Representative sampling; Theory of Sampling (TOS); Trace constituent; Variographic analysis

1. Introduction

Data from 15 soybean KeLDA lots were re-analyzed from the perspective of the Theory of Sampling (TOS) [1] in order to provide reliable estimates of Total Sampling Error (TSE) and Total Analytical Error (TAE) for the case of GMO quantitation. Part II presented extensive results [2], the significance of which we discuss here. Based on these results, we propose strategies to reduce Global Estimation Error (GEE) and outline a systematic approach

Kim H. Esbensen*

Geological Survey of Denmark and Greenland (GEUS),

Ø. Voldgade 10, DK-1350 Copenhagen K., Denmark

ACABS Research Group, Aalborg University, Campus Esbjerg (AAUE), Denmark

Claudia Paoletti

European Food Safety Authority (EFSA), Parma, Italy

Pentti Minkkinen

Lappeenranta University of Technology, Lappeenranta, Finland

ACABS Research Group, Aalborg University, Campus Esbjerg (AAUE), Denmark

*Corresponding author.

E-mail: [email protected]

Trends in Analytical Chemistry, Vol. 32, 2012

0165-9936/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.trac.2011.12.002


for optimization of existing (or new) sampling plans for kernels and similar aggregate materials with trace and ultra-trace constituents.

1.1. Relevance and significance of the KeLDA data set

A fundamental prerequisite for assessing the generalization potential of the present results is to establish to what degree the particular set of KeLDA lots is representative of soybean shipments coming into Europe on a routine basis. Every year over 15 × 10⁶ t of soybeans are imported within the European Union (EU) [3]. In this light, the KeLDA lot, totaling some 421,000 t, corresponds to only ~1/3 per mill of the total arriving tonnage. It is possible to argue that this minuscule proportion, in itself, cannot constitute a valid basis for generalizations.

Against such an argument, the following can be leveled. The KeLDA GMO heterogeneities were found to span an extreme range, with anything from essentially uniform (a few) to very severely heterogeneous dispositions (many) (Part II [2]). The internal heterogeneities revealed in the original KeLDA data are highly significant and contradict all earlier assumptions of randomness. Furthermore, the KeLDA data cover the entire range of GMO concentrations of possible interest (from significantly below the EU legal labeling threshold of 0.9% GMO to well above 90% GMO content) [2]. KeLDA therefore constitutes a reliable dataset upon which to make meaningful general inferences on kernel-lot-distribution patterns. In this context, it is logical and scientifically correct to expect comparable intrinsic heterogeneities, as in the range revealed, all of which will deviate significantly from randomness (unless the contrary is demonstrated on a case-by-case basis).

The conclusion is unavoidable: any sampling strategy based on assumed, but unsubstantiated, randomness is fatally wrong and should no longer be perpetuated; otherwise unacceptable TSE will perforce go unnoticed. The present baseline evaluation therefore provides a very reasonable estimate of the within-lot and between-lot variations that can be expected in real-life situations. However, to err on the safe side, the TSE estimates provided in this series must be considered minimum estimates of the magnitude of the total sampling error that can be expected among lots. If one has to set one common standard, it would be only prudent, and commensurate with the policy of preventive action, to assume that the heterogeneity met with in any future unspecified cargo corresponds to the most heterogeneous lot encountered in KeLDA (Lot #1).

1.2. Optimal number of increments for compliance with EU legal requirements

Fig. 1 shows the end result of the re-analysis of the KeLDA data in Part II.

Widely different heterogeneities need a different number of increments in each composite sample in order to comply with the official EU lower limit of 0.9% for GMO labeling, as evidenced by the increment-number interval in Fig. 1. A model based on the least heterogeneous lot (Lot #8) requires only 17 increments, whereas the model based on the most heterogeneous lot (Lot #1) requires 42 increments in each composite sample, and a theoretical random distribution requires 12 increments.
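These increment numbers follow from the standard composite-sampling relation: the lower 95% confidence bound of the composite mean is a_L - 1.96·s/√Q, where s is the per-increment standard deviation. A minimal sketch; the per-increment standard deviations below are illustrative values back-calculated to match the figure, not the KeLDA heterogeneity estimates themselves:

```python
import math

def increments_needed(s_inc: float, a_l: float = 1.0,
                      threshold: float = 0.9, z: float = 1.96) -> int:
    """Smallest Q such that the lower z-confidence bound of the composite
    mean, a_l - z * s_inc / sqrt(Q), stays at or above the decision
    threshold.  s_inc is the absolute per-increment standard deviation
    in %-GMO units."""
    margin = a_l - threshold          # tolerable half-width, here 0.1 %-units
    return math.ceil((z * s_inc / margin) ** 2)

# Illustrative per-increment SDs (assumptions, back-calculated):
for label, s in [("random binomial mixture", 0.17),
                 ("Lot #8 (most uniform)", 0.21),
                 ("Lot #1 (most heterogeneous)", 0.33)]:
    print(label, increments_needed(s))   # prints 12, 17 and 42
```

The key design point is that Q scales with the square of the per-increment heterogeneity, which is why the most heterogeneous lot needs several times more increments than the near-uniform one.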

[Figure 1: three curves of lower confidence interval (%), 0.3-1.0, versus number of increments, 0-100.]

Figure 1. Estimation of the required number of increments needed to respect a lower 95% confidence interval equal to 0.9%, if the mean GMO content a_L = 1% and the lot heterogeneity characteristic is commensurate with: (i) a random binomial mixture distribution, (ii) KeLDA Lot #8 (most uniform occurrence) and Lot #1 (KeLDA's most heterogeneous lot). The required number of increments is found as the intersection of the three curves, based on the characteristics developed in Part II, with the horizontal line at the 0.9% level. The resulting numbers of increments are 12, 17 and 42, respectively, on which there is further comment in the text.


There will naturally exist many lots with less severe heterogeneities than Lot #1, both in future soybean shipments and in innumerable other types of "similar" lots and materials. If the specific heterogeneity of a lot were known, the approach outlined in this study would suffice to scale the appropriate number of increments (Q) to the sampling task. Alas, the problem often encountered is a marked reluctance to change any existing sampling procedure, much less even to assess an existing procedure. We hope that the present results clearly show the advantages of an informed approach over this attitude. It does not take much to be able to estimate the specific heterogeneity of any 1-D lot – TOS' variographic experiment is all that is ever needed (Part II [2]).

This will be at the cost of taking 100 increments and analyzing these individually, for the purpose of characterizing the lot type in question once and for all. This is the price necessary for establishment of any lot-dependent TOS-certified sampling procedure – no shortcuts are possible. After such a specific heterogeneity characterization, there will only remain the problem-dependent, lot-type-optimized minimum sampling and analytical costs involved in all routine analysis thereafter. These costs are not unacceptable, from whatever point of view. For process sampling with automated cross-stream sampling, it is irrelevant how many increments are employed – the automated sampling system will oblige and will aggregate the Q increments into one composite sample at virtually zero cost per increment unit. There will always be laboratory costs involved in mass reduction, but this step is carried out in all analytical tasks anyway, and is often also automated, so the primary sample mass is also an almost irrelevant issue.

There is one other way to circumnavigate the above dilemma if one perforce does not wish to perform a variographic experiment. One must then assume the worst-case scenario from KeLDA (Lot #1) and therefore use Q = 42 increments (or more) for composite sampling. Because of KeLDA's documented range of heterogeneities, the worst case stands a very good chance of being conservative in all but the most extreme cases. The above cost arguments also apply in this case. With automated sampling and mass-reduction equipment (TOS-certified procedures), there is no possible argument against composite sampling, regardless of Q.

There exist no other options for avoiding the dilemma of unknown lot heterogeneity. It is equally irresponsible to assume a random distribution or to rely on a specific sampling procedure that is not TOS-certified. Arguments might still crop up regarding "certain" materials and lot types – which are claimed to be "sufficiently" homogeneous to avoid the demands for lot-dependent characterization. Upon reflection, it is irresponsible to try to dodge this issue: for not paying the cost of a single variographic experiment, one is willing to run the risk of "responsible decisions" having to be based upon an undocumented, very probably unrepresentative database – decisions regarding legal compliance(s), public health, food safety, contamination, toxicity, trading – irresponsible indeed!

1.2.1. Need for proper skills, training and competence

One of the objectives of this study is to promote a more comprehensive understanding of the necessary knowledge, training and competence behind responsible sampling. This requires convergence of theoretical TOS principles and the practical international sampling guidelines as used today for routine testing and quantitation. It is important to understand that what is defined as TSE in TOS is but one of two components (the other being TAE) of what is termed "risk" in the context of larger-scoped testing and monitoring programs, conceived and executed for routine testing in order to check for compliance with legislative requirements. By "risk" is generally meant the unavoidable uncertainty always associated with any experimental result. This risk will never be zero. It can take on almost any magnitude if no special attention is paid to the sampling issues – and it can be reduced to a required level only if the principles from TOS and the empirical lessons from the present study are heeded. Below, we discuss these issues in detail.

As far as the physical aspects of sampling are concerned, there are two types of consideration that may influence sampling-and-analysis protocol development: statistical and non-statistical. Statistical considerations include the acceptance of estimation results with respect to a pre-selected level of tolerable "risk" or "uncertainty". It is generally understood that the lower the tolerable uncertainty, the more laborious and costly sampling and analysis will have to be. It is here essential to distinguish between purely statistical issues and the kind of sampling-error distributional issues dealt with in the TOS, which are not identical, indeed far from it [1–8]. It is critical to distinguish between the following aspects:

(1) a sampling bias, which can be reduced or completely eliminated, but only by following TOS principles (statistics has nothing to offer in this context), but which nevertheless is very often eliminated from serious consideration due to alleged "practical and economical reasons"; and,

(2) the remaining sampling variance (imprecision), which essentially can be treated statistically.

These two aspects are only clearly discriminated in TOS' definition of representativity. As shown within the present framework, it is possible to derive complete, objective and reliable estimates of the necessary sampling plans and their elements (sampling rate, number of samples, number of composite samples, and analytical efforts). On this TOS basis, and on this basis alone, if the tolerable uncertainty level is too high, or the initial budget available for the investigation is too low, the sampling and analysis protocol can be modified and optimized as needed with minimum additional costs.

Non-statistical considerations include precisely those types of financial, labor and time constraints that are often pointed out – with the seemingly unavoidable verdict "not practical" or "not economical". Unfortunately, such arguments often dominate or downright rule the current design of sampling protocols, at all levels from local sampling to international standardization committees, with the consequence that more relaxed, approximate sampling protocols with large risks and uncertainties are routinely used.

While it is not the responsibility of science to define acceptable risk thresholds, science would be remiss if it did not elucidate the critical, often fatal, consequences of too flexible an acceptance of such non-statistical issues alone. Instead, the present work proposes a solid theoretical framework – the TOS – and a practical tool – variographic analysis – to estimate TSE (and GEE) reliably for any sampling scheme applicable to any kind of material and for any proposed sampling procedure. An example of how to work systematically along these principles is given by Esbensen et al. [4].

Application of variographic analysis to GMO sampling within the framework of the TOS is offered here as a role model to develop "appropriate sampling plans", or to evaluate the accuracy, the precision and the representativity of procedures and standards that already exist. The proposed approach is equally applicable to all types of "similar materials" [e.g., commodities in which the critical interest is in trace and ultra-trace constituents (e.g. mycotoxins) with a similar non-uniform, strongly irregular distribution]. An early study concerning aflatoxins showed that the uncertainty of sampling could be reliably estimated by using TOS (Minkkinen [5,6]). There is also an interesting parallel to equally "very difficult" sampling issues in the field of biomass conversion, or bio-energy production, based on primary or secondary forest resources (i.e. Thy et al. [7]). Within the geosciences, there are also many parallels regarding trace-element geochemistry, which is governed by exactly the same kind of heterogeneity as has been treated here.

The application of TOS to GMO testing, or to any analyte/commodity with similar distribution properties, allows not only derivation of reliable estimates of both GEE and TSE for any sampling procedure, but also estimation of the minimum unavoidable total error (assessed as the nugget effect), which can be used as an objective benchmark for decision making based on alternative risk acceptances. Variographic analysis can be applied as a baseline study – but can also be carried out as a pilot study for characterization of the empirical heterogeneity versus the TSE, whenever sampling campaigns for new types of lots or new sampling procedures are planned.

1.2.2. Systematic variographics – optimization of sampling plans

When establishing a sampling plan for food-contaminant detection – indeed when setting up any problem-dependent sampling plan – the objective is to show how (official) monitoring and control can be claimed to be efficient, i.e. whether it is fit for purpose. It is interesting that statements on this are not tied to a specific sampling procedure or plan alone (there is always an infinite variety of alternative sampling plans and procedures) – this issue is a function of the interaction of the specific sampling procedure and the empirical lot heterogeneity. The latter is often the most influential determinant because, in process sampling, very little can be done to re-constitute the lot material, while all elements in the sampling system are, in principle and in practice, fully open to serious, TOS-derived design, operation and maintenance. The sin most often committed here is the sin of omission, i.e. unwillingness to change or modify an existing sampling system, or unwillingness to invest the very small effort to educate oneself with respect to TOS.

We here summarize the key elements that must be carried out in order to be able to demonstrate fitness for purpose, and thus the efficiency, of any process-sampling system. The approach does not apply to commodity inspection and food safety alone (GMO, pollutants or otherwise), but is completely generic for all significantly heterogeneous materials:

(1) A sufficient variographic experiment (60–100 increments for a first pilot study) must be applied to characterize and to quantify the lot heterogeneity, TAE and TSE.

(2) Var(FSE + GSE) estimates (FSE: Fundamental Sampling Error; GSE: Grouping and Segregation Error) can in favorable instances be obtained as the variance of, e.g., 10 repeated point estimates from the sampling target, but this is contingent upon diligent elimination of the Incorrect Sampling Errors (ISE). If the latter cannot be obtained, recourse can still be taken to variographic characterization of the existing sampling procedure, however less than representative this may be. Much can be gained if the TSE is revealed to be higher than the anticipated level (or higher than a given authoritative threshold) – because it is then known that improvements to the basic sampling procedure are unavoidable and mandatory (ISE elimination, further GSE reduction).

(3) Alternatively, var(FSE) can in principle be estimated from the famous "Gy's formula" (see [5,6,8–11] for details). This procedure is mainly used because it allows derivation of an estimate of the minimum size of individual samples:

M_S = C · d³ / var(FSE)

in which C is a material constant (constant for a given grain-size-distribution state) and d is the top diameter of the material (termed d95). In the specific case of soybean, an equation derived for a model of random binomial mixtures (Part I, Equation 6 [1]) can be used. However, such estimates will be of var(FSE) only – with no account taken of GSE. With known large (very large) heterogeneities at the macro-scale (shipload scale), it would in all likelihood be prudent also to expect significant heterogeneity at meso-scales corresponding to the increment volume, in which case such sample-mass estimates will structurally be too low.
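The rearranged formula above is a one-liner in practice. A minimal sketch; the material constant C and top diameter d95 used below are hypothetical soybean-like values for illustration only, not taken from the paper:

```python
def min_sample_mass(C: float, d95: float, var_fse: float) -> float:
    """Gy's formula rearranged for minimum sample mass:
    M_S = C * d95**3 / var(FSE).
    C: material constant (here taken in g/cm^3 convention),
    d95: top particle diameter in cm,
    var_fse: tolerated relative variance of the Fundamental Sampling Error."""
    return C * d95 ** 3 / var_fse

# Hypothetical inputs: C = 50 g/cm^3, d95 = 0.7 cm, target RSD(FSE) = 5%
mass_g = min_sample_mass(50.0, 0.7, 0.05 ** 2)
print(round(mass_g))   # prints 6860 (grams)
```

Note how the cubic dependence on d95 and the inverse dependence on the tolerated variance make sample mass extremely sensitive to both inputs, which is exactly why such estimates are structurally too low when meso-scale heterogeneity (GSE) is ignored.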

(4) The variogram allows characterization of the so-called long-range heterogeneity, and can be used to determine an optimized number of increments (Q) to composite (this vital information can be achieved by analyzing the original variogram increments only).

(5) The sampling plan can be based on either a known worst-case empirical heterogeneity – or, but only under well-specified conditions, an average-case heterogeneity. This must be based on a fully realistic evaluation of the consequences of possible wrong decisions, which determines the acceptable uncertainty level and is in principle a political decision.

(6) Variogram simulations (the effects on TSE of varying Q and r) allow interaction with regulatory or decision-making authorities. Once a valid variographic study has been performed, it is possible to answer all questions of the type: "What is the effect of a changed Q?" or "What is the effect of a different sampling rate, r?", and similarly regarding interacting alternatives of (Q, r); examples can be found in [4–6,8–10].

(7) If the acceptable uncertainty level (risk) is fixed, the sampling plan (protocol) can be optimized for minimum costs, provided that the unit prices of the sampling and analytical operations are known. In this optimization, representativity must never be compromised and must always be Priority Number 1 – no exceptions allowed. What would be the purpose of submitting "samples" for the full laboratory treatment, with attendant costs, if it cannot be ascertained that the samples are representative?
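The variographic experiment of step (1) can be sketched numerically. The series below is synthetic (a stand-in for a real 60–100 increment pilot experiment, not KeLDA data), and the equal-increment-mass form is a simplification of the full mass-weighted TOS variogram:

```python
import numpy as np

def variogram(a, max_lag=None):
    """Experimental relative variogram v(j) of a 1-D increment series:
    v(j) = mean((h[i+j] - h[i])**2) / 2, where h are the relative
    heterogeneity contributions.  Equal increment masses are assumed
    (a simplification of the full mass-weighted TOS form)."""
    a = np.asarray(a, dtype=float)
    h = (a - a.mean()) / a.mean()          # relative heterogeneity contributions
    n = len(h)
    if max_lag is None:
        max_lag = n // 2
    lags = np.arange(1, max_lag + 1)
    v = np.array([np.mean((h[j:] - h[:-j]) ** 2) / 2.0 for j in lags])
    return lags, v

# 100 synthetic increments with a weak trend plus analytical noise:
rng = np.random.default_rng(0)
conc = 1.0 + 0.005 * np.arange(100) + rng.normal(0.0, 0.05, 100)
lags, v = variogram(conc)
# v at lag 1 approximates the nugget effect (TAE plus small-scale
# sampling errors); the rise with increasing lag reveals the
# long-range heterogeneity that drives the required Q.
```

Extrapolating the variogram back to lag zero gives the nugget effect, i.e. the minimum unavoidable error discussed in the text.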

Estimation of TSE as a function of the disparate lot-distribution properties illustrated by the KeLDA data clearly underlines that sampling errors resulting from inappropriate sampling plans, which overlook the issue of heterogeneity, will invariably be inflated and can reach into the extreme (e.g., nearly 250% in the KeLDA case). This is unacceptable under any circumstances. Sampling plans and procedures must be commensurate with the heterogeneity levels of the lot(s) in question.

It is critical to notice that, if a target uncertainty of the lot mean (or tolerable s(a_L)) is specified, the required number of samples to be taken from the lot depends only on lot heterogeneity, and not on the size of the lot. This flies in the face of almost the complete gamut of existing standards on sampling, which overwhelmingly stipulate a different number of "samples" more or less in linear proportion to the total mass of the lot. This misunderstanding is a hallmark of "home-grown statistics", which can be observed in some existing and proposed standards.

In the present context, a thought experiment amplifies this point. If the lot concentration is a_L = 1% and the maximum tolerable limit for the lower confidence interval of the mean at the 95% confidence level is 0.1% units, the limiting Q value of 12 increments was obtained in the case of an ideal, but non-realistic, binomial random mixture. But, if the GMO material happens to be located in one parcel alone (containing 100% GMO), then 100 increments are needed using systematic (or stratified) sampling to obtain the correct result that a_L = 1%. If, e.g., only 20 increments are taken, the most likely result is that the shipment appears 100% non-GMO, but, occasionally, one of the increments will contain 100% GMO, giving the lot mean a_L = (1/20) × 100% = 5% GMO.
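This thought experiment can be checked numerically. A minimal simulation, assuming each increment independently hits the single 100%-GMO parcel with probability equal to the parcel's share of the lot:

```python
import random

def sample_lot(q: int, parcel_fraction: float, rng: random.Random) -> float:
    """One sampling campaign over a lot in which all GMO material
    (100% GMO) sits in a single parcel covering parcel_fraction of
    the lot: each of the q increments hits the parcel with probability
    parcel_fraction.  Returns the estimated lot mean in % GMO."""
    hits = sum(rng.random() < parcel_fraction for _ in range(q))
    return 100.0 * hits / q

rng = random.Random(1)
results = [sample_lot(20, 0.01, rng) for _ in range(10_000)]
share_zero = sum(r == 0.0 for r in results) / len(results)
# With Q = 20, roughly 0.99**20 ~ 82% of campaigns report 0% GMO,
# and a single hit reports 5% -- never the true 1% -- even though
# the long-run average of many campaigns is unbiased at ~1%.
```

The individual estimates are thus structurally wrong (0% or multiples of 5%), which is exactly the failure mode of an increment count not scaled to the lot heterogeneity.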

When the required number of increments was estimated from the lot with the highest heterogeneity, Q = 42 samples were required – a very strong testament to the need for empirical, heterogeneity-derived sampling plans.

The KeLDA case is particularly interesting in several ways. Not only does it disprove unequivocally all hitherto assumed general assumptions of "randomness" with respect to GMO sampling, but it also provides a comparatively rare example of complete TAE estimates and of a detailed investigation of the various analytical errors contributing to its magnitude. At the critical GMO concentration level (0.9%), the relative standard deviation of TAE is 11.6%. As shown above, the relative standard deviation of sampling (TSE) depends on sampling frequency and on the specific compositing scheme opted for – but is in any case much larger than the contribution of TAE. This makes it highly advantageous to use composite sampling to reduce the cost of analysis without much affecting the combined uncertainty that includes both sampling and analysis.
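The dominance of TSE over TAE follows from error propagation for independent contributions, where relative variances add. A minimal sketch; the 100% sampling RSD used below is an illustrative assumption, not a KeLDA estimate:

```python
import math

def combined_rsd(s_tse: float, s_tae: float) -> float:
    """Relative standard deviation of the full measurement chain,
    assuming independent sampling and analytical errors:
    sqrt(TSE**2 + TAE**2)."""
    return math.hypot(s_tse, s_tae)

# s(TAE) = 11.6% (from the paper); assumed s(TSE) = 100%:
total = combined_rsd(100.0, 11.6)
# total ~ 100.7%: removing TAE entirely would improve the combined
# uncertainty by well under one percentage point, while halving TSE
# would halve it almost exactly.
```

This is why refining analytical precision is futile while sampling errors remain one to two orders of magnitude larger.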

Even if our variographic analysis of the KeLDA dataset demonstrates that both TSE and TAE are large, it also shows that TSE can still be orders of magnitude larger than TAE. This gives a very clear message to the scientific, technological, industrial and metrological communities: efforts to reduce uncertainties in GMO testing and monitoring plans should focus on optimizing sampling protocols rather than on further advances in analytical precision, since any further improvements of analytical procedures will be futile if the most critical sampling issues are not fully addressed. This is a call for documented representative GMO sampling, specifically for universal use of correct (unbiased) sampling procedures. This is emphasized in the TOS as the principle of preventive sampling correctness. No other existing guidelines or standards focus to any satisfactory degree on this issue. Because of its extensive focus on practical counteracting procedures, TOS furnishes the only possible guarantee that unequivocally leads to elimination of any sampling bias.

1.3. Generalization

By virtue of the highly heterogeneous KeLDA Lot #1, unlikely to be surpassed by anything but an even more extreme example, the required minimum number of increments to meet the stringent EU GMO quantitation demands has been calculated as 42. There would appear to be significant carry-over applicability for other lot and material occurrences with "similar" heterogeneity characteristics. Nothing would be lost were this number taken to be 50 – there is no additional work involved, since process sampling is mostly automated anyway.

2. Conclusions

(1) A set of 15 randomly selected, independent, large soybean lots with unknown GM content and distribution characteristics was investigated at their port of entry into the EU. KeLDA constitutes a benchmark database for investigating the effects of sampling of 3-D GMO distributions in large kernel lots. Results from the present variographic re-analysis are valid for testing or quantitation of other contaminants or trace constituents in similar heterogeneous lots.

(2) The effects of "fit-for-purpose" sampling schemes relevant for contemporary EU-regulation GMO testing (critical concentration threshold a_L = 1.0% GMO; legal labeling threshold = 0.9% GMO) were assessed in terms of TAE, TSE and total costs.

(3) GMO quantitation is associated with a significant TAE at all concentration levels of legal and normative interest. For 1% GMO: s_r(TAE) = 11.4%. For 10 times smaller trace concentrations (a_L = 0.1%), the analytical error is doubled: s_r(TAE) = 22.4%. Even so:

(4) Large GEE in GMO quantitation is overwhelmingly due to inflated TSE, not to TAE. All efforts to further reduce "GMO analytical uncertainties" are futile; efforts should instead be directed towards improving sampling plans, competence and education.

(5) KeLDA data were used to assess the merits of composite sampling (i.e. aggregating 100 primary increments into sets of alternative composite (bulk) samples), as well as the effect of varying the number of primary increments per lot, based on realistic sampling/analysis scenarios. Taking the actual TAE and resultant measurement uncertainty into account, with 0.9% GMO as the legal decision limit for labeling, a minimum of 42 increments has to be composited in order to characterize heterogeneous lots with distribution patterns similar to the maximally adverse case of KeLDA Lot #1 (50 as a general measure is more prudent still).

(6) Significant savings in analytical costs compared to analyzing all primary increments – achieved by aggregating 100 increments into 10 bulk samples, or 50 primary increments into 5 bulk samples, for analysis – resulted in only small, fully acceptable reductions in the overall precision of the bulk GMO determinations. Composite sampling is universally the better option relative to grab sampling – in full agreement with 60 years of experience with TOS.
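The cost/precision trade-off of compositing can be sketched with a simple two-component error model, assuming independent per-increment sampling error and per-analysis analytical error; the 30% per-increment sampling RSD below is an illustrative assumption, not a KeLDA estimate:

```python
import math

def rsd_of_lot_mean(s_tse: float, s_tae: float, n_composites: int,
                    m_increments: int) -> float:
    """Relative SD of the lot-mean estimate when n_composites composite
    samples, each built from m_increments increments, are analyzed
    separately.  Compositing averages the sampling error over all
    n*m increments, while analytical error enters once per analysis."""
    return math.sqrt(s_tse ** 2 / (n_composites * m_increments)
                     + s_tae ** 2 / n_composites)

# Assumed per-increment sampling RSD of 30% and s(TAE) = 11.6%:
for n, m in [(100, 1), (10, 10), (5, 10)]:
    print(f"{n} analyses of {m}-increment composites:",
          round(rsd_of_lot_mean(30.0, 11.6, n, m), 2), "%")
```

Under these assumptions, cutting the number of analyses from 100 to 10 (or 5) raises the lot-mean RSD only moderately, which is the quantitative sense in which the precision reductions are "small, fully acceptable".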

(7) Latent information in process-data series can be brought out by careful problem-dependent interpretation of the corresponding variograms, and TSE can be estimated directly from the basic variogram. The power of this approach comes at no additional sampling or analytical cost – it is possible to simulate all conceivable sampling schemes that might be contemplated to improve any process-sampling procedure. All process-sampling schemes are fully characterized by two parameters only: sampling rate (r) and the number of increments per sample (Q) to be employed in composite sampling in order to reduce TSE. Esbensen et al. [4] describe in some detail how to work systematically with this perspective on variogram analysis.

(8) A general approach for deriving fit-for-purpose, reliable TSE, TAE and GEE estimates has been devised for all lots that are to be sampled for the first time (i.e. when the lot heterogeneity is unknown). TOS variographic-analysis principles (and available software) allow for universal estimation of the crucial TSE, as long as 60–100 primary increments are sampled in a representative fashion and analyzed individually. Such a comprehensive benchmark study is only needed for a completely new commodity and/or a completely new sampling procedure.

(9) Application to GMO testing and monitoring was used as an exemplar to provide a contemporary, relevant framework to illustrate how reliable sampling (designs, plans, and procedures) can be developed, revised and optimized. However, application of TOS principles is universal and can be carried over without loss of generality to any other process-sampling scenario and any other material with similar heterogeneity characteristics. Spatial and compositional heterogeneity is the sole invariant characteristic of all types of lots with respect to sampling, irrespective of other material characteristics, forms and lot sizes. This understanding is critical for any meaningful definition of "fit for purpose" or "appropriate sampling plans", which, from the present work, means that Priority Number 1 is to minimize TSE as a function of the specific heterogeneity of the particular lot in question – with the associated analytical costs as second priority. It is nonsensical to reverse this priority, which would mean sacrificing representativity on the altar of economic convenience in sampling. If one is not interested in representative data, and if one is not willing to pay for a scientifically credible, reliable sampling plan – well, then there is no reason to sample in the first place: one might just as well consult a random number table scaled to [0,100]% GMO, which is indeed the most inexpensive alternative imaginable. (We are grateful to Francis Pitard for permission to quote this, one of his more illuminative sampling distinctions.)

The principles promulgated here serve as a first attempt towards a truly matrix-independent sampling standard, a so-called horizontal sampling standard, for dynamic 1-D lots [i.e. a sampling standard driven only by the effects of heterogeneity and by how to counteract these effects optimally (codified in TOS)]. A significant amount of work lies ahead to ensure that this insight finds its way into revised forms of present regulatory standards and norm-giving documents for more restricted materials or material classes, all also purporting to deal with representative sampling. This work will not be easy, but it is hoped that the present contribution can serve to inspire it.

Acknowledgements

P.M. expresses his gratitude to the Finnish Cultural Foundation for the grant that has helped him to participate in this research. Aalborg University, Campus Esbjerg, provided Guest Professor stipends to P.M. in the period 2007–2009, which are also gratefully acknowledged. All three authors wish to thank the KeLDA consortium and the Molecular Biology and Genomics Unit of the Joint Research Centre of the European Commission for making the original KeLDA data available.

Appendix A

The toolbox VARIOGRA (P. Minkkinen) has been used to perform the fundamental variogram calculations above. A freeware program, VARIO (ACABS Research Group), performs identical analyses and also furnishes a graphical platform for evaluating optional combinations of Q and r.

References

[1] K.H. Esbensen, C. Paoletti, P. Minkkinen, Trends Anal. Chem. 32 (2012). doi:10.1016/j.trac.2011.09.008.
[2] P. Minkkinen, K.H. Esbensen, C. Paoletti, Trends Anal. Chem. 32 (2012). doi:10.1016/j.trac.2011.12.001.
[3] FAOSTAT, 2009 (http://faostat.fao.org/site/535/default.aspx#ancor).
[4] K.H. Esbensen, H.H. Friis-Petersen, L. Petersen, J.B. Holm-Nielsen, P.P. Mortensen, Chemom. Intell. Lab. Syst. 88 (2007) 41.
[5] P. Minkkinen, Chemom. Intell. Lab. Syst. 74 (2004) 85.
[6] P. Minkkinen, Anal. Chim. Acta 196 (1987) 237.
[7] P. Thy, B.M. Jenkins, K.H. Esbensen, Biomass Bioenergy 33 (2009) 1513.
[8] P.M. Gy, Sampling for Analytical Purposes, John Wiley & Sons, Chichester, West Sussex, UK, 1998.
[9] F.F. Pitard, Pierre Gy's Sampling Theory and Sampling Practice. Heterogeneity, Sampling Correctness, and Statistical Process Control, 2nd Edition, CRC Press, Boca Raton, FL, USA, 1993.
[10] K.H. Esbensen, P. Minkkinen (Editors), Special Issue: 50 Years of Pierre Gy's Theory of Sampling, Proc. First World Conf. Sampling and Blending (WCSB1), Tutorials on Sampling: Theory and Practise, Chemom. Intell. Lab. Syst. 74 (2004) 236.
[11] P.L. Smith, A Primer for Sampling Solids, Liquids and Gases - Based on the Seven Sampling Errors of Pierre Gy, ASA-SIAM, Philadelphia, PA, USA, 2001.
